From: Kevin Wolf <kwolf@redhat.com>
To: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: keith.busch@intel.com, jsnow@redhat.com, qemu-block@nongnu.org,
qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com,
den@virtuozzo.com, famz@redhat.com, vsementsov@virtuozzo.com
Subject: Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section"
Date: Fri, 7 Dec 2018 13:26:47 +0100 [thread overview]
Message-ID: <20181207122647.GE5119@linux.fritz.box> (raw)
In-Reply-To: <20181205122326.26625-1-dplotnikov@virtuozzo.com>
Am 05.12.2018 um 13:23 hat Denis Plotnikov geschrieben:
> At the time, the "drained section" doesn't protect Block Driver State
> from the requests appearing in the vCPU threads.
> This could lead to the data loss because of request coming to
> an unexpected BDS.
>
> For example, when a request comes to ide controller from the guest,
> the controller creates a request coroutine and executes the coroutine
> in the vCPU thread. If another thread(iothread) has entered the
> "drained section" on a BDS with bdrv_drained_begin, which protects
> BDS' AioContext from external requests, and released the AioContext
> because of finishing some coroutine by the moment of the request
> appearing at the ide controller, the controller acquires the AioContext
> and executes its request without any respect to the entered
> "drained section" producing any kinds of data inconsistency.
>
> The patch prevents this case by putting requests from external threads to
> the queue on AioContext while the context is protected for external requests
> and executes those requests later on the external requests protection removing.
>
> Also, the patch marks requests generated in a vCPU thread as external ones
> to make use of the request postponing.
>
> How to reproduce:
> 1. start vm with an ide disk and a linux guest
> 2. in the guest run: dd if=... of=... bs=4K count=100000 oflag=direct
> 3. (qemu) drive_mirror "disk-name"
> 4. wait until block job can receive block_job_complete
> 5. (qemu) block_job_complete "disk-name"
> 6. blk_aio_p[read|write]v may appear in vCPU context (here is the race)
>
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
This is getting closer, but I'd like to see two more major changes:
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 0ca25dfec6..8512bda44e 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -19,6 +19,7 @@
> #include "qemu/event_notifier.h"
> #include "qemu/thread.h"
> #include "qemu/timer.h"
> +#include "qemu/coroutine.h"
>
> typedef struct BlockAIOCB BlockAIOCB;
> typedef void BlockCompletionFunc(void *opaque, int ret);
> @@ -130,6 +131,11 @@ struct AioContext {
> QEMUTimerListGroup tlg;
>
> int external_disable_cnt;
> + /* Queue to store the requests coming when the context is disabled for
> + * external requests.
> + * Don't use a separate lock for protection relying the context lock
> + */
> + CoQueue postponed_reqs;
Why involve the AioContext at all? This could all be kept at the
BlockBackend level without extending the layering violation that
aio_disable_external() is.
BlockBackends get notified when their root node is drained, so hooking
things up there should be as easy, if not even easier than in
AioContext.
> /* Number of AioHandlers without .io_poll() */
> int poll_disable_cnt;
> @@ -483,6 +489,15 @@ static inline void aio_timer_init(AioContext *ctx,
> */
> int64_t aio_compute_timeout(AioContext *ctx);
>
> +/**
> + * aio_co_enter:
> + * @ctx: the context to run the coroutine
> + * @co: the coroutine to run
> + *
> + * Enter a coroutine in the specified AioContext.
> + */
> +void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> +
> /**
> * aio_disable_external:
> * @ctx: the aio context
> @@ -491,9 +506,17 @@ int64_t aio_compute_timeout(AioContext *ctx);
> */
> static inline void aio_disable_external(AioContext *ctx)
> {
> + aio_context_acquire(ctx);
> atomic_inc(&ctx->external_disable_cnt);
> + aio_context_release(ctx);
> }
This acquire/release pair looks rather useless?
> +static void run_postponed_co(void *opaque)
> +{
> + AioContext *ctx = (AioContext *) opaque;
> +
> + qemu_co_queue_restart_all(&ctx->postponed_reqs);
> +}
> /**
> * aio_enable_external:
> * @ctx: the aio context
> @@ -504,12 +527,17 @@ static inline void aio_enable_external(AioContext *ctx)
> {
> int old;
>
> + aio_context_acquire(ctx);
> old = atomic_fetch_dec(&ctx->external_disable_cnt);
> assert(old > 0);
> if (old == 1) {
> + Coroutine *co = qemu_coroutine_create(run_postponed_co, ctx);
> + aio_co_enter(ctx, co);
> +
> /* Kick event loop so it re-arms file descriptors */
> aio_notify(ctx);
> }
> + aio_context_release(ctx);
> }
>
> /**
> @@ -564,15 +592,6 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine *co);
> */
> void aio_co_wake(struct Coroutine *co);
>
> -/**
> - * aio_co_enter:
> - * @ctx: the context to run the coroutine
> - * @co: the coroutine to run
> - *
> - * Enter a coroutine in the specified AioContext.
> - */
> -void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> -
> /**
> * Return the AioContext whose event loop runs in the current thread.
> *
> diff --git a/include/block/block.h b/include/block/block.h
> index 7f5453b45b..0685a73975 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -83,8 +83,14 @@ typedef enum {
> */
> BDRV_REQ_SERIALISING = 0x80,
>
> + /*
> + * marks requests comming from external sources,
> + * e.g block requests from dma running in the vCPU thread
> + */
> + BDRV_REQ_EXTERNAL = 0x100,
I don't like this one because BlockBackends should be used _only_ from
external sources.
I know that this isn't quite true today and there are still block jobs
calling BlockBackend internally while handling a BDS request, but I hope
with Vladimir's backup patches going it, this can go away.
So I suggest that for the time being, we invert the flag and have a
BDRV_REQ_INTERNAL instead, which is only used for BlockBackend requests,
hopefully doesn't have to be used in many places and which can go away
eventually.
> /* Mask of valid flags */
> - BDRV_REQ_MASK = 0xff,
> + BDRV_REQ_MASK = 0xfff,
> } BdrvRequestFlags;
>
> typedef struct BlockSizes {
Kevin
next prev parent reply other threads:[~2018-12-07 12:27 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-05 12:23 [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section" Denis Plotnikov
2018-12-07 12:26 ` Kevin Wolf [this message]
2018-12-10 12:14 ` Denis Plotnikov
2018-12-10 12:25 ` Kevin Wolf
2018-12-11 16:55 ` Denis Plotnikov
2018-12-12 12:24 ` Kevin Wolf
2018-12-13 11:07 ` Denis Plotnikov
2018-12-13 12:20 ` Kevin Wolf
2018-12-14 11:54 ` Denis Plotnikov
2018-12-18 8:53 ` Denis Plotnikov
2019-01-09 8:18 ` Denis Plotnikov
2019-01-15 7:22 ` Denis Plotnikov
2019-01-17 12:57 ` [Qemu-devel] PING: " Denis Plotnikov
2019-01-17 14:23 ` Kevin Wolf
2019-01-18 7:43 ` Denis Plotnikov
[not found] ` <20190313160412.GF5167@linux.fritz.box>
2019-04-02 8:35 ` [Qemu-devel] " Denis Plotnikov
2019-04-09 10:01 ` Kevin Wolf
2019-04-09 10:01 ` Kevin Wolf
2019-06-21 9:16 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2019-06-21 9:59 ` Vladimir Sementsov-Ogievskiy
2019-06-24 9:46 ` Denis Plotnikov
2019-06-26 8:46 ` Denis Plotnikov
2019-06-28 12:32 ` Kevin Wolf
2019-07-02 14:41 ` Denis Plotnikov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181207122647.GE5119@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=den@virtuozzo.com \
--cc=dplotnikov@virtuozzo.com \
--cc=famz@redhat.com \
--cc=jsnow@redhat.com \
--cc=keith.busch@intel.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.