From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 4/7] blkdebug: Add pass-through write_zero and discard support
Date: Wed, 15 Feb 2017 16:53:39 +0100
Message-ID: <20170215155339.GD4935@noname.redhat.com>
In-Reply-To: <20170214192525.18624-5-eblake@redhat.com>

On 14.02.2017 at 20:25, Eric Blake wrote:
> In order to test the effects of artificial geometry constraints
> on operations like write zero or discard, we first need blkdebug
> to manage these actions. It also allows us to inject errors on
> those operations, just like we can for read/write/flush.
>
> We can also test the contract promised by the block layer; namely,
> if a device has specified limits on alignment or maximum size,
> then those limits must be obeyed (for now, the blkdebug driver
> merely inherits limits from whatever it is wrapping, but the next
> patch will further enhance it to allow specific limit overrides).
>
> This patch intentionally refuses to service requests smaller than
> the requested alignments; this is because an upcoming patch adds
> a qemu-iotest to prove that the block layer is correctly handling
> fragmentation, but the test only works if there is a way to tell
> the difference at artificial alignment boundaries when blkdebug is
> using a larger-than-default alignment. If we let the blkdebug
> layer always defer to the underlying layer, which potentially has
> a smaller granularity, the iotest will be thwarted.
>
> Tested by setting up an NBD server with export 'foo', then invoking:
> $ ./qemu-io
> qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
> qemu-io> d 0 15M
> qemu-io> w -z 0 15M
>
> Pre-patch, the server never sees the discard (it was silently
> eaten by the block layer); post-patch it is passed across the
> wire. Likewise, pre-patch the write is always passed with
> NBD_WRITE (with 15M of zeroes on the wire), while post-patch
> it can utilize NBD_WRITE_ZEROES (for less traffic).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
> ---
> v5: include 2017 copyright
> v4: correct error injection to respect byte range, tweak formatting
> v3: rebase to byte-based read/write, improve docs on why no
> partial write zero passthrough
> v2: new patch
> ---
> block/blkdebug.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 86 insertions(+)
>
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 37094a2..b2d5f7d 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -1,6 +1,7 @@
> /*
> * Block protocol for I/O error injection
> *
> + * Copyright (C) 2016-2017 Red Hat, Inc.
> * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
> *
> * Permission is hereby granted, free of charge, to any person obtaining a copy
> @@ -382,6 +383,11 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
> goto out;
> }
>
> + bs->supported_write_flags = BDRV_REQ_FUA &
> + bs->file->bs->supported_write_flags;
> + bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
> + bs->file->bs->supported_zero_flags;
> +
> /* Set request alignment */
> align = qemu_opt_get_size(opts, "align", 0);
> if (align < INT_MAX && is_power_of_2(align)) {
> @@ -511,6 +517,84 @@ static int blkdebug_co_flush(BlockDriverState *bs)
> return bdrv_co_flush(bs->file->bs);
> }
>
> +static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
> + int64_t offset, int count,
> + BdrvRequestFlags flags)
> +{
> + BDRVBlkdebugState *s = bs->opaque;
> + BlkdebugRule *rule = NULL;
> + uint32_t align = MAX(bs->bl.request_alignment,
> + bs->bl.pwrite_zeroes_alignment);
> +
> + /* Only pass through requests that are larger than requested
> + * preferred alignment (so that we test the fallback to writes on
> + * unaligned portions), and check that the block layer never hands
> + * us anything crossing an alignment boundary. */
"Crossing an alignment boundary" isn't really what we're interested in
here (and it's also not what your code checks); what matters is just
that offset and count are properly aligned.
> + if (count < align) {
> + return -ENOTSUP;
> + }
> + assert(QEMU_IS_ALIGNED(offset, align));
> + assert(QEMU_IS_ALIGNED(count, align));
> + if (bs->bl.max_pwrite_zeroes) {
> + assert(count <= bs->bl.max_pwrite_zeroes);
> + }
> +
> + QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
> + uint64_t inject_offset = rule->options.inject.offset;
> +
> + if (inject_offset == -1 ||
> + (inject_offset >= offset && inject_offset < offset + count))
> + {
> + break;
> + }
> + }
> +
> + if (rule && rule->options.inject.error) {
> + return inject_error(bs, rule);
> + }
> +
> + return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
> +}
> +
> +static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
> + int64_t offset, int count)
> +{
> + BDRVBlkdebugState *s = bs->opaque;
> + BlkdebugRule *rule = NULL;
> + uint32_t align = bs->bl.pdiscard_alignment;
> +
> + /* Only pass through requests that are larger than requested
> + * minimum alignment, and ensure that unaligned requests do not
> + * cross optimum discard boundaries. */
> + if (count < bs->bl.request_alignment) {
> + return -ENOTSUP;
> + }
> + assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
> + assert(QEMU_IS_ALIGNED(count, bs->bl.request_alignment));
> + if (align && count >= align) {
> + assert(QEMU_IS_ALIGNED(offset, align));
> + assert(QEMU_IS_ALIGNED(count, align));
Here, in contrast, I think you really want to do what the comment says
(because the contract is that you get the head, bulk and tail of a
discard request separately), but the code fails to do so: we could have
count < align and still cross an optimum discard alignment boundary if
offset is misaligned too, in which case a single call would cover two
partial accesses (i.e. head and tail in the same call).
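
To illustrate the kind of check I mean, here is a rough, untested-in-QEMU
sketch. The helper names (within_one_block, discard_request_ok) are
made up for this mail and do not exist in blkdebug; it assumes
non-negative offsets, count > 0 and a non-zero align:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of blkdebug: true if the byte range
 * [offset, offset + count) lies within a single align-sized block.
 * Assumes offset >= 0, count > 0 and align > 0. */
static bool within_one_block(int64_t offset, int64_t count, uint32_t align)
{
    assert(offset >= 0 && count > 0 && align > 0);
    /* First and last byte must fall into the same block. */
    return offset / align == (offset + count - 1) / align;
}

/* Sketch of the discard contract as described above: a request is
 * either fully aligned (bulk), or it is an unaligned head or tail,
 * which must not cross an optimum discard boundary. */
static bool discard_request_ok(int64_t offset, int64_t count, uint32_t align)
{
    if (offset % align == 0 && count % align == 0) {
        return true;    /* aligned bulk request */
    }
    return within_one_block(offset, count, align);
}
```

With a 64k pdiscard_alignment, a request at offset 60k with count 8k is
smaller than align, so the "count >= align" branch in the patch never
looks at it, yet it spans the 64k boundary; a check along these lines
would reject it.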
> + }
> + if (bs->bl.max_pdiscard) {
> + assert(count <= bs->bl.max_pdiscard);
> + }
> +
> + QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
> + uint64_t inject_offset = rule->options.inject.offset;
> +
> + if (inject_offset == -1 ||
> + (inject_offset >= offset && inject_offset < offset + count))
> + {
> + break;
> + }
> + }
> +
> + if (rule && rule->options.inject.error) {
> + return inject_error(bs, rule);
> + }
This piece of code is duplicated in each I/O function. Should we
consider factoring it out?
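
Roughly what I have in mind, as a standalone sketch rather than actual
blkdebug code: the Rule struct and rule_check() below are simplified
stand-ins I made up for this mail (the real code walks s->active_rules
with QSIMPLEQ_FOREACH and calls inject_error(), which does more than
return an errno):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for BlkdebugRule and the active_rules queue. */
typedef struct Rule {
    uint64_t inject_offset;   /* (uint64_t)-1 matches any offset */
    int error;                /* errno value to inject, 0 for none */
    struct Rule *next;
} Rule;

/* Hypothetical shared helper: find the first rule whose inject offset
 * is -1 or falls inside [offset, offset + bytes). Returns the negative
 * errno of that rule, or 0 if no rule matches or the matching rule has
 * no error set. */
static int rule_check(const Rule *rules, uint64_t offset, uint64_t bytes)
{
    const Rule *rule;

    assert(bytes > 0);
    for (rule = rules; rule; rule = rule->next) {
        if (rule->inject_offset == (uint64_t)-1 ||
            (rule->inject_offset >= offset &&
             rule->inject_offset < offset + bytes)) {
            break;
        }
    }

    if (rule && rule->error) {
        return -rule->error;
    }
    return 0;
}
```

Each I/O function would then only need to call the helper and return
early on a non-zero result before passing the request on to bs->file.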
> + return bdrv_co_pdiscard(bs->file->bs, offset, count);
> +}
Kevin