From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Fam Zheng" <fam@euphon.net>,
"Markus Armbruster" <armbru@redhat.com>,
"Laurent Vivier" <lvivier@redhat.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Wen Congyang" <wencongyang2@huawei.com>,
"Kevin Wolf" <kwolf@redhat.com>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
"Richard Henderson" <richard.henderson@linaro.org>,
"David Hildenbrand" <david@redhat.com>,
"Hanna Reitz" <hreitz@redhat.com>,
"Xie Changlong" <xiechanglong.d@gmail.com>,
"Eduardo Habkost" <eduardo@habkost.net>,
qemu-block@nongnu.org, "Eric Blake" <eblake@redhat.com>,
"John Snow" <jsnow@redhat.com>,
afaria@redhat.com, "Jeff Cody" <codyprime@gmail.com>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Raphael Norwitz" <raphael.norwitz@nutanix.com>,
sgarzare@redhat.com, integration@gluster.org,
"Peter Xu" <peterx@redhat.com>,
"Richard W.M. Jones" <rjones@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
"Denis V. Lunev" <den@openvz.org>
Subject: Re: [PATCH v7 00/13] blkio: add libblkio BlockDriver
Date: Wed, 26 Oct 2022 14:57:21 -0400
Message-ID: <Y1mDEWUCJQoMjUyj@fedora>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
On Thu, Oct 13, 2022 at 02:58:55PM -0400, Stefan Hajnoczi wrote:
> v7:
> - Add nvme-io_uring and virtio-blk-vhost-user syntax examples to commit description [Markus]
> - Add missing nvme-io_uring QAPI [Markus, Alberto]
> - Rename mem-regions-pinned to may-pin-mem-regions [Alberto]
> - Fix value/bs->bl.max_iov mix-up [Stefano]
> v6:
> - Add untested nvme-io_uring driver. Please test in your nested NVMe environment, Alberto. [Alberto]
> - Map blkio mem regions only when necessary to reduce conflicts with RAM discard [Alberto]
> - Reduce duplication by having a single blkio_virtio_blk_common_open() function [Alberto]
> - Avoid duplication in BlockDriver definitions using a macro [Alberto]
> - Avoid ram block registrar segfault [Stefano]
> - Use QLIST_FOREACH_SAFE() in ram block notifier code so callbacks can remove themselves
> v5:
> - Drop "RFC" since libblkio 1.0 has been released and the library API is stable
> - Disable BDRV_REQ_REGISTERED_BUF if we run out of blkio_mem_regions. The
>   bounce buffer slow path is taken when there are not enough blkio_mem_regions
>   to cover guest RAM. [Hanna & David Hildenbrand]
> - Call ram_block_discard_disable() when mem-region-pinned property is true or
>   absent [David Hildenbrand]
> - Use a bounce buffer pool instead of allocating/freeing a buffer for each
>   request. This reduces the number of blkio_mem_regions required for bounce
>   buffers to 1 and avoids frequent blkio_mem_region_map/unmap() calls.
> - Switch to .bdrv_co_*() instead of .bdrv_aio_*(). Needed for the bounce buffer
>   pool's CoQueue.
> v4:
> - Patch 1:
>   - Add virtio-blk-vhost-user driver [Kevin]
>   - Drop .bdrv_parse_filename() and .bdrv_needs_filename for virtio-blk-vhost-vdpa [Stefano]
>   - Add copyright and license header [Hanna]
>   - Drop .bdrv_parse_filename() in favor of --blockdev or json: [Hanna]
>   - Clarify that "filename" is always non-NULL for io_uring [Hanna]
>   - Check that virtio-blk-vhost-vdpa "path" option is non-NULL [Hanna]
>   - Fix virtio-blk-vhost-vdpa cache.direct=off logic [Hanna]
>   - Use macros for driver names [Hanna]
>   - Assert that the driver name is valid [Hanna]
>   - Update "readonly" property name to "read-only" [Hanna]
>   - Call blkio_detach_aio_context() in blkio_close() [Hanna]
>   - Avoid uint32_t * to int * casts in blkio_refresh_limits() [Hanna]
>   - Remove write zeroes and discard from the todo list [Hanna]
>   - Use PRIu32 instead of %d for uint32_t [Hanna]
>   - Fix error messages with buf-alignment instead of optimal-io-size [Hanna]
>   - Call map/unmap APIs since libblkio alloc/free APIs no longer do that
>   - Update QAPI schema QEMU version to 7.2
> - Patch 5:
>   - Expand the BDRV_REQ_REGISTERED_BUF flag passthrough and drop assert(!flags)
>     in drivers [Hanna]
> - Patch 7:
>   - Fix BLK->BDRV typo [Hanna]
>   - Make BlockRAMRegistrar handle failure [Hanna]
> - Patch 8:
>   - Replace memory_region_get_fd() approach with qemu_ram_get_fd()
> - Patch 10:
>   - Use (void)ret; to discard unused return value [Hanna]
>   - libblkio's blkio_unmap_mem_region() API no longer has a return value
>   - Check for registered bufs that cross RAMBlocks [Hanna]
> - Patch 11:
>   - Handle bdrv_register_buf() errors [Hanna]
> v3:
> - Add virtio-blk-vhost-vdpa for vdpa-blk devices including VDUSE
> - Add discard and write zeroes support
> - Rebase and adopt latest libblkio APIs
> v2:
> - Add BDRV_REQ_REGISTERED_BUF to bs.supported_write_flags [Stefano]
> - Use new blkioq_get_num_completions() API
> - Implement .bdrv_refresh_limits()
>
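The v6 note about QLIST_FOREACH_SAFE() is easier to follow with a concrete case. Below is a minimal sketch, not the QEMU code: Node, list_insert_head(), list_remove() and notify_all() are hypothetical names that only mimic the shape of an intrusive list. The point is that the traversal saves the next pointer before invoking the callback, so a callback may unlink its own node without cutting the walk short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive list node, loosely shaped like a notifier entry. */
typedef struct Node Node;
struct Node {
    Node *next;
    Node **pprev;            /* address of the pointer that points at us */
    void (*cb)(Node *self);  /* callback invoked during traversal */
    int calls;               /* how many times the callback ran */
};

void list_insert_head(Node **head, Node *n)
{
    n->next = *head;
    n->pprev = head;
    if (*head) {
        (*head)->pprev = &n->next;
    }
    *head = n;
}

void list_remove(Node *n)
{
    assert(n->pprev != NULL);  /* node must currently be linked */
    *n->pprev = n->next;
    if (n->next) {
        n->next->pprev = n->pprev;
    }
    /* Clearing the links is what breaks a naive "n = n->next" loop. */
    n->next = NULL;
    n->pprev = NULL;
}

/* Safe traversal: fetch the successor before running the callback. */
void notify_all(Node *head)
{
    Node *n = head;
    while (n) {
        Node *next = n->next;
        n->calls++;
        n->cb(n);
        n = next;
    }
}

void cb_noop(Node *self) { (void)self; }
void cb_remove_self(Node *self) { list_remove(self); }
```

With three nodes linked as a, b, c, where b uses cb_remove_self, notify_all() still reaches c. A loop that read n->next after the callback would stop at b.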
> This patch series adds a QEMU BlockDriver for libblkio
> (https://gitlab.com/libblkio/libblkio/), a library for high-performance block
> device I/O. This work was presented at KVM Forum 2022 and slides are available
> here:
> https://static.sched.com/hosted_files/kvmforum2022/8c/libblkio-kvm-forum-2022.pdf
>
> The second patch adds the core BlockDriver and most of the libblkio API usage.
> Four libblkio drivers are included:
> - io_uring
> - nvme-io_uring
> - virtio-blk-vhost-user
> - virtio-blk-vhost-vdpa
>
> The remainder of the patch series reworks the existing QEMU bdrv_register_buf()
> API so virtio-blk emulation can efficiently map guest RAM for libblkio. Some
> libblkio drivers require that I/O buffer memory is pre-registered (think VFIO,
> vhost, etc).
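To make the pre-registration requirement concrete, here is a small sketch of the decision a driver has to make per request: is the I/O buffer fully inside one registered region, or does it need the bounce buffer slow path? MemRegion and buf_is_registered() are hypothetical names for illustration, not the libblkio or QEMU API. A buffer that straddles two regions is treated as not registered, in the spirit of the "registered bufs that cross RAMBlocks" check in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pre-registered memory region. */
typedef struct {
    uintptr_t start;
    size_t len;
} MemRegion;

/*
 * Return true if [addr, addr + len) lies entirely inside a single region.
 * The comparisons are arranged so that no addition can overflow.
 */
bool buf_is_registered(const MemRegion *regions, size_t n,
                       uintptr_t addr, size_t len)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const MemRegion *r = &regions[i];

        if (addr >= r->start &&
            len <= r->len &&
            addr - r->start <= r->len - len) {
            return true;
        }
    }
    return false;  /* caller would fall back to a bounce buffer */
}
```

A caller would use the result to decide whether a request can be submitted directly or must be copied through a bounce buffer first.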
>
> Vladimir requested performance results that show the effect of the
> BDRV_REQ_REGISTERED_BUF flag. I ran the patches against qemu-storage-daemon's
> vhost-user-blk export with iodepth=1 bs=512 to see the per-request overhead due
> to bounce buffer allocation/mapping:
>
> Name            IOPS      Error
> bounce-buf       4373.81  ± 0.01%
> registered-buf  13062.80  ± 0.67%
>
> The version with the BDRV_REQ_REGISTERED_BUF optimization is about 3x faster.
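The "about 3x" figure follows directly from the two IOPS values in the table. As a rough sketch, the helpers below (speedup() and latency_us() are names made up for this note) compute the ratio and, assuming that iodepth=1 means only one request is in flight at a time, approximate the mean per-request latency as 1e6 / IOPS microseconds.

```c
#include <assert.h>

/* Ratio of optimized to baseline throughput. */
double speedup(double baseline_iops, double optimized_iops)
{
    assert(baseline_iops > 0.0);
    return optimized_iops / baseline_iops;
}

/*
 * Approximate mean per-request latency in microseconds.
 * Assumption: one request in flight at a time (iodepth=1),
 * so latency is the inverse of the request rate.
 */
double latency_us(double iops)
{
    assert(iops > 0.0);
    return 1e6 / iops;
}
```

Calling speedup(4373.81, 13062.80) gives a ratio just under 3, and the difference between latency_us(4373.81) and latency_us(13062.80) is a rough estimate of the per-request cost of the bounce buffer path in this setup.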
>
> See the BlockDriver struct in block/blkio.c for a list of APIs that still need
> to be implemented. The core functionality is covered.
>
> Regarding the design: each libblkio driver is a separately named BlockDriver.
> That means there is an "io_uring" BlockDriver and not a generic "libblkio"
> BlockDriver. This way QAPI and open parameters are type-safe and mandatory
> parameters can be checked by QEMU.
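As an illustration of the design point, the sketch below models "one named driver per libblkio driver, each with its own mandatory option" as a lookup table. DriverSpec and required_option() are hypothetical and are not how block/blkio.c or QAPI implement this. The table only lists the two option names that the changelog above mentions ("filename" for io_uring, "path" for virtio-blk-vhost-vdpa); the other drivers are left out rather than guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver description: name plus its mandatory option. */
typedef struct {
    const char *driver;
    const char *required_opt;
} DriverSpec;

static const DriverSpec specs[] = {
    { "io_uring",              "filename" },
    { "virtio-blk-vhost-vdpa", "path"     },
};

/*
 * Return the mandatory option for a named driver, or NULL if the name is
 * unknown. There is deliberately no generic "libblkio" entry.
 */
const char *required_option(const char *driver)
{
    assert(driver != NULL);

    for (size_t i = 0; i < sizeof(specs) / sizeof(specs[0]); i++) {
        if (strcmp(specs[i].driver, driver) == 0) {
            return specs[i].required_opt;
        }
    }
    return NULL;
}
```

Because each driver has its own entry, a missing mandatory option can be rejected before the device is opened, which is the benefit the cover letter attributes to separately named BlockDrivers.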
>
> Stefan Hajnoczi (13):
> coroutine: add flag to re-queue at front of CoQueue
> blkio: add libblkio block driver
> numa: call ->ram_block_removed() in ram_block_notifer_remove()
> block: pass size to bdrv_unregister_buf()
> block: use BdrvRequestFlags type for supported flag fields
> block: add BDRV_REQ_REGISTERED_BUF request flag
> block: return errors from bdrv_register_buf()
> numa: use QLIST_FOREACH_SAFE() for RAM block notifiers
> block: add BlockRAMRegistrar
> exec/cpu-common: add qemu_ram_get_fd()
> stubs: add qemu_ram_block_from_host() and qemu_ram_get_fd()
> blkio: implement BDRV_REQ_REGISTERED_BUF optimization
> virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint
>
> MAINTAINERS | 7 +
> meson_options.txt | 2 +
> qapi/block-core.json | 77 +-
> meson.build | 9 +
> include/block/block-common.h | 9 +
> include/block/block-global-state.h | 10 +-
> include/block/block_int-common.h | 15 +-
> include/exec/cpu-common.h | 1 +
> include/hw/virtio/virtio-blk.h | 2 +
> include/qemu/coroutine.h | 15 +-
> include/sysemu/block-backend-global-state.h | 4 +-
> include/sysemu/block-ram-registrar.h | 37 +
> block.c | 14 +
> block/blkio.c | 1008 +++++++++++++++++++
> block/blkverify.c | 4 +-
> block/block-backend.c | 8 +-
> block/block-ram-registrar.c | 58 ++
> block/crypto.c | 4 +-
> block/file-posix.c | 1 -
> block/gluster.c | 1 -
> block/io.c | 101 +-
> block/mirror.c | 2 +
> block/nbd.c | 1 -
> block/nvme.c | 20 +-
> block/parallels.c | 1 -
> block/qcow.c | 2 -
> block/qed.c | 1 -
> block/raw-format.c | 2 +
> block/replication.c | 1 -
> block/ssh.c | 1 -
> block/vhdx.c | 1 -
> hw/block/virtio-blk.c | 39 +-
> hw/core/numa.c | 26 +-
> qemu-img.c | 6 +-
> softmmu/physmem.c | 5 +
> stubs/physmem.c | 13 +
> tests/qtest/modules-test.c | 3 +
> util/qemu-coroutine-lock.c | 9 +-
> util/vfio-helpers.c | 5 +-
> block/meson.build | 2 +
> scripts/meson-buildoptions.sh | 3 +
> stubs/meson.build | 1 +
> 42 files changed, 1435 insertions(+), 96 deletions(-)
> create mode 100644 include/sysemu/block-ram-registrar.h
> create mode 100644 block/blkio.c
> create mode 100644 block/block-ram-registrar.c
> create mode 100644 stubs/physmem.c
>
> --
> 2.37.3
>
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
Stefan