From: Stefan Hajnoczi <stefanha@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: qemu-devel@nongnu.org, "Yanan Wang" <wangyanan55@huawei.com>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"John Snow" <jsnow@redhat.com>, "Denis V. Lunev" <den@openvz.org>,
	"Xie Changlong" <xiechanglong.d@gmail.com>,
	"Eric Blake" <eblake@redhat.com>,
	integration@gluster.org, "David Hildenbrand" <david@redhat.com>,
	"Wen Congyang" <wencongyang2@huawei.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Richard W.M. Jones" <rjones@redhat.com>,
	afaria@redhat.com, "Fam Zheng" <fam@euphon.net>,
	"Thomas Huth" <thuth@redhat.com>,
	"Hanna Reitz" <hreitz@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Peter Xu" <peterx@redhat.com>,
	"Raphael Norwitz" <raphael.norwitz@nutanix.com>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Jeff Cody" <codyprime@gmail.com>,
	qemu-block@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v5 00/12] blkio: add libblkio BlockDriver
Date: Thu, 6 Oct 2022 13:32:26 -0400
Message-ID: <Yz8RKiROmD5TjHNx@fedora>
In-Reply-To: <CAGxU2F5aE1P-atJ04SmUHrrW5QWbU80E_H7JBNaBzFkoVd7wOw@mail.gmail.com>

On Thu, Oct 06, 2022 at 02:18:35PM +0200, Stefano Garzarella wrote:
> Hi Stefan,
> I tested this series with the vDPA block simulator in Linux v6.0.
> It worked well, but I had a segfault when blkio_map_mem_region() fails.
> 
> In my case, it failed because I forgot to increase the `memlock` limit 
> with `ulimit -l unlimited` since the simulator still requires locking 
> all the VM memory.
> 
> QEMU failed with these messages:
> $ qemu-system-x86_64 ... \
>   -blockdev node-name=drive_src1,driver=virtio-blk-vhost-vdpa,path=/dev/vhost-vdpa-0,cache.direct=on
> qemu-system-x86_64: -device virtio-blk-pci,id=src1,bootindex=2,drive=drive_src1,iothread=iothread0: Failed to add blkio mem region 0x7f60bfe00000 with size 536870912: Bad address (os error 14)
> qemu-system-x86_64: -device virtio-blk-pci,id=src1,bootindex=2,drive=drive_src1,iothread=iothread0: Failed to add blkio mem region 0x7f60be400000 with size 16777216: Bad address (os error 14)
> [1]    20803 segmentation fault
> 
> IIUC this could be related to a double call to 
> ram_block_notifier_remove() in the error path of ram_block_added() 
> (block/block-ram-registrar.c) callback.
> 
> Maybe we should call blk_register_buf() only if r->ok is true.

Thanks for pointing this out!

The notifiers don't use QLIST_FOREACH_SAFE() so calling
ram_block_notifier_remove() from within .ram_block_added() is incorrect.
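
A minimal sketch of the hazard, using a toy list rather than QEMU's QLIST macros or its RAMBlockNotifier code. The `Notifier` type, `list_insert_head()`, `list_remove()` and the two `demo_*()` helpers are hypothetical names made up for illustration. The toy `list_remove()` clears the removed node's links, which is an assumption chosen to make the effect visible: a plain traversal that reads `n->next` after the callback stops early, while a traversal that caches the next pointer first (the idea behind a `_SAFE` iteration) still visits every element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a notifier; not QEMU's RAMBlockNotifier. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*added)(Notifier *n); /* callback invoked during traversal */
    Notifier *next;             /* next element, or NULL at the tail */
    Notifier **pprev;           /* address of the pointer that points at us */
    int calls;                  /* how many times the callback ran */
};

static Notifier *head;

static void list_insert_head(Notifier *n)
{
    n->next = head;
    if (head) {
        head->pprev = &n->next;
    }
    head = n;
    n->pprev = &head;
}

/*
 * Unlink n and clear its links. Calling this twice on the same node would
 * dereference the NULL pprev, so a double removal crashes in this toy list.
 */
static void list_remove(Notifier *n)
{
    if (n->next) {
        n->next->pprev = n->pprev;
    }
    *n->pprev = n->next;
    n->next = NULL;
    n->pprev = NULL;
}

/* Callback that unregisters itself, as an error path might. */
static void added_remove_self(Notifier *n)
{
    n->calls++;
    list_remove(n);
}

static void added_plain(Notifier *n)
{
    n->calls++;
}

/* Build a three-element list whose head removes itself when called. */
static void setup(Notifier *a, Notifier *b, Notifier *c)
{
    head = NULL;
    *a = (Notifier){ .added = added_remove_self };
    *b = (Notifier){ .added = added_plain };
    *c = (Notifier){ .added = added_plain };
    list_insert_head(c);
    list_insert_head(b);
    list_insert_head(a); /* list order is now a, b, c */
}

/* Plain traversal: n->next is read only after the callback returns. */
static int demo_plain(void)
{
    Notifier a, b, c;
    int visited = 0;

    setup(&a, &b, &c);
    for (Notifier *n = head; n != NULL; n = n->next) {
        n->added(n);
        visited++;
    }
    return visited;
}

/* Safe traversal: the next pointer is cached before the callback runs. */
static int demo_safe(void)
{
    Notifier a, b, c;
    Notifier *n, *next;
    int visited = 0;

    setup(&a, &b, &c);
    for (n = head; n != NULL; n = next) {
        next = n->next;
        n->added(n);
        visited++;
    }
    return visited;
}
```

In this sketch `demo_plain()` visits only the first element because the self-removing callback clears `a.next`, whereas `demo_safe()` visits all three. Caching the next pointer only protects against removal of the current element; a callback that removes other elements would need a different scheme.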

Stefan

> 
> The stack trace is the following:
> #0  0x00005641a8b097dd in ram_block_notifier_remove (n=n@entry=0x5641ab8354d8) at ../hw/core/numa.c:850
> #1  0x00005641a89c4701 in ram_block_added
>      (n=0x5641ab8354d8, host=<optimized out>, size=<optimized out>, max_size=<optimized out>)
>     at ../block/block-ram-registrar.c:20
> #2  0x00005641a8b083af in ram_block_notify_add_single (rb=0x5641ab3d2810, opaque=0x5641ab8354d8)
>     at ../hw/core/numa.c:820
> #3  0x00005641a8b92d8d in qemu_ram_foreach_block
>     (func=func@entry=0x5641a8b08370 <ram_block_notify_add_single>, opaque=opaque@entry=0x5641ab8354d8)
>     at ../softmmu/physmem.c:3571
> #4  0x00005641a8b097af in ram_block_notifier_add (n=n@entry=0x5641ab8354d8) at ../hw/core/numa.c:844
> #5  0x00005641a89c474f in blk_ram_registrar_init (r=r@entry=0x5641ab8354d0, blk=<optimized out>)
>     at ../block/block-ram-registrar.c:46
> #6  0x00005641a8affe88 in virtio_blk_device_realize (dev=0x5641ab835230, errp=0x7ffc72657190)
>     at ../hw/block/virtio-blk.c:1239
> #7  0x00005641a8b4b716 in virtio_device_realize (dev=0x5641ab835230, errp=0x7ffc726571f0)
>     at ../hw/virtio/virtio.c:3684
> #8  0x00005641a8c2049b in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x7ffc72657380)
>     at ../hw/core/qdev.c:553
> #9  0x00005641a8c24738 in property_set_bool
>     (obj=0x5641ab835230, v=<optimized out>, name=<optimized out>, opaque=0x5641aa518820, errp=0x7ffc72657380)
>     at ../qom/object.c:2273
> #10 0x00005641a8c27683 in object_property_set
>     (obj=obj@entry=0x5641ab835230, name=name@entry=0x5641a8e9719a "realized", v=v@entry=0x5641ab83f5b0, errp=errp@entry=0x7ffc72657380) at ../qom/object.c:1408
> #11 0x00005641a8c2a97f in object_property_set_qobject
>     (obj=obj@entry=0x5641ab835230, name=name@entry=0x5641a8e9719a "realized", value=value@entry=0x5641ab83f4f0, errp=errp@entry=0x7ffc72657380) at ../qom/qom-qobject.c:28
> #12 0x00005641a8c27c84 in object_property_set_bool
>     (obj=0x5641ab835230, name=0x5641a8e9719a "realized", value=<optimized out>, errp=0x7ffc72657380)
>     at ../qom/object.c:1477
> #13 0x00005641a8929b74 in pci_qdev_realize (qdev=<optimized out>, errp=<optimized out>) at ../hw/pci/pci.c:2218
> #14 0x00005641a8c2049b in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x7ffc726575b0)
>     at ../hw/core/qdev.c:553
> #15 0x00005641a8c24738 in property_set_bool
>     (obj=0x5641ab82ce70, v=<optimized out>, name=<optimized out>, opaque=0x5641aa518820, errp=0x7ffc726575b0)
>     at ../qom/object.c:2273
> #16 0x00005641a8c27683 in object_property_set
> 
> Thanks,
> Stefano
> 
> On Tue, Sep 27, 2022 at 9:34 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > v5:
> > - Drop "RFC" since libblkio 1.0 has been released and the library API is stable
> > - Disable BDRV_REQ_REGISTERED_BUF if we run out of blkio_mem_regions. The
> >   bounce buffer slow path is taken when there are not enough blkio_mem_regions
> >   to cover guest RAM. [Hanna & David Hildenbrand]
> > - Call ram_block_discard_disable() when mem-region-pinned property is true or
> >   absent [David Hildenbrand]
> > - Use a bounce buffer pool instead of allocating/freeing a buffer for each
> >   request. This reduces the number of blkio_mem_regions required for bounce
> >   buffers to 1 and avoids frequent blkio_mem_region_map/unmap() calls.
> > - Switch to .bdrv_co_*() instead of .bdrv_aio_*(). Needed for the bounce buffer
> >   pool's CoQueue.
> > v4:
> > - Patch 1:
> >   - Add virtio-blk-vhost-user driver [Kevin]
> >   - Drop .bdrv_parse_filename() and .bdrv_needs_filename for virtio-blk-vhost-vdpa [Stefano]
> >   - Add copyright and license header [Hanna]
> >   - Drop .bdrv_parse_filename() in favor of --blockdev or json: [Hanna]
> >   - Clarify that "filename" is always non-NULL for io_uring [Hanna]
> >   - Check that virtio-blk-vhost-vdpa "path" option is non-NULL [Hanna]
> >   - Fix virtio-blk-vhost-vdpa cache.direct=off logic [Hanna]
> >   - Use macros for driver names [Hanna]
> >   - Assert that the driver name is valid [Hanna]
> >   - Update "readonly" property name to "read-only" [Hanna]
> >   - Call blkio_detach_aio_context() in blkio_close() [Hanna]
> >   - Avoid uint32_t * to int * casts in blkio_refresh_limits() [Hanna]
> >   - Remove write zeroes and discard from the todo list [Hanna]
> >   - Use PRIu32 instead of %d for uint32_t [Hanna]
> >   - Fix error messages with buf-alignment instead of optimal-io-size [Hanna]
> >   - Call map/unmap APIs since libblkio alloc/free APIs no longer do that
> >   - Update QAPI schema QEMU version to 7.2
> > - Patch 5:
> >   - Expand the BDRV_REQ_REGISTERED_BUF flag passthrough and drop assert(!flags)
> >     in drivers [Hanna]
> > - Patch 7:
> >   - Fix BLK->BDRV typo [Hanna]
> >   - Make BlockRAMRegistrar handle failure [Hanna]
> > - Patch 8:
> >   - Replace memory_region_get_fd() approach with qemu_ram_get_fd()
> > - Patch 10:
> >   - Use (void)ret; to discard unused return value [Hanna]
> >   - libblkio's blkio_unmap_mem_region() API no longer has a return value
> >   - Check for registered bufs that cross RAMBlocks [Hanna]
> > - Patch 11:
> >   - Handle bdrv_register_buf() errors [Hanna]
> > v3:
> > - Add virtio-blk-vhost-vdpa for vdpa-blk devices including VDUSE
> > - Add discard and write zeroes support
> > - Rebase and adopt latest libblkio APIs
> > v2:
> > - Add BDRV_REQ_REGISTERED_BUF to bs.supported_write_flags [Stefano]
> > - Use new blkioq_get_num_completions() API
> > - Implement .bdrv_refresh_limits()
> >
> > This patch series adds a QEMU BlockDriver for libblkio
> > (https://gitlab.com/libblkio/libblkio/), a library for high-performance block
> > device I/O. This work was presented at KVM Forum 2022 and slides are available
> > here:
> > https://static.sched.com/hosted_files/kvmforum2022/8c/libblkio-kvm-forum-2022.pdf
> >
> > The second patch adds the core BlockDriver and most of the libblkio API usage.
> > Three libblkio drivers are included:
> > - io_uring
> > - virtio-blk-vhost-user
> > - virtio-blk-vhost-vdpa
> >
> > The remainder of the patch series reworks the existing QEMU bdrv_register_buf()
> > API so that virtio-blk emulation can efficiently map guest RAM for libblkio - some
> > libblkio drivers require that I/O buffer memory is pre-registered (think VFIO,
> > vhost, etc).
> >
> > Vladimir requested performance results that show the effect of the
> > BDRV_REQ_REGISTERED_BUF flag. I ran the patches against qemu-storage-daemon's
> > vhost-user-blk export with iodepth=1 bs=512 to see the per-request overhead due
> > to bounce buffer allocation/mapping:
> >
> > Name                                   IOPS   Error
> > bounce-buf                          4373.81 ± 0.01%
> > registered-buf                     13062.80 ± 0.67%
> >
> > The version with the BDRV_REQ_REGISTERED_BUF optimization is about 3x faster.
> >
> > See the BlockDriver struct in block/blkio.c for a list of APIs that still need
> > to be implemented. The core functionality is covered.
> >
> > Regarding the design: each libblkio driver is a separately named BlockDriver.
> > That means there is an "io_uring" BlockDriver and not a generic "libblkio"
> > BlockDriver. This way QAPI and open parameters are type-safe and mandatory
> > parameters can be checked by QEMU.
> >
> > Stefan Hajnoczi (12):
> >   coroutine: add flag to re-queue at front of CoQueue
> >   blkio: add libblkio block driver
> >   numa: call ->ram_block_removed() in ram_block_notifer_remove()
> >   block: pass size to bdrv_unregister_buf()
> >   block: use BdrvRequestFlags type for supported flag fields
> >   block: add BDRV_REQ_REGISTERED_BUF request flag
> >   block: return errors from bdrv_register_buf()
> >   block: add BlockRAMRegistrar
> >   exec/cpu-common: add qemu_ram_get_fd()
> >   stubs: add qemu_ram_block_from_host() and qemu_ram_get_fd()
> >   blkio: implement BDRV_REQ_REGISTERED_BUF optimization
> >   virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint
> >
> >  MAINTAINERS                                 |    7 +
> >  meson_options.txt                           |    2 +
> >  qapi/block-core.json                        |   53 +-
> >  meson.build                                 |    9 +
> >  include/block/block-common.h                |    9 +
> >  include/block/block-global-state.h          |   10 +-
> >  include/block/block_int-common.h            |   15 +-
> >  include/exec/cpu-common.h                   |    1 +
> >  include/hw/virtio/virtio-blk.h              |    2 +
> >  include/qemu/coroutine.h                    |   15 +-
> >  include/sysemu/block-backend-global-state.h |    4 +-
> >  include/sysemu/block-ram-registrar.h        |   37 +
> >  block.c                                     |   14 +
> >  block/blkio.c                               | 1017 +++++++++++++++++++
> >  block/blkverify.c                           |    4 +-
> >  block/block-backend.c                       |    8 +-
> >  block/block-ram-registrar.c                 |   54 +
> >  block/crypto.c                              |    4 +-
> >  block/file-posix.c                          |    1 -
> >  block/gluster.c                             |    1 -
> >  block/io.c                                  |  101 +-
> >  block/mirror.c                              |    2 +
> >  block/nbd.c                                 |    1 -
> >  block/nvme.c                                |   20 +-
> >  block/parallels.c                           |    1 -
> >  block/qcow.c                                |    2 -
> >  block/qed.c                                 |    1 -
> >  block/raw-format.c                          |    2 +
> >  block/replication.c                         |    1 -
> >  block/ssh.c                                 |    1 -
> >  block/vhdx.c                                |    1 -
> >  hw/block/virtio-blk.c                       |   39 +-
> >  hw/core/numa.c                              |   17 +
> >  qemu-img.c                                  |    6 +-
> >  softmmu/physmem.c                           |    5 +
> >  stubs/physmem.c                             |   13 +
> >  tests/qtest/modules-test.c                  |    3 +
> >  util/qemu-coroutine-lock.c                  |    9 +-
> >  util/vfio-helpers.c                         |    5 +-
> >  block/meson.build                           |    2 +
> >  scripts/meson-buildoptions.sh               |    3 +
> >  stubs/meson.build                           |    1 +
> >  42 files changed, 1412 insertions(+), 91 deletions(-)
> >  create mode 100644 include/sysemu/block-ram-registrar.h
> >  create mode 100644 block/blkio.c
> >  create mode 100644 block/block-ram-registrar.c
> >  create mode 100644 stubs/physmem.c
> >
> > --
> > 2.37.3
> >
> 
