From: Stefan Hajnoczi <stefanha@redhat.com>
To: Coiby Xu <coiby.xu@gmail.com>
Cc: kwolf@redhat.com,
"open list:Block layer core" <qemu-block@nongnu.org>,
qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
bharatlkmlkvm@gmail.com, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v8 3/4] vhost-user block device backend server
Date: Thu, 11 Jun 2020 16:24:52 +0100 [thread overview]
Message-ID: <20200611152452.GC77457@stefanha-x1.localdomain> (raw)
In-Reply-To: <20200604233538.256325-4-coiby.xu@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 9655 bytes --]
On Fri, Jun 05, 2020 at 07:35:37AM +0800, Coiby Xu wrote:
> +static void coroutine_fn vu_block_virtio_process_req(void *opaque)
> +{
> + struct req_data *data = opaque;
> + VuServer *server = data->server;
> + VuVirtq *vq = data->vq;
> + VuVirtqElement *elem = data->elem;
> + uint32_t type;
> + VuBlockReq *req;
> +
> + VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
> + BlockBackend *backend = vdev_blk->backend;
> +
> + struct iovec *in_iov = elem->in_sg;
> + struct iovec *out_iov = elem->out_sg;
> + unsigned in_num = elem->in_num;
> + unsigned out_num = elem->out_num;
> + /* refer to hw/block/virtio_blk.c */
> + if (elem->out_num < 1 || elem->in_num < 1) {
> + error_report("virtio-blk request missing headers");
> + free(elem);
> + return;
> + }
> +
> + req = g_new0(VuBlockReq, 1);
elem was allocated with enough space for VuBlockReq. Can this allocation
be eliminated?
typedef struct VuBlockReq {
- VuVirtqElement *elem;
+ VuVirtqElement elem;
int64_t sector_num;
size_t size;
struct virtio_blk_inhdr *in;
struct virtio_blk_outhdr out;
VuServer *server;
struct VuVirtq *vq;
} VuBlockReq;
req = vu_queue_pop(vu_dev, vq, sizeof(*req));
> + req->server = server;
> + req->vq = vq;
> + req->elem = elem;
> +
> + if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
> + sizeof(req->out)) != sizeof(req->out))) {
> + error_report("virtio-blk request outhdr too short");
> + goto err;
> + }
> +
> + iov_discard_front(&out_iov, &out_num, sizeof(req->out));
> +
> + if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
> + error_report("virtio-blk request inhdr too short");
> + goto err;
> + }
> +
> + /* We always touch the last byte, so just see how big in_iov is. */
> + req->in = (void *)in_iov[in_num - 1].iov_base
> + + in_iov[in_num - 1].iov_len
> + - sizeof(struct virtio_blk_inhdr);
> + iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
> +
> +
> + type = le32toh(req->out.type);
This implementation assumes the request is always little-endian. This is
true for VIRTIO 1.0+ but not for older versions. Please check that
VIRTIO_F_VERSION_1 has been set.
In QEMU code the le32_to_cpu(), le64_to_cpu(), etc are common used
instead of le32toh(), etc.
> + switch (type & ~VIRTIO_BLK_T_BARRIER) {
> + case VIRTIO_BLK_T_IN:
> + case VIRTIO_BLK_T_OUT: {
> + ssize_t ret = 0;
> + bool is_write = type & VIRTIO_BLK_T_OUT;
> + req->sector_num = le64toh(req->out.sector);
> +
> + int64_t offset = req->sector_num * vdev_blk->blk_size;
> + QEMUIOVector *qiov = g_new0(QEMUIOVector, 1);
This can be allocated on the stack:
QEMUIOVector qiov;
> +static void vhost_user_blk_server_free(VuBlockDev *vu_block_device)
> +{
I'm unsure why this is a separate from vu_block_free(). Neither of these
functions actually free VuBlockDev, so the name could be changed to
vhost_user_blk_server_stop().
> + if (!vu_block_device) {
> + return;
> + }
> + vhost_user_server_stop(&vu_block_device->vu_server);
> + vu_block_free(vu_block_device);
> +
> +}
> +
> +/*
> + * A exported drive can serve multiple multiple clients simutateously,
> + * thus no need to export the same drive twice.
This comment is outdated. Only one client is served at a time.
> +static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
> + Error **errp)
> +{
> +
> + const char *name = vu_block_device->node_name;
> + SocketAddress *addr = vu_block_device->addr;
> + char *unix_socket = vu_block_device->addr->u.q_unix.path;
> +
> + if (vu_block_dev_find(name)) {
> + error_setg(errp, "Vhost-user-blk server with node-name '%s' "
> + "has already been started",
> + name);
> + return;
> + }
I think blk_new() permissions should prevent multiple writers. Having
multiple readers would be okay. Therefore this check can be removed.
> +
> + if (vu_block_dev_find_by_unix_socket(unix_socket)) {
> + error_setg(errp, "Vhost-user-blk server with with socket_path '%s' "
> + "has already been started", unix_socket);
> + return;
> + }
Is it a problem if the same path is reused? I don't see an issue if the
user creates a vhost-user-blk server, connects a client, unlinks the
UNIX domain socket, and creates a new vhost-user-blk server with the
same path. It might be a little confusing but if the user wants to do
it, I don't see a reason to stop them.
> +
> + if (!vu_block_init(vu_block_device, errp)) {
> + return;
> + }
> +
> +
> + AioContext *ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
> +
> + if (!vhost_user_server_start(VHOST_USER_BLK_MAX_QUEUES, addr, ctx,
> + &vu_block_device->vu_server,
> + NULL, &vu_block_iface,
> + errp)) {
In the previous patch I mentioned that calling g_free(server) is
probably unexpected and here is an example of why that can be a problem.
vu_server is a struct field, not an independent heap-allocated object,
so calling g_free(server) will result in undefined behavior (freeing an
object that was not allocated with g_new()).
> + goto error;
> + }
> +
> + QTAILQ_INSERT_TAIL(&vu_block_devs, vu_block_device, next);
> + blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
> + blk_aio_detach, vu_block_device);
> + return;
> +
> + error:
> + vu_block_free(vu_block_device);
> +}
> +
> +static void vu_set_node_name(Object *obj, const char *value, Error **errp)
> +{
> + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> + if (vus->node_name) {
> + error_setg(errp, "evdev property already set");
> + return;
> + }
Setting it twice is okay, we just need to g_free(vus->node_name).
> +
> + vus->node_name = g_strdup(value);
> +}
> +
> +static char *vu_get_node_name(Object *obj, Error **errp)
> +{
> + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> + return g_strdup(vus->node_name);
> +}
> +
> +
> +static void vu_set_unix_socket(Object *obj, const char *value,
> + Error **errp)
> +{
> + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> + if (vus->addr) {
> + error_setg(errp, "unix_socket property already set");
> + return;
> + }
Setting it twice is okay, we just need to
g_free(vus->addr->u.q_unix.path) and g_free(vus->addr).
> +static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
> + void *opaque, Error **errp)
> +{
> + VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
> +
> + Error *local_err = NULL;
> + uint32_t value;
> +
> + visit_type_uint32(v, name, &value, &local_err);
> + if (local_err) {
> + goto out;
> + }
> + if (value != BDRV_SECTOR_SIZE && value != 4096) {
> + error_setg(&local_err,
> + "Property '%s.%s' can only take value 512 or 4096",
> + object_get_typename(obj), name);
> + goto out;
> + }
Please see hw/core/qdev-properties.c:set_blocksize() for input
validation checks (min=512, max=32768, must be a power of 2). This code
can be moved into a common utility function so that both
hw/core/qdev-properties.c and vhost-user-blk-server.c can use it.
> +
> + vus->blk_size = value;
> +
> +out:
> + error_propagate(errp, local_err);
> + vus->blk_size = value;
> +}
> +
> +
> +static void vhost_user_blk_server_instance_finalize(Object *obj)
> +{
> + VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
> +
> + vhost_user_blk_server_free(vub);
> +}
> +
> +static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
> +{
> + Error *local_error = NULL;
> + VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
> +
> + vhost_user_blk_server_start(vub, &local_error);
After this call succeeds the properties should become read-only
("writable", "node-name", "unix-socket", etc) to prevent modification at
runtime.
I think the easiest way to do that is by keeping a bool field in
VuBlockDev that the property setter functions can check.
> +
> + if (local_error) {
> + error_propagate(errp, local_error);
> + return;
> + }
> +}
> +
> +static void vhost_user_blk_server_class_init(ObjectClass *klass,
> + void *class_data)
> +{
> + UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
> + ucc->complete = vhost_user_blk_server_complete;
> +
> + object_class_property_add_bool(klass, "writable",
> + vu_get_block_writable,
> + vu_set_block_writable);
> +
> + object_class_property_add_str(klass, "node-name",
> + vu_get_node_name,
> + vu_set_node_name);
> +
> + object_class_property_add_str(klass, "unix-socket",
> + vu_get_unix_socket,
> + vu_set_unix_socket);
> +
> + object_class_property_add(klass, "blk-size", "uint32",
> + vu_get_blk_size, vu_set_blk_size,
> + NULL, NULL);
include/hw/block/block.h:DEFINE_BLOCK_PROPERTIES_BASE calls this
property "logical_block_size". Please use the same name for consistency.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2020-06-11 15:30 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-04 23:35 [PATCH v8 0/4] vhost-user block device backend implementation Coiby Xu
2020-06-04 23:35 ` [PATCH v8 1/4] Allow vu_message_read to be replaced Coiby Xu
2020-06-11 10:45 ` Stefan Hajnoczi
2020-06-11 11:26 ` Marc-André Lureau
2020-06-04 23:35 ` [PATCH v8 2/4] generic vhost user server Coiby Xu
2020-06-11 13:14 ` Stefan Hajnoczi
2020-06-14 18:43 ` Coiby Xu
2020-06-04 23:35 ` [PATCH v8 3/4] vhost-user block device backend server Coiby Xu
2020-06-11 15:24 ` Stefan Hajnoczi [this message]
2020-06-14 19:04 ` Coiby Xu
2020-06-04 23:35 ` [PATCH v8 4/4] new qTest case to test the vhost-user-blk-server Coiby Xu
2020-06-05 5:01 ` Thomas Huth
2020-06-05 6:22 ` Coiby Xu
2020-06-05 9:25 ` Thomas Huth
2020-06-05 13:27 ` Coiby Xu
2020-06-11 12:37 ` [PATCH v8 0/4] vhost-user block device backend implementation Stefano Garzarella
2020-06-14 18:46 ` Coiby Xu
2020-06-15 8:46 ` Stefano Garzarella
2020-06-16 6:55 ` Coiby Xu
2020-06-11 15:27 ` Stefan Hajnoczi
2020-06-12 15:58 ` Coiby Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200611152452.GC77457@stefanha-x1.localdomain \
--to=stefanha@redhat.com \
--cc=bharatlkmlkvm@gmail.com \
--cc=coiby.xu@gmail.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).