From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: qemu-devel@nongnu.org
Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com>,
changchun.ouyang@intel.com, mst@redhat.com
Subject: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support
Date: Tue, 8 Sep 2015 15:38:46 +0800
Message-ID: <1441697927-16456-7-git-send-email-yuanhan.liu@linux.intel.com>
In-Reply-To: <1441697927-16456-1-git-send-email-yuanhan.liu@linux.intel.com>
From: Ouyang Changchun <changchun.ouyang@intel.com>
This patch is initially based on a patch from Nikolay Nikolaev.

Here is the latest version of vhost-user multiple queue support, done
by creating an nc and vhost_net pair for each queue.

What differs from the last version is that this patch addresses two
major concerns from Michael and fixes one hidden bug.

- Concern #1: no feedback when the backend can't support the number of
  requested queues (given with the queues=# option).

  We address this by querying the VHOST_USER_PROTOCOL_F_MQ protocol
  feature first: if it is not set, the backend doesn't support the mq
  feature, and max_queues is set to 1. Otherwise, we send another
  message, VHOST_USER_GET_QUEUE_NUM, to get the max_queues the backend
  supports.

  At the vhost-user initialization stage (vhost_user_start), we then
  initialize one queue first, which, in the meantime, also fetches
  max_queues. We then do a simple comparison: if requested_queues >
  max_queues, we exit (it should be safe to exit here, as the vm is
  not running yet).
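
  In rough pseudo-code, the probe looks like this (an illustrative
  sketch only; query_queue_num() is a made-up name, not a helper in
  this series):

      int max_queues = 1;

      /* VHOST_USER_GET_QUEUE_NUM may only be sent once the slave has
       * acknowledged the VHOST_USER_PROTOCOL_F_MQ protocol feature.
       */
      if (protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) {
          max_queues = query_queue_num(dev);    /* made-up helper */
      }

      if (requested_queues > max_queues) {
          error_report("you are asking for more queues than supported");
          exit(1);
      }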

- Concern #2: some messages are sent more often than necessary.

  We reached an agreement with Michael to categorize vhost-user
  messages into two types: non-vring specific messages, which should
  be sent only once, and vring specific messages, which should be
  sent per queue.

  Here I introduced a helper function, vhost_user_one_time_request(),
  which treats the following messages as non-vring specific:
      VHOST_USER_GET_FEATURES
      VHOST_USER_SET_FEATURES
      VHOST_USER_GET_PROTOCOL_FEATURES
      VHOST_USER_SET_PROTOCOL_FEATURES
      VHOST_USER_SET_OWNER
      VHOST_USER_RESET_DEVICE
      VHOST_USER_SET_MEM_TABLE
      VHOST_USER_GET_QUEUE_NUM

  For the above messages, any attempt to send them after the first
  time is simply ignored, as the one-liner below shows.
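
  Concretely, the gate in vhost_user_call() is a one-liner; requests
  in this list coming from any queue but the first are dropped (this
  mirrors the hunk in hw/virtio/vhost-user.c below):

      if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0) {
          return 0;
      }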

I also observed a hidden bug in the last version. We registered the
char dev event handler N times, which is not only unnecessary but also
buggy: a later registration overwrites the former one, since
qemu_chr_add_handlers() does not chain handlers but instead replaces
the old one. So, in theory, invoking qemu_chr_add_handlers() N times
will not end up calling the handler N times.

However, the handler does get invoked N times, because we start the
backend (the connection server) first; hence, by the time
net_vhost_user_init() is executed, the connection is already
established, and qemu_chr_add_handlers() invokes the handler
immediately. The effect is as if we called the handler
(net_vhost_user_event) directly from net_vhost_user_init().
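
To see why the last registration wins, note that
qemu_chr_add_handlers() boils down to plain assignments on the chardev
(a simplified sketch of its effect, not the actual code):

    /* simplified: each call replaces what the previous caller set */
    chr->chr_event = fd_event;
    chr->handler_opaque = opaque;

So registering once, on an upper-level structure that knows about all
N queues, is the only way to get a single, correct callback.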

The solution I came up with is to make VhostUserState an upper-level
structure that includes N nc and vhost_net pairs:

    struct VhostUserNetPeer {
        NetClientState *nc;
        VHostNetState *vhost_net;
    };

    typedef struct VhostUserState {
        CharDriverState *chr;
        bool running;
        int queues;
        struct VhostUserNetPeer peers[];
    } VhostUserState;
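
Thanks to the flexible array member, a single allocation covers the
state plus all N peers; the patch sizes it accordingly in
net_init_vhost_user() (see the corresponding hunk below):

    s = g_malloc0(sizeof(VhostUserState) +
                  queues * sizeof(struct VhostUserNetPeer));
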
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
docs/specs/vhost-user.txt | 13 +++++
hw/virtio/vhost-user.c | 31 ++++++++++-
include/net/net.h | 1 +
net/vhost-user.c | 136 ++++++++++++++++++++++++++++++++--------------
qapi-schema.json | 6 +-
qemu-options.hx | 5 +-
6 files changed, 146 insertions(+), 46 deletions(-)
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 43db9b4..99d79be 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -135,6 +135,19 @@ As older slaves don't support negotiating protocol features,
a feature bit was dedicated for this purpose:
#define VHOST_USER_F_PROTOCOL_FEATURES 30
+Multiple queue support
+----------------------
+Multiqueue is treated as a protocol extension, hence the slave has to
+implement protocol features first. Multiple queues are supported only when
+the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set.
+
+The max number of queues the slave supports can be queried with the message
+VHOST_USER_GET_QUEUE_NUM. The master should stop when the number of requested
+queues is bigger than that.
+
+As all queues share one connection, the master uses a unique index for each
+queue in the sent message to identify a specific queue.
+
Message types
-------------
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8046bc0..11e46b5 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -187,6 +187,23 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
0 : -1;
}
+static bool vhost_user_one_time_request(VhostUserRequest request)
+{
+ switch (request) {
+ case VHOST_USER_GET_FEATURES:
+ case VHOST_USER_SET_FEATURES:
+ case VHOST_USER_GET_PROTOCOL_FEATURES:
+ case VHOST_USER_SET_PROTOCOL_FEATURES:
+ case VHOST_USER_SET_OWNER:
+ case VHOST_USER_RESET_DEVICE:
+ case VHOST_USER_SET_MEM_TABLE:
+ case VHOST_USER_GET_QUEUE_NUM:
+ return true;
+ default:
+ return false;
+ }
+}
+
static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
void *arg)
{
@@ -206,6 +223,14 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
else
msg_request = request;
+ /*
+ * For non-vring specific requests, like VHOST_USER_GET_FEATURES,
+ * we only need to send it the first time. Later requests of this
+ * kind are simply ignored.
+ */
+ if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0)
+ return 0;
+
msg.request = msg_request;
msg.flags = VHOST_USER_VERSION;
msg.size = 0;
@@ -268,17 +293,20 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
case VHOST_USER_SET_VRING_NUM:
case VHOST_USER_SET_VRING_BASE:
memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+ msg.state.index += dev->vq_index;
msg.size = sizeof(m.state);
break;
case VHOST_USER_GET_VRING_BASE:
memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+ msg.state.index += dev->vq_index;
msg.size = sizeof(m.state);
need_reply = 1;
break;
case VHOST_USER_SET_VRING_ADDR:
memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
+ msg.addr.index += dev->vq_index;
msg.size = sizeof(m.addr);
break;
@@ -286,7 +314,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
case VHOST_USER_SET_VRING_CALL:
case VHOST_USER_SET_VRING_ERR:
file = arg;
- msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
+ msg.u64 = (file->index + dev->vq_index) & VHOST_USER_VRING_IDX_MASK;
msg.size = sizeof(m.u64);
if (ioeventfd_enabled() && file->fd > 0) {
fds[fd_num++] = file->fd;
@@ -330,6 +358,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
error_report("Received bad msg size.");
return -1;
}
+ msg.state.index -= dev->vq_index;
memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
break;
default:
diff --git a/include/net/net.h b/include/net/net.h
index 6a6cbef..6f20656 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
NetClientDestructor *destructor;
unsigned int queue_index;
unsigned rxfilter_notify_enabled:1;
+ void *opaque;
};
typedef struct NICState {
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 2d6bbe5..7d4ac69 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -15,10 +15,16 @@
#include "qemu/config-file.h"
#include "qemu/error-report.h"
+struct VhostUserNetPeer {
+ NetClientState *nc;
+ VHostNetState *vhost_net;
+};
+
typedef struct VhostUserState {
- NetClientState nc;
CharDriverState *chr;
- VHostNetState *vhost_net;
+ bool running;
+ int queues;
+ struct VhostUserNetPeer peers[];
} VhostUserState;
typedef struct VhostUserChardevProps {
@@ -29,48 +35,75 @@ typedef struct VhostUserChardevProps {
VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
{
- VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+ VhostUserState *s = nc->opaque;
assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
- return s->vhost_net;
-}
-
-static int vhost_user_running(VhostUserState *s)
-{
- return (s->vhost_net) ? 1 : 0;
+ return s->peers[nc->queue_index].vhost_net;
}
static int vhost_user_start(VhostUserState *s)
{
VhostNetOptions options;
+ VHostNetState *vhost_net;
+ int max_queues;
+ int i = 0;
- if (vhost_user_running(s)) {
+ if (s->running)
return 0;
- }
options.backend_type = VHOST_BACKEND_TYPE_USER;
- options.net_backend = &s->nc;
options.opaque = s->chr;
- s->vhost_net = vhost_net_init(&options);
+ options.net_backend = s->peers[i].nc;
+ vhost_net = s->peers[i++].vhost_net = vhost_net_init(&options);
+
+ max_queues = vhost_net_get_max_queues(vhost_net);
+ if (s->queues > max_queues) {
+ error_report("you are asking for more queues than supported: %d",
+ max_queues);
+ return -1;
+ }
+
+ for (; i < s->queues; i++) {
+ options.net_backend = s->peers[i].nc;
+
+ s->peers[i].vhost_net = vhost_net_init(&options);
+ if (!s->peers[i].vhost_net)
+ return -1;
+ }
+ s->running = true;
- return vhost_user_running(s) ? 0 : -1;
+ return 0;
}
static void vhost_user_stop(VhostUserState *s)
{
- if (vhost_user_running(s)) {
- vhost_net_cleanup(s->vhost_net);
+ int i;
+ VHostNetState *vhost_net;
+
+ if (!s->running)
+ return;
+
+ for (i = 0; i < s->queues; i++) {
+ vhost_net = s->peers[i].vhost_net;
+ if (vhost_net)
+ vhost_net_cleanup(vhost_net);
}
- s->vhost_net = 0;
+ s->running = false;
}
static void vhost_user_cleanup(NetClientState *nc)
{
- VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+ VhostUserState *s = nc->opaque;
+ VHostNetState *vhost_net = s->peers[nc->queue_index].vhost_net;
+
+ if (vhost_net)
+ vhost_net_cleanup(vhost_net);
- vhost_user_stop(s);
qemu_purge_queued_packets(nc);
+
+ if (nc->queue_index == s->queues - 1)
+ g_free(s);
}
static bool vhost_user_has_vnet_hdr(NetClientState *nc)
@@ -89,7 +122,7 @@ static bool vhost_user_has_ufo(NetClientState *nc)
static NetClientInfo net_vhost_user_info = {
.type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
- .size = sizeof(VhostUserState),
+ .size = sizeof(NetClientState),
.cleanup = vhost_user_cleanup,
.has_vnet_hdr = vhost_user_has_vnet_hdr,
.has_ufo = vhost_user_has_ufo,
@@ -97,18 +130,25 @@ static NetClientInfo net_vhost_user_info = {
static void net_vhost_link_down(VhostUserState *s, bool link_down)
{
- s->nc.link_down = link_down;
+ NetClientState *nc;
+ int i;
- if (s->nc.peer) {
- s->nc.peer->link_down = link_down;
- }
+ for (i = 0; i < s->queues; i++) {
+ nc = s->peers[i].nc;
- if (s->nc.info->link_status_changed) {
- s->nc.info->link_status_changed(&s->nc);
- }
+ nc->link_down = link_down;
+
+ if (nc->peer) {
+ nc->peer->link_down = link_down;
+ }
- if (s->nc.peer && s->nc.peer->info->link_status_changed) {
- s->nc.peer->info->link_status_changed(s->nc.peer);
+ if (nc->info->link_status_changed) {
+ nc->info->link_status_changed(nc);
+ }
+
+ if (nc->peer && nc->peer->info->link_status_changed) {
+ nc->peer->info->link_status_changed(nc->peer);
+ }
}
}
@@ -118,7 +158,8 @@ static void net_vhost_user_event(void *opaque, int event)
switch (event) {
case CHR_EVENT_OPENED:
- vhost_user_start(s);
+ if (vhost_user_start(s) < 0)
+ exit(1);
net_vhost_link_down(s, false);
error_report("chardev \"%s\" went up", s->chr->label);
break;
@@ -131,24 +172,28 @@ static void net_vhost_user_event(void *opaque, int event)
}
static int net_vhost_user_init(NetClientState *peer, const char *device,
- const char *name, CharDriverState *chr)
+ const char *name, VhostUserState *s)
{
NetClientState *nc;
- VhostUserState *s;
+ CharDriverState *chr = s->chr;
+ int i;
- nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
+ for (i = 0; i < s->queues; i++) {
+ nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
- snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
- chr->label);
+ snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
+ i, chr->label);
- s = DO_UPCAST(VhostUserState, nc, nc);
+ /* We don't provide a receive callback */
+ nc->receive_disabled = 1;
- /* We don't provide a receive callback */
- s->nc.receive_disabled = 1;
- s->chr = chr;
- nc->queue_index = 0;
+ nc->queue_index = i;
+ nc->opaque = s;
- qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
+ s->peers[i].nc = nc;
+ }
+
+ qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s);
return 0;
}
@@ -227,8 +272,10 @@ static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
int net_init_vhost_user(const NetClientOptions *opts, const char *name,
NetClientState *peer, Error **errp)
{
+ int queues;
const NetdevVhostUserOptions *vhost_user_opts;
CharDriverState *chr;
+ VhostUserState *s;
assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
vhost_user_opts = opts->vhost_user;
@@ -244,6 +291,11 @@ int net_init_vhost_user(const NetClientOptions *opts, const char *name,
return -1;
}
+ queues = vhost_user_opts->has_queues ? vhost_user_opts->queues : 1;
+ s = g_malloc0(sizeof(VhostUserState) +
+ queues * sizeof(struct VhostUserNetPeer));
+ s->queues = queues;
+ s->chr = chr;
- return net_vhost_user_init(peer, "vhost_user", name, chr);
+ return net_vhost_user_init(peer, "vhost_user", name, s);
}
diff --git a/qapi-schema.json b/qapi-schema.json
index 67fef37..55c33db 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2480,12 +2480,16 @@
#
# @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
#
+# @queues: #optional number of queues to be created for multiqueue vhost-user
+# (default: 1) (Since 2.5)
+#
# Since 2.1
##
{ 'struct': 'NetdevVhostUserOptions',
'data': {
'chardev': 'str',
- '*vhostforce': 'bool' } }
+ '*vhostforce': 'bool',
+ '*queues': 'int' } }
##
# @NetClientOptions
diff --git a/qemu-options.hx b/qemu-options.hx
index 77f5853..5bfa7a3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1963,13 +1963,14 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
netdev. @code{-net} and @code{-device} with parameter @option{vlan} create the
required hub automatically.
-@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
+@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
be a unix domain socket backed one. The vhost-user uses a specifically defined
protocol to pass vhost ioctl replacement messages to an application on the other
end of the socket. On non-MSIX guests, the feature can be forced with
-@var{vhostforce}.
+@var{vhostforce}. Use 'queues=@var{n}' to specify the number of queues to
+be created for multiqueue vhost-user.
Example:
@example
--
1.9.0