From: Jason Wang <jasowang@redhat.com>
To: Thibaut Collet <thibaut.collet@6wind.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 2/2] vhost user: Add RARP injection for legacy guest
Date: Fri, 26 Jun 2015 12:06:35 +0800 [thread overview]
Message-ID: <558CCFCB.3000706@redhat.com> (raw)
In-Reply-To: <CABUUfwNNrvdcr1cp88mnqnHKH9L7f=yj6oRWOYAsAoB_T9w=CQ@mail.gmail.com>
On 06/25/2015 10:22 PM, Thibaut Collet wrote:
> On Thu, Jun 25, 2015 at 2:53 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Thu, Jun 25, 2015 at 01:01:29PM +0200, Thibaut Collet wrote:
>>> On Thu, Jun 25, 2015 at 11:59 AM, Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>>
>>>> On 06/24/2015 07:05 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jun 24, 2015 at 04:31:15PM +0800, Jason Wang wrote:
>>>>>> On 06/23/2015 01:49 PM, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Jun 23, 2015 at 10:12:17AM +0800, Jason Wang wrote:
>>>>>>>>> On 06/18/2015 11:16 PM, Thibaut Collet wrote:
>>>>>>>>>>> On Tue, Jun 16, 2015 at 10:05 AM, Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>>>>>> On 06/16/2015 03:24 PM, Thibaut Collet wrote:
>>>>>>>>>>>>>>> If my understanding is correct, on a resume operation, we have the
>>>>>>>>>>>>>>> following callback trace:
>>>>>>>>>>>>>>> 1. virtio_pci_restore function that calls all restore call back of
>>>>>>>>>>>>>>> virtio devices
>>>>>>>>>>>>>>> 2. virtnet_restore that calls try_fill_recv function for each virtual queues
>>>>>>>>>>>>>>> 3. try_fill_recv function kicks the virtual queue (through
>>>>>>>>>>>>>>> virtqueue_kick function)
>>>>>>>>>>>>> Yes, but this happens only after pm resume not migration. Migration is
>>>>>>>>>>>>> totally transparent to guest.
>>>>>>>>>>>>>
>>>>>>>>>>> Hi Jason,
>>>>>>>>>>>
>>>>>>>>>>> After a deeper look in the migration code of QEMU a resume event is
>>>>>>>>>>> always sent when the live migration is finished.
>>>>>>>>>>> On a live migration we have the following callback trace:
>>>>>>>>>>> 1. The VM on the new host is set to the state RUN_STATE_INMIGRATE, the
>>>>>>>>>>> autostart boolean to 1 and calls the qemu_start_incoming_migration
>>>>>>>>>>> function (see function main of vl.c)
>>>>>>>>>>> .....
>>>>>>>>>>> 2. call of process_incoming_migration function in
>>>>>>>>>>> migration/migration.c file whatever the way to do the live migration
>>>>>>>>>>> (tcp:, fd:, unix:, exec: ...)
>>>>>>>>>>> 3. call of process_incoming_migration_co function in migration/migration.c
>>>>>>>>>>> 4. call of vm_start function in vl.c (otherwise the migrated VM stay
>>>>>>>>>>> in the pause state, the autostart boolean is set to 1 by the main
>>>>>>>>>>> function in vl.c)
>>>>>>>>>>> 5. call of vm_start function that sets the VM is the RUN_STATE_RUNNING state.
>>>>>>>>>>> 6. call of qapi_event_send_resume function that ends a resume event to the VM
>>>>>>>>> AFAIK, this function sends resume event to qemu monitor not VM.
>>>>>>>>>
>>>>>>>>>>> So when a live migration is ended:
>>>>>>>>>>> 1. a resume event is sent to the guest
>>>>>>>>>>> 2. On the reception of this resume event the virtual queue are kicked
>>>>>>>>>>> by the guest
>>>>>>>>>>> 3. Backend vhost user catches this kick and can emit a RARP to guest
>>>>>>>>>>> that does not support GUEST_ANNOUNCE
>>>>>>>>>>>
>>>>>>>>>>> This solution, as solution based on detection of DRIVER_OK status
>>>>>>>>>>> suggested by Michael, allows backend to send the RARP to legacy guest
>>>>>>>>>>> without involving QEMU and add ioctl to vhost-user.
>>>>>>>>> A question here is did vhost-user code pass status to the backend? If
>>>>>>>>> not, how can userspace backend detect DRIVER_OK?
>>>>>>> Sorry, I must have been unclear.
>>>>>>> vhost core calls VHOST_NET_SET_BACKEND on DRIVER_OK.
>>>>>>> Unfortunately vhost user currently translates it to VHOST_USER_NONE.
>>>>>> Looks like VHOST_NET_SET_BACKEND was only used for tap backend.
>>>>>>
>>>>>>> As a work around, I think kicking ioeventfds once you get
>>>>>>> VHOST_NET_SET_BACKEND will work.
>>>>>> Maybe just a eventfd_set() in vhost_net_start(). But is this
>>>>>> "workaround" elegant enough to be documented? Is it better to do this
>>>>>> explicitly with a new feature?
>>>>> If you are going to do this anyway, there are a couple of other changes
>>>>> we should do, in particular, decide what we want to do with control vq.
>>>>>
>>>> If I understand correctly, you mean VIRTIO_NET_CTRL_MQ and
>>>> VIRTIO_NET_CTRL_GUEST_OFFLOADS? Looks like both of these were broken.
>>>> Need more thought, maybe new kinds of requests.
>>>>
>>>>
>>> Are there any objections to add VHOST_NET_SET_BACKEND support to vhost
>>> user with a patch like that:
>>>
>>>
>>> hw/net/vhost_net.c | 8 ++++++++
>>> hw/virtio/vhost-user.c | 10 +++++++++-
>>> 2 files changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>> index 907e002..7a008c0 100644
>>> --- a/hw/net/vhost_net.c
>>> +++ b/hw/net/vhost_net.c
>>> @@ -234,6 +234,14 @@ static int vhost_net_start_one(struct vhost_net *net,
>>> goto fail;
>>> }
>>> }
>>> + } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
>>> + file.fd = 0;
>>> + for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
>>> + const VhostOps *vhost_ops = net->dev.vhost_ops;
>>> + int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
>>> + &file);
>>> + assert(r >= 0);
>>> + }
>>> }
>>> return 0;
>>> fail:
>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>> index d6f2163..32c6bd9 100644
>>> --- a/hw/virtio/vhost-user.c
>>> +++ b/hw/virtio/vhost-user.c
>>> @@ -41,6 +41,7 @@ typedef enum VhostUserRequest {
>>> VHOST_USER_SET_VRING_KICK = 12,
>>> VHOST_USER_SET_VRING_CALL = 13,
>>> VHOST_USER_SET_VRING_ERR = 14,
>>> + VHOST_USER_NET_SET_BACKEND = 15,
>>> VHOST_USER_MAX
>>> } VhostUserRequest;
>>>
>>> @@ -104,7 +105,8 @@ static unsigned long int
>>> ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
>>> VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */
>>> VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */
>>> VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */
>>> - VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
>>> + VHOST_SET_VRING_ERR, /* VHOST_USER_SET_VRING_ERR */
>>> + VHOST_NET_SET_BACKEND /* VHOST_USER_NET_SET_BACKEND */
>>> };
>>>
>>> static VhostUserRequest vhost_user_request_translate(unsigned long int request)
>>> @@ -287,6 +289,12 @@ static int vhost_user_call(struct vhost_dev *dev,
>>> unsigned long int request,
>>> msg.u64 |= VHOST_USER_VRING_NOFD_MASK;
>>> }
>>> break;
>>> +
>>> + case VHOST_NET_SET_BACKEND:
>>> + memcpy(&msg.file, arg, sizeof(struct vhost_vring_state));
>>> + msg.size = sizeof(m.state);
>>> + break;
>>> +
>>> default:
>>> error_report("vhost-user trying to send unhandled ioctl");
>>> return -1;
>>>
>>>
>>> This message will be sent when guest is ready and can be used by vhost
>>> user backend to send RARP to legacy guest.
>>>
>>> This solution avoids to add new message and has no impact on control vq.
>>
>> I think that you can't add messages to protocol unconditionally.
>> For example, snabbswitch simply crashes if it gets an unknown
>> message.
>>
>> Either this needs a new feature bit, or implement
>> [PATCH RFC] vhost-user: protocol extensions
>> making it safe to add new messages.
>>
>> --
>> MST
> I understand.
> Last idea before doing a RFC:
> Normally guest notifies vhost of new buffer onto a virtqueue by
> kicking the eventfd. This eventfd has been provided to vhost by QEMU.
> So when DRIVER_OK is received by QEMU, QEMU can kick the eventfd.
>
> A possible patch to do that is:
> hw/net/vhost_net.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 907e002..fbc55e0 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -234,6 +234,13 @@ static int vhost_net_start_one(struct vhost_net *net,
> goto fail;
> }
> }
> + } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
> + int idx;
> +
> + for (idx = 0; idx < net->dev.nvqs; ++idx) {
> + struct VirtQueue *vq = virtio_get_queue(dev, idx);
> + event_notifier_set(virtio_queue_get_host_notifier(vq));
> + }
> }
> return 0;
> fail:
>
> kicking this eventfd has no impact for QEMU or the guest (they do not
> poll it) and simply wake up vhost to allow it to send RARP for legacy
> guest.
>
> Regards.
>
> Thibaut.
This may work but the issue is:
- I believe we should document this in the spec. But it looks more like
a workaround and use implicit method to notify control message which
often cause ulgy codes in both side. This is not elegant to be
documented in the spec.
- Consider you may have 100 queues, then kick will happen 100 times and
backend need handle such case.
next prev parent reply other threads:[~2015-06-26 4:06 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-10 13:43 [Qemu-devel] [PATCH v3 0/2] Add live migration for vhost user Thibaut Collet
2015-06-10 13:43 ` [Qemu-devel] [PATCH v3 1/2] vhost user: add support of live migration Thibaut Collet
2015-06-10 13:52 ` Michael S. Tsirkin
2015-06-10 14:22 ` Thibaut Collet
2015-06-10 14:27 ` Michael S. Tsirkin
2015-06-10 15:24 ` Thibaut Collet
2015-06-10 15:34 ` Stefan Hajnoczi
2015-06-23 6:01 ` Michael S. Tsirkin
2015-06-10 13:43 ` [Qemu-devel] [PATCH v3 2/2] vhost user: Add RARP injection for legacy guest Thibaut Collet
2015-06-10 15:34 ` Michael S. Tsirkin
2015-06-10 15:48 ` Thibaut Collet
2015-06-10 16:00 ` Michael S. Tsirkin
2015-06-10 20:25 ` Thibaut Collet
2015-06-10 20:50 ` Michael S. Tsirkin
2015-06-11 5:34 ` Thibaut Collet
2015-06-11 5:39 ` Jason Wang
2015-06-11 5:49 ` Thibaut Collet
2015-06-11 5:54 ` Jason Wang
2015-06-11 10:38 ` Michael S. Tsirkin
2015-06-11 12:10 ` Thibaut Collet
2015-06-11 12:13 ` Michael S. Tsirkin
2015-06-11 12:33 ` Thibaut Collet
2015-06-12 7:55 ` Jason Wang
2015-06-12 11:53 ` Thibaut Collet
2015-06-12 14:28 ` Michael S. Tsirkin
2015-06-15 7:43 ` Jason Wang
2015-06-15 8:44 ` Michael S. Tsirkin
2015-06-15 12:12 ` Thibaut Collet
2015-06-15 12:45 ` Michael S. Tsirkin
2015-06-15 13:04 ` Thibaut Collet
2015-06-16 5:29 ` Jason Wang
2015-06-16 7:24 ` Thibaut Collet
2015-06-16 8:05 ` Jason Wang
2015-06-16 8:16 ` Thibaut Collet
2015-06-17 4:16 ` Jason Wang
2015-06-17 6:42 ` Michael S. Tsirkin
2015-06-17 7:05 ` Thibaut Collet
2015-06-18 15:16 ` Thibaut Collet
2015-06-23 2:12 ` Jason Wang
2015-06-23 5:49 ` Michael S. Tsirkin
2015-06-24 8:31 ` Jason Wang
2015-06-24 11:05 ` Michael S. Tsirkin
2015-06-25 9:59 ` Jason Wang
2015-06-25 11:01 ` Thibaut Collet
2015-06-25 12:53 ` Michael S. Tsirkin
2015-06-25 14:22 ` Thibaut Collet
2015-06-26 4:06 ` Jason Wang [this message]
2015-06-16 3:35 ` Jason Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=558CCFCB.3000706@redhat.com \
--to=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=thibaut.collet@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).