From: Jason Wang <jasowang@redhat.com>
To: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org,
virtio-dev@lists.oasis-open.org, jesse.brandeburg@intel.com,
alexander.h.duyck@intel.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
Date: Wed, 13 Jun 2018 13:40:59 +0800 [thread overview]
Message-ID: <ce4da2b0-88fd-c79b-0567-fe564bb34922@redhat.com> (raw)
In-Reply-To: <c29a5ac9-cfb5-0a19-f28c-30d41bd5efcf@intel.com>
On 2018年06月13日 12:24, Samudrala, Sridhar wrote:
> On 6/12/2018 7:38 PM, Jason Wang wrote:
>>
>>
>> On 2018年06月12日 19:54, Michael S. Tsirkin wrote:
>>> On Wed, Jun 06, 2018 at 10:29:03AM +0800, Jason Wang wrote:
>>>>
>>>> On 2018年06月05日 20:33, Michael S. Tsirkin wrote:
>>>>> I don't think this is sufficient.
>>>>>
>>>>> If both primary and standby devices are present, a legacy guest
>>>>> without
>>>>> support for the feature might see two devices with same mac and get
>>>>> confused.
>>>>>
>>>>> I think that we should only make primary visible after guest acked
>>>>> the
>>>>> backup feature bit.
>>>> I think we want exactly the reverse? E.g fail the negotiation when
>>>> guest
>>>> does not ack backup feature.
>>>>
>>>> Otherwise legacy guest won't even have the chance to see primary
>>>> device in
>>>> the guest.
>>> That's by design.
>>
>> So management needs to know the capability of guest to set the backup
>> feature. This looks a chicken or egg problem to me.
>
> I don't think so. If the tenant requests 'accelerated datapath
> feature', the management
> will set 'standby' feature bit on virtio-net interface and if the
> guest virtio-net driver
> supports this feature, then the tenant VM will get that capability via
> a hot-plugged
> primary device.
Ok, I thought exactly the reverse because of the commit title is "enable
virtio_net to act as a standby for a passthru device". But re-read the
commit log content, I understand the case a little bit. Btw, VF is not
necessarily faster than virtio-net, especially consider virtio-net may
have a lot of queues.
>
>
>>
>>>
>>>>> And on reset or when backup is cleared in some other way, unplug the
>>>>> primary.
>>>> What if guest does not support hotplug?
>>> It shouldn't ack the backup feature then and it's a good point.
>>> We should both document this and check kernel config has
>>> hotplug support. Sridhar could you take a look pls?
>>>
>>>>> Something like the below will do the job:
>>>>>
>>>>> Primary device is added with a special "primary-failover" flag.
>>>>> A virtual machine is then initialized with just a standby virtio
>>>>> device. Primary is not yet added.
>>>> A question is how to do the matching? Qemu knows nothing about e.g mac
>>>> address of a pass-through device I believe?
>>> Supply a flag to the VFIO when it's added, this way QEMU will know.
>>>
>>>>> Later QEMU detects that guest driver device set DRIVER_OK.
>>>>> It then exposes the primary device to the guest, and triggers
>>>>> a device addition event (hot-plug event) for it.
>>>> Do you mean we won't have primary device in the initial qemu cli?
>>> No, that's not what I mean.
>>>
>>> I mean management will supply a flag to VFIO and then
>>>
>>>
>>> - VFIO defers exposing
>>> primary to guest until guest acks the feature bit.
>>> - When we see guest ack, initiate hotplug.
>>> - On reboot, hide it again.
>>> - On reset without reboot, request hot-unplug and on eject hide it
>>> again.
>>
>> This sounds much like a kind of bonding in qemu.
>>
>>>
>>>>> If QEMU detects guest driver removal, it initiates a hot-unplug
>>>>> sequence
>>>>> to remove the primary driver. In particular, if QEMU detects guest
>>>>> re-initialization (e.g. by detecting guest reset) it immediately
>>>>> removes
>>>>> the primary device.
>>>> I believe guest failover module should handle this gracefully?
>>> It can't control enumeration order, if primary is enumerated before
>>> standby then guest will load its driver and it's too late
>>> when failover driver is loaded.
>>
>> Well, even if we can delay the visibility of primary until DRIVER_OK,
>> there still be a race I think? And it looks to me it's still a bug of
>> guest:
>>
>> E.g primary could be probed before failover_register() in guest. Then
>> we will miss the enslaving of primary forever.
>
> That is not an issue. Even if the primary is probed before failover
> driver, it will
> enslave the primary via the call to failover_existing_slave_register()
> as part of
> failover_register() routine.
Aha I get it. So the enumeration order is not an issue.
Consider primary may still be seen by guest kernel even if we delay its
visibility, I wonder whether we can control the lifecycle of primary
through driver but not qemu. This can simplify a lot of things.
Thanks
>
>>
>> Thanks
>>
>>>
>>>> Thanks
>>>>
>>>>> We can move some of this code to management as well,
>>>>> architecturally it
>>>>> does not make too much sense but it might be easier
>>>>> implementation-wise.
>>>>>
>>>>> HTH
>>>>>
>>>>> On Mon, Jun 04, 2018 at 06:41:48PM -0700, Samudrala, Sridhar wrote:
>>>>>> Ping on this patch now that the kernel patches are accepted into
>>>>>> davem's net-next tree.
>>>>>> https://patchwork.ozlabs.org/cover/920005/
>>>>>>
>>>>>>
>>>>>> On 5/7/2018 4:09 PM, Sridhar Samudrala wrote:
>>>>>>> This feature bit can be used by hypervisor to indicate
>>>>>>> virtio_net device to
>>>>>>> act as a standby for another device with the same MAC address.
>>>>>>>
>>>>>>> I tested this with a small change to the patch to mark the
>>>>>>> STANDBY feature 'true'
>>>>>>> by default as i am using libvirt to start the VMs.
>>>>>>> Is there a way to pass the newly added feature bit 'standby' to
>>>>>>> qemu via libvirt
>>>>>>> XML file?
>>>>>>>
>>>>>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>>>>> ---
>>>>>>> hw/net/virtio-net.c | 2 ++
>>>>>>> include/standard-headers/linux/virtio_net.h | 3 +++
>>>>>>> 2 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>> index 90502fca7c..38b3140670 100644
>>>>>>> --- a/hw/net/virtio-net.c
>>>>>>> +++ b/hw/net/virtio-net.c
>>>>>>> @@ -2198,6 +2198,8 @@ static Property virtio_net_properties[] = {
>>>>>>> true),
>>>>>>> DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed,
>>>>>>> SPEED_UNKNOWN),
>>>>>>> DEFINE_PROP_STRING("duplex", VirtIONet,
>>>>>>> net_conf.duplex_str),
>>>>>>> + DEFINE_PROP_BIT64("standby", VirtIONet, host_features,
>>>>>>> VIRTIO_NET_F_STANDBY,
>>>>>>> + false),
>>>>>>> DEFINE_PROP_END_OF_LIST(),
>>>>>>> };
>>>>>>> diff --git a/include/standard-headers/linux/virtio_net.h
>>>>>>> b/include/standard-headers/linux/virtio_net.h
>>>>>>> index e9f255ea3f..01ec09684c 100644
>>>>>>> --- a/include/standard-headers/linux/virtio_net.h
>>>>>>> +++ b/include/standard-headers/linux/virtio_net.h
>>>>>>> @@ -57,6 +57,9 @@
>>>>>>> * Steering */
>>>>>>> #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
>>>>>>> +#define VIRTIO_NET_F_STANDBY 62 /* Act as standby for
>>>>>>> another device
>>>>>>> + * with the same MAC.
>>>>>>> + */
>>>>>>> #define VIRTIO_NET_F_SPEED_DUPLEX 63 /* Device set
>>>>>>> linkspeed and duplex */
>>>>>>> #ifndef VIRTIO_NET_NO_LEGACY
>>
>
next prev parent reply other threads:[~2018-06-13 5:41 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-07 23:09 [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net Sridhar Samudrala
2018-06-05 1:41 ` Samudrala, Sridhar
2018-06-05 2:06 ` Jason Wang
2018-06-06 18:17 ` Samudrala, Sridhar
2018-06-06 18:52 ` [Qemu-devel] [libvirt] " Ján Tomko
2018-06-06 19:39 ` Samudrala, Sridhar
2018-06-06 18:53 ` [Qemu-devel] " Michael S. Tsirkin
2018-06-05 12:33 ` Michael S. Tsirkin
2018-06-05 20:20 ` Samudrala, Sridhar
2018-06-05 20:37 ` Michael S. Tsirkin
2018-06-05 21:16 ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-06-05 21:32 ` Michael S. Tsirkin
2018-06-05 22:09 ` Siwei Liu
2018-06-12 11:47 ` Michael S. Tsirkin
2018-06-14 0:56 ` Siwei Liu
2018-06-06 2:29 ` [Qemu-devel] " Jason Wang
2018-06-12 11:54 ` Michael S. Tsirkin
2018-06-13 0:20 ` Samudrala, Sridhar
2018-06-13 2:41 ` Jason Wang
2018-06-13 2:38 ` Jason Wang
2018-06-13 4:24 ` Samudrala, Sridhar
2018-06-13 5:40 ` Jason Wang [this message]
2018-06-21 18:14 ` Michael S. Tsirkin
2018-06-22 1:07 ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-06-22 2:30 ` Michael S. Tsirkin
2018-06-22 19:43 ` Siwei Liu
2018-06-22 21:47 ` Michael S. Tsirkin
2018-06-22 22:25 ` Siwei Liu
2018-06-22 22:28 ` Michael S. Tsirkin
2018-06-11 17:26 ` [Qemu-devel] " Michael S. Tsirkin
2018-06-12 1:54 ` Jason Wang
2018-06-12 2:17 ` Michael S. Tsirkin
2018-06-12 5:02 ` Samudrala, Sridhar
2018-06-12 11:34 ` Michael S. Tsirkin
2018-06-13 0:08 ` [Qemu-devel] [virtio-dev] " Samudrala, Sridhar
2018-06-14 1:02 ` Siwei Liu
2018-06-14 10:02 ` Cornelia Huck
2018-06-15 1:57 ` Siwei Liu
2018-06-15 11:48 ` Cornelia Huck
2018-06-15 17:06 ` Siwei Liu
2018-06-19 10:54 ` Cornelia Huck
2018-06-19 20:09 ` Siwei Liu
2018-06-20 14:34 ` Cornelia Huck
2018-06-20 19:59 ` Siwei Liu
2018-06-19 20:32 ` Michael S. Tsirkin
2018-06-20 9:53 ` Cornelia Huck
2018-06-20 14:11 ` Michael S. Tsirkin
2018-06-20 16:06 ` Cornelia Huck
2018-06-20 19:48 ` Michael S. Tsirkin
2018-06-21 14:59 ` Cornelia Huck
2018-06-21 18:20 ` Michael S. Tsirkin
2018-06-22 15:09 ` Cornelia Huck
2018-06-22 19:05 ` Michael S. Tsirkin
2018-06-22 20:21 ` Siwei Liu
2018-06-22 21:32 ` Michael S. Tsirkin
2018-06-22 21:57 ` Siwei Liu
2018-06-22 22:33 ` Michael S. Tsirkin
2018-06-23 0:05 ` Siwei Liu
2018-06-26 15:08 ` Cornelia Huck
2018-06-26 17:50 ` Michael S. Tsirkin
2018-06-27 9:11 ` Cornelia Huck
2018-06-25 9:55 ` Cornelia Huck
2018-06-26 1:46 ` Michael S. Tsirkin
2018-06-26 11:55 ` Cornelia Huck
2018-06-26 13:54 ` Michael S. Tsirkin
2018-06-22 21:43 ` Michael S. Tsirkin
2018-06-27 10:10 ` Cornelia Huck
2018-06-22 1:21 ` Siwei Liu
2018-06-22 2:25 ` Venu Busireddy
2018-06-22 2:32 ` Michael S. Tsirkin
2018-06-22 20:00 ` Siwei Liu
2018-06-22 20:03 ` Siwei Liu
2018-06-22 21:29 ` Michael S. Tsirkin
2018-06-22 21:51 ` Siwei Liu
2018-06-22 22:25 ` Michael S. Tsirkin
2018-06-22 23:40 ` Siwei Liu
2018-06-23 0:17 ` Siwei Liu
2018-06-24 1:45 ` Michael S. Tsirkin
2018-06-25 17:54 ` Samudrala, Sridhar
2018-06-26 1:50 ` Michael S. Tsirkin
2018-06-26 15:17 ` Cornelia Huck
2018-06-26 15:38 ` Michael S. Tsirkin
2018-06-26 16:03 ` Cornelia Huck
2018-06-26 17:42 ` Michael S. Tsirkin
2018-06-26 23:38 ` Siwei Liu
2018-06-27 0:29 ` Michael S. Tsirkin
2018-06-27 6:21 ` Siwei Liu
2018-06-27 6:49 ` Samudrala, Sridhar
2018-06-27 7:03 ` Siwei Liu
2018-06-15 2:34 ` Michael S. Tsirkin
2018-06-15 9:32 ` Cornelia Huck
2018-06-15 12:31 ` Michael S. Tsirkin
2018-06-18 13:27 ` Cornelia Huck
2018-06-14 12:50 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ce4da2b0-88fd-c79b-0567-fe564bb34922@redhat.com \
--to=jasowang@redhat.com \
--cc=alexander.h.duyck@intel.com \
--cc=jesse.brandeburg@intel.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=sridhar.samudrala@intel.com \
--cc=virtio-dev@lists.oasis-open.org \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).