From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
To: "Wang, Wei W" <wei.w.wang@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"virtio-comment@lists.oasis-open.org"
<virtio-comment@lists.oasis-open.org>,
"mst@redhat.com" <mst@redhat.com>,
"stefanha@redhat.com" <stefanha@redhat.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [PATCH 4/6 Resend] Vhost-pci RFC: Detailed Description in the Virtio Specification Format
Date: Thu, 2 Jun 2016 19:13:11 +0800
Message-ID: <575014C7.1000003@linux.intel.com>
In-Reply-To: <286AC319A985734F985F78AFA26841F7C7962D@shsmsx102.ccr.corp.intel.com>
On 06/02/2016 04:43 PM, Wang, Wei W wrote:
> On Thu 6/2/2016 11:52 AM, Xiao Guangrong wrote:
>> On 06/02/2016 11:15 AM, Wang, Wei W wrote:
>>> On Wed 6/1/2016 4:15 PM, Xiao Guangrong wrote:
>>>> On 05/29/2016 04:11 PM, Wei Wang wrote:
>>>>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>>>>> ---
>>>>> Details | 324 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> 1 file changed, 324 insertions(+)
>>>>> create mode 100644 Details
>>>>>
>>>>> diff --git a/Details b/Details
>>>>> new file mode 100644
>>>>> index 0000000..4ea2252
>>>>> --- /dev/null
>>>>> +++ b/Details
>>>>> @@ -0,0 +1,324 @@
>>>>> +1 Device ID
>>>>> +TBD
>>>>> +
>>>>> +2 Virtqueues
>>>>> +0 controlq
>>>>> +
>>>>> +3 Feature Bits
>>>>> +3.1 Local Feature Bits
>>>>> +Currently no local feature bits are defined, so the standard virtio
>>>>> +feature bits negotiation will always be successful and complete.
>>>>> +
>>>>> +3.2 Remote Feature Bits
>>>>> +The remote feature bits are obtained from the frontend virtio
>>>>> +device and negotiated with the vhost-pci driver via the controlq.
>>>>> +The negotiation steps are described in 4.5 Device Initialization.
>>>>> +
>>>>> +4 Device Configuration Layout
>>>>> +struct vhost_pci_config {
>>>>> + #define VHOST_PCI_CONTROLQ_MEMORY_INFO_ACK 0
>>>>> + #define VHOST_PCI_CONTROLQ_DEVICE_INFO_ACK 1
>>>>> + #define VHOST_PCI_CONTROLQ_FEATURE_BITS_ACK 2
>>>>> + u32 ack_type;
>>>>> + u32 ack_device_type;
>>>>> + u64 ack_device_id;
>>>>> + union {
>>>>> + #define VHOST_PCI_CONTROLQ_ACK_ADD_DONE 0
>>>>> + #define VHOST_PCI_CONTROLQ_ACK_ADD_FAIL 1
>>>>> + #define VHOST_PCI_CONTROLQ_ACK_DEL_DONE 2
>>>>> + #define VHOST_PCI_CONTROLQ_ACK_DEL_FAIL 3
>>>>> + u64 ack_memory_info;
>>>>> + u64 ack_device_info;
>>>>> + u64 ack_feature_bits;
>>>>> + };
>>>>> +};
>>>>
>>>> Do you need to write all 4 of these fields to ack the operation? That
>>>> seems inefficient, and it is not flexible if the driver needs to
>>>> offer more data to the device in the future. Can we dedicate a vq
>>>> for this purpose?
>>>
>>> Yes, all the 4 fields are required to be written. The vhost-pci server usually
>>> connects to multiple clients, and the "ack_device_type" and "ack_device_id"
>>> fields are used to identify them.
>>>
>>> Agree, another controlq for the guest->host direction looks better, and the
>>> above fields can be converted to be the controlq message header.
>>>
>>
>> Thanks.
>>
>>>>
>>>> BTW, the current approach cannot handle multiple requests of the
>>>> same kind in the control queue, e.g., two memory-add requests
>>>> queued at the same time.
>>>
>>> A vhost-pci device corresponds to a driver VM. The two memory-add requests
>>> on the controlq are all for the same driver VM. Memory-add requests for
>>> different driver VMs couldn’t be present on the same controlq. I haven’t seen
>>> the issue yet. Can you please explain more? Thanks.
>>
>> The issue is caused by "The two memory-add requests on the controlq are all for
>> the same driver VM". The driver needs to ACK each of these requests separately;
>> however, the two requests have the same ack_type, device_type, device_id, and
>> ack_memory_info, so QEMU is not able to figure out which request has been
>> acked.
>
> Normally pieces of memory info should be combined into one message (the structure includes multiple memory regions) and sent by the client. Consider a rare case like this: the driver VM hot-adds 1GB of memory, followed by hot-adding another 1GB of memory. The first piece of memory info is passed via the socket and controlq to the vhost-pci driver, then the second. Normally they won't get an opportunity to be put on the controlq at the same time.
> Even if the implementation batches the controlq messages, there will be a sequence difference between the two messages on the controlq, right?
That assumes the driver handles the control messages serially...
>
> From QEMU's (the vhost-pci server's) perspective, it just sends an ACK back to the client whenever it receives an ACK from the vhost-pci driver.
> From the client's perspective, it will receive two ACK messages in this example.
> Since the two have a sequence difference, the client should be able to distinguish the two (first sent, first acked), right?
That assumes that the vhost-pci server and the remote virtio device operate
in serial mode too.
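
One way to drop the ordering assumption is to give every controlq message a
sequence number that the ACK echoes back, so ACKs can be matched in any order.
The C sketch below only illustrates that idea and is not taken from the patch:
struct ctrlq_msg_hdr, its seq field, MAX_PENDING, pending_add() and
pending_ack() are all hypothetical names. The type, device_type and device_id
fields mirror the ack_* fields of the proposed vhost_pci_config.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical controlq message header. The seq field is an assumption
 * added for illustration; the RFC does not define it. */
struct ctrlq_msg_hdr {
    uint32_t type;        /* e.g. memory info, device info, feature bits */
    uint32_t device_type;
    uint64_t device_id;
    uint64_t seq;         /* unique per outstanding request */
};

#define MAX_PENDING 8     /* arbitrary bound for the sketch */

/* Requests that were sent but not yet acked. */
struct pending_table {
    struct ctrlq_msg_hdr slot[MAX_PENDING];
    bool used[MAX_PENDING];
    uint64_t next_seq;
};

static void pending_init(struct pending_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
    t->next_seq = 1;      /* 0 is reserved to mean "table full" */
}

/* Record an outgoing request. Returns its sequence number,
 * or 0 if no slot is free. */
static uint64_t pending_add(struct pending_table *t, uint32_t type,
                            uint32_t device_type, uint64_t device_id)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!t->used[i]) {
            t->slot[i].type = type;
            t->slot[i].device_type = device_type;
            t->slot[i].device_id = device_id;
            t->slot[i].seq = t->next_seq++;
            t->used[i] = true;
            return t->slot[i].seq;
        }
    }
    return 0;
}

/* Match an ACK against the outstanding requests by sequence number.
 * Returns true and frees the slot if the ACK matches one of them. */
static bool pending_ack(struct pending_table *t,
                        const struct ctrlq_msg_hdr *ack)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->used[i] &&
            t->slot[i].seq == ack->seq &&
            t->slot[i].type == ack->type &&
            t->slot[i].device_type == ack->device_type &&
            t->slot[i].device_id == ack->device_id) {
            t->used[i] = false;
            return true;
        }
    }
    return false;
}
```

With such a header, two memory-add requests for the same driver VM differ in
seq even though every other field is identical, so the sender can tell which
one was acked even when the ACKs arrive out of order.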