From: Xiao Guangrong
To: "Wang, Wei W", kvm@vger.kernel.org, qemu-devel@nongnu.org, virtio-comment@lists.oasis-open.org, mst@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH 4/6 Resend] Vhost-pci RFC: Detailed Description in the Virtio Specification Format
Date: Thu, 2 Jun 2016 19:13:11 +0800
Message-ID: <575014C7.1000003@linux.intel.com>
In-Reply-To: <286AC319A985734F985F78AFA26841F7C7962D@shsmsx102.ccr.corp.intel.com>

On 06/02/2016 04:43 PM, Wang, Wei W wrote:
> On Thu 6/2/2016 11:52 AM, Xiao Guangrong wrote:
>> On 06/02/2016 11:15 AM, Wang, Wei W wrote:
>>> On Wed 6/1/2016 4:15 PM, Xiao Guangrong wrote:
>>>> On 05/29/2016 04:11 PM, Wei Wang wrote:
>>>>> Signed-off-by: Wei Wang
>>>>> ---
>>>>>  Details | 324 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 324 insertions(+)
>>>>>  create mode 100644 Details
>>>>>
>>>>> diff --git a/Details b/Details
>>>>> new file mode 100644
>>>>> index 0000000..4ea2252
>>>>> --- /dev/null
>>>>> +++ b/Details
>>>>> @@ -0,0 +1,324 @@
>>>>> +1 Device ID
>>>>> +TBD
>>>>> +
>>>>> +2 Virtqueues
>>>>> +0 controlq
>>>>> +
>>>>> +3 Feature Bits
>>>>> +3.1 Local Feature Bits
>>>>> +Currently no local feature bits are defined, so the standard virtio
>>>>> +feature bits negotiation will always be successful and complete.
>>>>> +
>>>>> +3.2 Remote Feature Bits
>>>>> +The remote feature bits are obtained from the frontend virtio
>>>>> +device and negotiated with the vhost-pci driver via the controlq.
>>>>> +The negotiation steps are described in 4.5 Device Initialization.
>>>>> +
>>>>> +4 Device Configuration Layout
>>>>> +struct vhost_pci_config {
>>>>> +    #define VHOST_PCI_CONTROLQ_MEMORY_INFO_ACK 0
>>>>> +    #define VHOST_PCI_CONTROLQ_DEVICE_INFO_ACK 1
>>>>> +    #define VHOST_PCI_CONTROLQ_FEATURE_BITS_ACK 2
>>>>> +    u32 ack_type;
>>>>> +    u32 ack_device_type;
>>>>> +    u64 ack_device_id;
>>>>> +    union {
>>>>> +        #define VHOST_PCI_CONTROLQ_ACK_ADD_DONE 0
>>>>> +        #define VHOST_PCI_CONTROLQ_ACK_ADD_FAIL 1
>>>>> +        #define VHOST_PCI_CONTROLQ_ACK_DEL_DONE 2
>>>>> +        #define VHOST_PCI_CONTROLQ_ACK_DEL_FAIL 3
>>>>> +        u64 ack_memory_info;
>>>>> +        u64 ack_device_info;
>>>>> +        u64 ack_feature_bits;
>>>>> +    };
>>>>> +};
>>>>
>>>> Do you need to write all these 4 fields to ack the operation? It seems
>>>> it is not efficient and it is not flexible if the driver needs to
>>>> offer more data to the device in the future. Can we dedicate a vq
>>>> for this purpose?
>>>
>>> Yes, all the 4 fields are required to be written. The vhost-pci server
>>> usually connects to multiple clients, and the "ack_device_type" and
>>> "ack_device_id" fields are used to identify them.
>>>
>>> Agree, another controlq for the guest->host direction looks better, and
>>> the above fields can be converted to be the controlq message header.
>>>
>>
>> Thanks.
>>
>>>>
>>>> BTW, the current approach cannot handle the case where there are
>>>> multiple requests of the same kind in the control queue, e.g., if
>>>> there are two memory-add requests in the control queue.
>>>
>>> A vhost-pci device corresponds to a driver VM. The two memory-add
>>> requests on the controlq are all for the same driver VM. Memory-add
>>> requests for different driver VMs couldn’t be present on the same
>>> controlq. I haven’t seen the issue yet. Can you please explain more?
>>> Thanks.
>>
>> The issue is caused by "The two memory-add requests on the controlq are
>> all for the same driver VM". The driver needs to ACK these requests
>> separately; however, these two requests have the same ack_type,
>> device_type, device_id, and ack_memory_info, so QEMU is not able to
>> figure out which request has been acked.
>
> Normally pieces of memory info should be combined into one message (the
> structure includes multiple memory regions) and sent by the client. In a
> rare case like this: the driver VM hot-adds 1GB memory, followed by
> hot-adding another 1GB memory. The first piece of memory info is passed
> via the socket and controlq to the vhost-pci driver, then the second.
> Normally they won't get an opportunity to be put on the controlq at the
> same time.
> Even if the implementation batches the controlq messages, there will be
> a sequence difference between the two messages on the controlq, right?

That assumes the driver should serially handle the control messages...

>
> From the QEMU's (vhost-pci server) perspective, it just sends back an
> ACK to the client whenever it receives an ACK from the vhost-pci driver.
> From the client's perspective, it will receive two ACK messages in this
> example.
> Since the two have a sequence difference, the client should be able to
> distinguish the two (first sent, first acked), right?

That assumes that the vhost-pci server and remote virtio device should use serial mode too.
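
To make the ordering assumption concrete, here is a minimal sketch of the "first sent, first acked" matching discussed above. It is not from the patch: `struct pending_fifo`, `pending_push()`, `pending_match_ack()`, `PENDING_MAX` and the `cookie` field are all hypothetical names for sender-side bookkeeping, and the only identifier taken from the proposal is the idea of an `ack_type` value (0 standing in for VHOST_PCI_CONTROLQ_MEMORY_INFO_ACK).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender-side bookkeeping (assumption, not in the patch):
 * requests that were sent but not yet acknowledged, oldest first. */
#define PENDING_MAX 8

struct pending_req {
    uint32_t ack_type; /* kind of ACK expected, e.g. memory info */
    uint64_t cookie;   /* sender-private tag; never sent on the wire */
};

struct pending_fifo {
    struct pending_req slot[PENDING_MAX];
    size_t head;  /* index of the oldest unacked request */
    size_t count; /* number of unacked requests */
};

/* Record a request that has just been sent. Returns false when full. */
static bool pending_push(struct pending_fifo *f, uint32_t ack_type,
                         uint64_t cookie)
{
    if (f->count == PENDING_MAX)
        return false;
    size_t tail = (f->head + f->count) % PENDING_MAX;
    f->slot[tail].ack_type = ack_type;
    f->slot[tail].cookie = cookie;
    f->count++;
    return true;
}

/* Attribute an incoming ACK to the oldest pending request.
 * Returns false if nothing is pending or the ACK type does not match
 * the oldest request, i.e. the peer visibly broke the ordering. */
static bool pending_match_ack(struct pending_fifo *f, uint32_t ack_type,
                              uint64_t *cookie)
{
    assert(cookie != NULL);
    if (f->count == 0 || f->slot[f->head].ack_type != ack_type)
        return false;
    *cookie = f->slot[f->head].cookie;
    f->head = (f->head + 1) % PENDING_MAX;
    f->count--;
    return true;
}
```

With two memory-add requests pushed back to back, two identical ACKs are attributed to the first and then the second request purely by arrival order. Nothing in the ACK itself tells them apart, so if the driver, the vhost-pci server, or the remote device ever completes them out of order, the attribution is silently wrong. That is the serial-mode assumption I am pointing at.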