From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: virtio-dev-return-2429-cohuck=redhat.com@lists.oasis-open.org Sender: List-Post: List-Help: List-Unsubscribe: List-Subscribe: Received: from lists.oasis-open.org (oasis-open.org [66.179.20.138]) by lists.oasis-open.org (Postfix) with ESMTP id B004D5818FA3 for ; Sat, 29 Jul 2017 21:22:57 -0700 (PDT) Date: Sun, 30 Jul 2017 07:22:47 +0300 From: "Michael S. Tsirkin" Message-ID: <20170730043922-mutt-send-email-mst@kernel.org> References: <20170712163746-mutt-send-email-mst@kernel.org> <5967246B.9030804@intel.com> <20170713210819-mutt-send-email-mst@kernel.org> <59686EEB.8080805@intel.com> <20170723044036-mutt-send-email-mst@kernel.org> <59781119.8010200@intel.com> <20170726155856-mutt-send-email-mst@kernel.org> <597954E3.2070801@intel.com> <20170729020231-mutt-send-email-mst@kernel.org> <597C83CC.7060702@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <597C83CC.7060702@intel.com> Subject: [virtio-dev] Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG To: Wei Wang Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-mm@kvack.org, david@redhat.com, cornelia.huck@de.ibm.com, akpm@linux-foundation.org, mgorman@techsingularity.net, aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com, virtio-dev@lists.oasis-open.org, yang.zhang.wz@gmail.com, quan.xu@aliyun.com List-ID: On Sat, Jul 29, 2017 at 08:47:08PM +0800, Wei Wang wrote: > On 07/29/2017 07:08 AM, Michael S. Tsirkin wrote: > > On Thu, Jul 27, 2017 at 10:50:11AM +0800, Wei Wang wrote: > > > > > > OK I thought this over. While we might need these new APIs in > > > > > > the future, I think that at the moment, there's a way to implement > > > > > > this feature that is significantly simpler. Just add each s/g > > > > > > as a separate input buffer. > > > > > Should it be an output buffer? > > > > Hypervisor overwrites these pages with zeroes. Therefore it is > > > > writeable by device: DMA_FROM_DEVICE. > > > Why would the hypervisor need to zero the buffer? > > The page is supplied to hypervisor and can lose the value that > > is there. That is the definition of writeable by device. > > I think for the free pages, it should be clear that they will be added as > output buffer to the device, because (as we discussed) they are just hints, > and some of them may be used by the guest after the report_ API is invoked. > The device/hypervisor should not use or discard them. Discarding contents is exactly what you propose doing if migration is going on, isn't it? > For the balloon pages, I kind of agree with the existing implementation > (e.g. inside tell_host()), which uses virtqueue_add_outbuf (instead of > _add_inbuf()). This is because it does not pass SGs, it passes weirdly formatted PA within the buffer. > I think inbuf should be a buffer which will be written by the device and > read by the > driver. If hypervisor can change it, it's an inbuf. Should not matter whether driver reads it. > The cmd buffer put on the vq for the device to send commands can be > an > inbuf, I think. > > > > > > > I think it may only > > > need to read out the info(base,size). > > And then do what? > > > For the free pages, the info will be used to clear the corresponding "1" in > the dirty bitmap. > For balloon pages, they will be made DONTNEED and given to other host > processes to > use (the device won't write them, so no need to set "write" when > virtqueue_map_desc() in > the device). > > > > > > > I think it should be like this: > > > the cmd hdr buffer: input, because the hypervisor would write it to > > > send a cmd to the guest > > > the payload buffer: output, for the hypervisor to read the info > > These should be split. > > > > We have: > > > > 1. request that hypervisor sends to guest, includes ID: input > > 2. synchronisation header with ID sent by guest: output > > 3. list of pages: input > > > > 2 and 3 must be on the same VQ. 1 can come on any VQ - reusing stats VQ > > might make sense. > > I would prefer to make the above item 1 come on the the free page vq, > because the existing stat_vq doesn't support the cmd hdr. > Now, I think it is also not necessary to move the existing stat_vq > implementation to > a new implementation under the cmd hdr, because > that new support doesn't make a difference for stats, it will still use its > stat_vq (the free > page vq will be used for reporting free pages only) > > What do you think? > > > Best, > Wei --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG Date: Sun, 30 Jul 2017 07:22:47 +0300 Message-ID: <20170730043922-mutt-send-email-mst@kernel.org> References: <20170712163746-mutt-send-email-mst@kernel.org> <5967246B.9030804@intel.com> <20170713210819-mutt-send-email-mst@kernel.org> <59686EEB.8080805@intel.com> <20170723044036-mutt-send-email-mst@kernel.org> <59781119.8010200@intel.com> <20170726155856-mutt-send-email-mst@kernel.org> <597954E3.2070801@intel.com> <20170729020231-mutt-send-email-mst@kernel.org> <597C83CC.7060702@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-mm@kvack.org, david@redhat.com, cornelia.huck@de.ibm.com, akpm@linux-foundation.org, mgorman@techsingularity.net, aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com, virtio-dev@lists.oasis-open.org, yang.zhang.wz@gmail.com, quan.xu@aliyun.com To: Wei Wang Return-path: Content-Disposition: inline In-Reply-To: <597C83CC.7060702@intel.com> Sender: owner-linux-mm@kvack.org List-Id: kvm.vger.kernel.org On Sat, Jul 29, 2017 at 08:47:08PM +0800, Wei Wang wrote: > On 07/29/2017 07:08 AM, Michael S. Tsirkin wrote: > > On Thu, Jul 27, 2017 at 10:50:11AM +0800, Wei Wang wrote: > > > > > > OK I thought this over. While we might need these new APIs in > > > > > > the future, I think that at the moment, there's a way to implement > > > > > > this feature that is significantly simpler. Just add each s/g > > > > > > as a separate input buffer. > > > > > Should it be an output buffer? > > > > Hypervisor overwrites these pages with zeroes. Therefore it is > > > > writeable by device: DMA_FROM_DEVICE. > > > Why would the hypervisor need to zero the buffer? > > The page is supplied to hypervisor and can lose the value that > > is there. That is the definition of writeable by device. > > I think for the free pages, it should be clear that they will be added as > output buffer to the device, because (as we discussed) they are just hints, > and some of them may be used by the guest after the report_ API is invoked. > The device/hypervisor should not use or discard them. Discarding contents is exactly what you propose doing if migration is going on, isn't it? > For the balloon pages, I kind of agree with the existing implementation > (e.g. inside tell_host()), which uses virtqueue_add_outbuf (instead of > _add_inbuf()). This is because it does not pass SGs, it passes weirdly formatted PA within the buffer. > I think inbuf should be a buffer which will be written by the device and > read by the > driver. If hypervisor can change it, it's an inbuf. Should not matter whether driver reads it. > The cmd buffer put on the vq for the device to send commands can be > an > inbuf, I think. > > > > > > > I think it may only > > > need to read out the info(base,size). > > And then do what? > > > For the free pages, the info will be used to clear the corresponding "1" in > the dirty bitmap. > For balloon pages, they will be made DONTNEED and given to other host > processes to > use (the device won't write them, so no need to set "write" when > virtqueue_map_desc() in > the device). > > > > > > > I think it should be like this: > > > the cmd hdr buffer: input, because the hypervisor would write it to > > > send a cmd to the guest > > > the payload buffer: output, for the hypervisor to read the info > > These should be split. > > > > We have: > > > > 1. request that hypervisor sends to guest, includes ID: input > > 2. synchronisation header with ID sent by guest: output > > 3. list of pages: input > > > > 2 and 3 must be on the same VQ. 1 can come on any VQ - reusing stats VQ > > might make sense. > > I would prefer to make the above item 1 come on the the free page vq, > because the existing stat_vq doesn't support the cmd hdr. > Now, I think it is also not necessary to move the existing stat_vq > implementation to > a new implementation under the cmd hdr, because > that new support doesn't make a difference for stats, it will still use its > stat_vq (the free > page vq will be used for reporting free pages only) > > What do you think? > > > Best, > Wei -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750821AbdG3EW5 (ORCPT ); Sun, 30 Jul 2017 00:22:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:54446 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750705AbdG3EWz (ORCPT ); Sun, 30 Jul 2017 00:22:55 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 2FB0732EDA0 Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=mst@redhat.com Date: Sun, 30 Jul 2017 07:22:47 +0300 From: "Michael S. Tsirkin" To: Wei Wang Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-mm@kvack.org, david@redhat.com, cornelia.huck@de.ibm.com, akpm@linux-foundation.org, mgorman@techsingularity.net, aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com, virtio-dev@lists.oasis-open.org, yang.zhang.wz@gmail.com, quan.xu@aliyun.com Subject: Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG Message-ID: <20170730043922-mutt-send-email-mst@kernel.org> References: <20170712163746-mutt-send-email-mst@kernel.org> <5967246B.9030804@intel.com> <20170713210819-mutt-send-email-mst@kernel.org> <59686EEB.8080805@intel.com> <20170723044036-mutt-send-email-mst@kernel.org> <59781119.8010200@intel.com> <20170726155856-mutt-send-email-mst@kernel.org> <597954E3.2070801@intel.com> <20170729020231-mutt-send-email-mst@kernel.org> <597C83CC.7060702@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <597C83CC.7060702@intel.com> X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Sun, 30 Jul 2017 04:22:55 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 29, 2017 at 08:47:08PM +0800, Wei Wang wrote: > On 07/29/2017 07:08 AM, Michael S. Tsirkin wrote: > > On Thu, Jul 27, 2017 at 10:50:11AM +0800, Wei Wang wrote: > > > > > > OK I thought this over. While we might need these new APIs in > > > > > > the future, I think that at the moment, there's a way to implement > > > > > > this feature that is significantly simpler. Just add each s/g > > > > > > as a separate input buffer. > > > > > Should it be an output buffer? > > > > Hypervisor overwrites these pages with zeroes. Therefore it is > > > > writeable by device: DMA_FROM_DEVICE. > > > Why would the hypervisor need to zero the buffer? > > The page is supplied to hypervisor and can lose the value that > > is there. That is the definition of writeable by device. > > I think for the free pages, it should be clear that they will be added as > output buffer to the device, because (as we discussed) they are just hints, > and some of them may be used by the guest after the report_ API is invoked. > The device/hypervisor should not use or discard them. Discarding contents is exactly what you propose doing if migration is going on, isn't it? > For the balloon pages, I kind of agree with the existing implementation > (e.g. inside tell_host()), which uses virtqueue_add_outbuf (instead of > _add_inbuf()). This is because it does not pass SGs, it passes weirdly formatted PA within the buffer. > I think inbuf should be a buffer which will be written by the device and > read by the > driver. If hypervisor can change it, it's an inbuf. Should not matter whether driver reads it. > The cmd buffer put on the vq for the device to send commands can be > an > inbuf, I think. > > > > > > > I think it may only > > > need to read out the info(base,size). > > And then do what? > > > For the free pages, the info will be used to clear the corresponding "1" in > the dirty bitmap. > For balloon pages, they will be made DONTNEED and given to other host > processes to > use (the device won't write them, so no need to set "write" when > virtqueue_map_desc() in > the device). > > > > > > > I think it should be like this: > > > the cmd hdr buffer: input, because the hypervisor would write it to > > > send a cmd to the guest > > > the payload buffer: output, for the hypervisor to read the info > > These should be split. > > > > We have: > > > > 1. request that hypervisor sends to guest, includes ID: input > > 2. synchronisation header with ID sent by guest: output > > 3. list of pages: input > > > > 2 and 3 must be on the same VQ. 1 can come on any VQ - reusing stats VQ > > might make sense. > > I would prefer to make the above item 1 come on the the free page vq, > because the existing stat_vq doesn't support the cmd hdr. > Now, I think it is also not necessary to move the existing stat_vq > implementation to > a new implementation under the cmd hdr, because > that new support doesn't make a difference for stats, it will still use its > stat_vq (the free > page vq will be used for reporting free pages only) > > What do you think? > > > Best, > Wei From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44701) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dbfku-0000n0-7h for qemu-devel@nongnu.org; Sun, 30 Jul 2017 00:23:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dbfkr-0005q1-1T for qemu-devel@nongnu.org; Sun, 30 Jul 2017 00:23:04 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44300) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dbfkq-0005na-OI for qemu-devel@nongnu.org; Sun, 30 Jul 2017 00:23:00 -0400 Date: Sun, 30 Jul 2017 07:22:47 +0300 From: "Michael S. Tsirkin" Message-ID: <20170730043922-mutt-send-email-mst@kernel.org> References: <20170712163746-mutt-send-email-mst@kernel.org> <5967246B.9030804@intel.com> <20170713210819-mutt-send-email-mst@kernel.org> <59686EEB.8080805@intel.com> <20170723044036-mutt-send-email-mst@kernel.org> <59781119.8010200@intel.com> <20170726155856-mutt-send-email-mst@kernel.org> <597954E3.2070801@intel.com> <20170729020231-mutt-send-email-mst@kernel.org> <597C83CC.7060702@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <597C83CC.7060702@intel.com> Subject: Re: [Qemu-devel] [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wei Wang Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-mm@kvack.org, david@redhat.com, cornelia.huck@de.ibm.com, akpm@linux-foundation.org, mgorman@techsingularity.net, aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com, virtio-dev@lists.oasis-open.org, yang.zhang.wz@gmail.com, quan.xu@aliyun.com On Sat, Jul 29, 2017 at 08:47:08PM +0800, Wei Wang wrote: > On 07/29/2017 07:08 AM, Michael S. Tsirkin wrote: > > On Thu, Jul 27, 2017 at 10:50:11AM +0800, Wei Wang wrote: > > > > > > OK I thought this over. While we might need these new APIs in > > > > > > the future, I think that at the moment, there's a way to implement > > > > > > this feature that is significantly simpler. Just add each s/g > > > > > > as a separate input buffer. > > > > > Should it be an output buffer? > > > > Hypervisor overwrites these pages with zeroes. Therefore it is > > > > writeable by device: DMA_FROM_DEVICE. > > > Why would the hypervisor need to zero the buffer? > > The page is supplied to hypervisor and can lose the value that > > is there. That is the definition of writeable by device. > > I think for the free pages, it should be clear that they will be added as > output buffer to the device, because (as we discussed) they are just hints, > and some of them may be used by the guest after the report_ API is invoked. > The device/hypervisor should not use or discard them. Discarding contents is exactly what you propose doing if migration is going on, isn't it? > For the balloon pages, I kind of agree with the existing implementation > (e.g. inside tell_host()), which uses virtqueue_add_outbuf (instead of > _add_inbuf()). This is because it does not pass SGs, it passes weirdly formatted PA within the buffer. > I think inbuf should be a buffer which will be written by the device and > read by the > driver. If hypervisor can change it, it's an inbuf. Should not matter whether driver reads it. > The cmd buffer put on the vq for the device to send commands can be > an > inbuf, I think. > > > > > > > I think it may only > > > need to read out the info(base,size). > > And then do what? > > > For the free pages, the info will be used to clear the corresponding "1" in > the dirty bitmap. > For balloon pages, they will be made DONTNEED and given to other host > processes to > use (the device won't write them, so no need to set "write" when > virtqueue_map_desc() in > the device). > > > > > > > I think it should be like this: > > > the cmd hdr buffer: input, because the hypervisor would write it to > > > send a cmd to the guest > > > the payload buffer: output, for the hypervisor to read the info > > These should be split. > > > > We have: > > > > 1. request that hypervisor sends to guest, includes ID: input > > 2. synchronisation header with ID sent by guest: output > > 3. list of pages: input > > > > 2 and 3 must be on the same VQ. 1 can come on any VQ - reusing stats VQ > > might make sense. > > I would prefer to make the above item 1 come on the the free page vq, > because the existing stat_vq doesn't support the cmd hdr. > Now, I think it is also not necessary to move the existing stat_vq > implementation to > a new implementation under the cmd hdr, because > that new support doesn't make a difference for stats, it will still use its > stat_vq (the free > page vq will be used for reporting free pages only) > > What do you think? > > > Best, > Wei