From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yang Zhang <yang.zhang.wz@gmail.com>
Cc: Lan Tianyu <tianyu.lan@intel.com>, Alexander Graf <agraf@suse.de>,
kvm@vger.kernel.org, konrad.wilk@oracle.com,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
x86@kernel.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
Alexander Duyck <aduyck@mirantis.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking
Date: Mon, 14 Dec 2015 16:02:59 +0200
Message-ID: <20151214160139-mutt-send-email-mst@redhat.com>
In-Reply-To: <566E6DBA.4080800@gmail.com>

On Mon, Dec 14, 2015 at 03:20:26PM +0800, Yang Zhang wrote:
> On 2015/12/14 13:46, Alexander Duyck wrote:
> >On Sun, Dec 13, 2015 at 9:22 PM, Yang Zhang <yang.zhang.wz@gmail.com> wrote:
> >>On 2015/12/14 12:54, Alexander Duyck wrote:
> >>>
> >>>On Sun, Dec 13, 2015 at 6:27 PM, Yang Zhang <yang.zhang.wz@gmail.com> wrote:
> >>>>
> >>>>On 2015/12/14 5:28, Alexander Duyck wrote:
> >>>>>
> >>>>>
> >>>>>This patch set is meant to be the guest side code for a proof of concept
> >>>>>involving leaving pass-through devices in the guest during the warm-up
> >>>>>phase of guest live migration. In order to accomplish this I have added
> >>>>>a new function called dma_mark_dirty that will mark the pages associated
> >>>>>with the DMA transaction as dirty in the case of either an unmap or a
> >>>>>sync_.*_for_cpu where the DMA direction is either DMA_FROM_DEVICE or
> >>>>>DMA_BIDIRECTIONAL. The pass-through device must still be removed before
> >>>>>the stop-and-copy phase, however allowing the device to be present
> >>>>>should significantly improve the performance of the guest during the
> >>>>>warm-up period.
> >>>>>
> >>>>>This current implementation is very preliminary and there are a number of
> >>>>>items still missing. Specifically in order to make this a more complete
> >>>>>solution we need to support:
> >>>>>1. Notifying hypervisor that drivers are dirtying DMA pages received
> >>>>>2. Bypassing page dirtying when it is not needed.
> >>>>>
> >>>>
> >>>>Shouldn't current log dirty mechanism already cover them?
> >>>
> >>>
> >>>The guest has no way of currently knowing that the hypervisor is doing
> >>>dirty page logging, and the log dirty mechanism currently has no way
> >>>of tracking device DMA accesses. This change is meant to bridge the
> >>>two so that the guest device driver will force the SWIOTLB DMA API to
> >>>mark pages written to by the device as dirty.
> >>
> >>
> >>OK. This is what we call the "dummy write mechanism". Actually, this is just
> >>a workaround until the IOMMU dirty bit is ready. Eventually, we need to
> >>change to use the hardware dirty bit. Besides, we may still lose data if DMA
> >>happens during or just before the stop-and-copy phase.
> >
> >Right, this is a "dummy write mechanism" in order to allow for entry
> >tracking. This only works completely if we force the hardware to
> >quiesce via a hot-plug event before we reach the stop-and-copy phase
> >of the migration.
> >
> >The IOMMU dirty bit approach is likely going to have a significant
> >number of challenges involved. Looking over the driver and the data
> >sheet it looks like the current implementation is using a form of huge
> >pages in the IOMMU, as such we will need to tear that down and replace
> >it with 4K pages if we don't want to dirty large regions with each DMA
>
> Yes, we need to split the huge page into small pages to get the small dirty
> range.
>
> >transaction, and I'm not sure that is something we can change while
> >DMA is active to the affected regions. In addition the data sheet
>
> What changes do you mean?
>
> >references the fact that the page table entries are stored in a
> >translation cache and in order to sync things up you have to
> >invalidate the entries. I'm not sure what the total overhead would be
> >for invalidating something like a half million 4K pages to migrate a
> >guest with just 2G of RAM, but I would think that might be a bit
>
> Do you mean the cost of submitting the flush request or the performance
> impact of IOTLB misses? For the former, we have domain-selective
> invalidation. For the latter, it would be acceptable since live migration
> shouldn't last too long.

That's pretty weak - if migration time is short and speed does not
matter during migration, then all this work is useless; temporarily
switching to a virtual card would be preferable.

> >expensive given the fact that IOMMU accesses aren't known for being
> >incredibly fast when invalidating DMA on the host.
> >
> >- Alex
> >
>
>
> --
> best regards
> yang
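
The "dummy write" idea discussed in the quoted text can be sketched in
user-space C. This is only an illustration under stated assumptions, not
the code from the patch set: every name below (sketch_mark_dirty,
sketch_dma_dir, SKETCH_PAGE_SIZE) is hypothetical, a 4K page size is
assumed, and the write-back is not atomic, so a real implementation would
have to make it safe against concurrent writes to the same byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here is taken from the patches. */
#define SKETCH_PAGE_SIZE 4096u  /* assumed 4K pages */

enum sketch_dma_dir {
    SKETCH_DMA_BIDIRECTIONAL,
    SKETCH_DMA_TO_DEVICE,
    SKETCH_DMA_FROM_DEVICE
};

/* Only directions in which the device may have written need dirtying. */
static int sketch_dir_needs_dirty(enum sketch_dma_dir dir)
{
    return dir == SKETCH_DMA_FROM_DEVICE ||
           dir == SKETCH_DMA_BIDIRECTIONAL;
}

/*
 * Touch one byte in every page that the range [buf, buf + len) overlaps,
 * writing back the value just read. The data is unchanged, but the CPU
 * performs a store, which is what CPU-side dirty logging can observe.
 * Returns the number of pages touched.
 */
static size_t sketch_mark_dirty(volatile unsigned char *buf, size_t len,
                                enum sketch_dma_dir dir)
{
    size_t off = 0, touched = 0;

    assert(buf != NULL || len == 0);

    if (!sketch_dir_needs_dirty(dir))
        return 0;

    while (off < len) {
        unsigned char v = buf[off];  /* volatile read */

        buf[off] = v;                /* volatile write of the same value */
        touched++;

        /* Advance to the first byte of the next page. */
        off += SKETCH_PAGE_SIZE -
               (size_t)(((uintptr_t)buf + off) & (SKETCH_PAGE_SIZE - 1));
    }
    return touched;
}
```

In this sketch a caller would invoke sketch_mark_dirty() at the points the
cover letter names (unmap and sync-for-cpu), and the call is a no-op for
the to-device direction. For scale, 2 GiB of guest RAM divided into 4 KiB
pages is 524,288 pages, which is the "half million" figure mentioned in
the quoted text.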
Thread overview: 37+ messages
2015-12-13 21:28 [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking Alexander Duyck
2015-12-13 21:28 ` [Qemu-devel] [RFC PATCH 1/3] swiotlb: Fold static unmap and sync calls into calling functions Alexander Duyck
2015-12-13 21:28 ` [Qemu-devel] [RFC PATCH 2/3] xen/swiotlb: " Alexander Duyck
2015-12-13 21:28 ` [Qemu-devel] [RFC PATCH 3/3] x86: Create dma_mark_dirty to dirty pages used for DMA by VM guest Alexander Duyck
2015-12-14 14:00 ` Michael S. Tsirkin
2015-12-14 16:34 ` Alexander Duyck
2015-12-14 17:20 ` Michael S. Tsirkin
2015-12-14 17:59 ` Alexander Duyck
2015-12-14 20:52 ` Michael S. Tsirkin
2015-12-14 22:32 ` Alexander Duyck
2015-12-14 2:27 ` [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking Yang Zhang
2015-12-14 4:54 ` Alexander Duyck
2015-12-14 5:22 ` Yang Zhang
2015-12-14 5:46 ` Alexander Duyck
2015-12-14 7:20 ` Yang Zhang
2015-12-14 14:02 ` Michael S. Tsirkin [this message]
2016-01-04 20:41 ` Konrad Rzeszutek Wilk
2016-01-05 3:11 ` Alexander Duyck
2016-01-05 9:40 ` Michael S. Tsirkin
2016-01-05 10:01 ` Dr. David Alan Gilbert
2016-01-05 10:35 ` Michael S. Tsirkin
2016-01-05 10:45 ` Dr. David Alan Gilbert
2016-01-05 10:59 ` Michael S. Tsirkin
2016-01-05 11:03 ` Dr. David Alan Gilbert
2016-01-05 11:11 ` Michael S. Tsirkin
2016-01-05 11:06 ` Michael S. Tsirkin
2016-01-05 11:05 ` Michael S. Tsirkin
2016-01-05 12:43 ` Dr. David Alan Gilbert
2016-01-05 13:16 ` Michael S. Tsirkin
2016-01-05 18:42 ` Konrad Rzeszutek Wilk
2016-01-05 16:18 ` Alexander Duyck
2016-06-06 9:18 ` Zhou Jie
2016-06-06 16:04 ` Alex Duyck
2016-06-09 10:14 ` Zhou Jie
2016-06-09 15:39 ` Alexander Duyck
2016-06-12 3:03 ` Zhou Jie
2016-06-13 1:28 ` Alexander Duyck