From: Jason Wang <jasowang@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
Rob Miller <rob.miller@broadcom.com>
Cc: Virtio-Dev <virtio-dev@lists.oasis-open.org>
Subject: Re: [virtio-dev] Dirty Page Tracking (DPT)
Date: Mon, 9 Mar 2020 16:50:43 +0800
Message-ID: <ff63b1e2-e4aa-1b13-0b4a-72fd23badc06@redhat.com>
In-Reply-To: <20200309030251-mutt-send-email-mst@kernel.org>
On 2020/3/9 3:38 PM, Michael S. Tsirkin wrote:
> On Fri, Mar 06, 2020 at 10:40:13AM -0500, Rob Miller wrote:
>> I understand that DPT isn't really on the forefront of the vDPA framework, but
>> wanted to understand if there are any initial thoughts on how this would work...
> And judging by the next few chapters, you are actually
> talking about vhost pci, right?
>
>> In the migration framework, in its simplest form, (I gather) it's QEMU via KVM
>> that is reading the dirty page table, converting bits to page numbers, then
>> flushing remote VM/copying local page(s)->remote VM, etc.
>>
>> While this is fine for a VM (say VM1) dirtying its own memory and the accesses
>> are trapped in the kernel as well as the log is being updated, I'm not sure
>> what happens in the situation of vhost, where a remote VM (say VM2) is dirtying
>> up VM1's memory since it can directly access it, during packet reception for
>> example.
>> Whatever technique is employed to catch this, how would this differ from a HW
>> based Virtio device doing DMA directly into a VM's DDR, wrt DPT? Is QEMU
>> going to have a 2nd place to query the dirty logs - ie: the vDPA layer?
> I don't think anyone has a good handle on the vhost pci migration yet.
> But I think a reasonable way to handle that would be to
> activate dirty tracking in VM2's QEMU.
>
> And then VM2's QEMU would periodically copy the bits to the log - does
> this sound right?
>
>> Further I heard about a SW based DPT within the vDPA framework for those
>> devices that do not (yet) support DPT inherently in HW. How is this envisioned
>> to work?
> What I am aware of is simply switching to a software virtio
> for the duration of migration. The software can be pretty simple
> since the formats match: just copy available entries to device ring,
> and for used entries, see a used ring entry, mark page
> dirty and then copy used entry to guest ring.
That looks more heavyweight than, e.g., just relaying the used ring (as
DPDK did), I believe?
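To make the used-ring relay concrete, here is a minimal sketch of the idea, not the DPDK or QEMU code. It assumes 4 KiB pages, a bit-per-page log, and one contiguous device-writable buffer per descriptor chain; `mark_dirty`, `relay_used` and the `buffers` table are hypothetical names used only for illustration.

```python
from collections import deque

PAGE_SHIFT = 12  # assumption: 4 KiB guest pages


def mark_dirty(log: bytearray, gpa: int, length: int) -> None:
    """Set one bit per guest page touched by [gpa, gpa + length)."""
    if length == 0:
        return
    first = gpa >> PAGE_SHIFT
    last = (gpa + length - 1) >> PAGE_SHIFT
    for pfn in range(first, last + 1):
        log[pfn // 8] |= 1 << (pfn % 8)


def relay_used(device_used: deque, guest_used: list,
               buffers: dict, log: bytearray) -> None:
    """Forward used entries from the device ring to the guest ring.

    device_used holds (head_id, written_len) pairs produced by the device.
    buffers maps head_id to the guest-physical address of the writable
    buffer (hypothetical bookkeeping kept by the relay).
    """
    while device_used:
        head, written = device_used.popleft()
        # Log the write before the guest can see the used entry, so a
        # migration pass never misses a page the guest already consumed.
        mark_dirty(log, buffers[head], written)
        guest_used.append((head, written))


log = bytearray(4)
device_used = deque([(0, 100), (1, 5000)])
guest_used = []
relay_used(device_used, guest_used, {0: 0x1000, 1: 0x3800}, log)
```

Only the used side is intercepted here; the available ring can stay shared with the device, which is why this should be lighter than relaying both directions.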
>
>
> Another approach that I proposed and was prototyped at some point by
> Alex Duyck is guest driver touching the page in question before
> processing it within guest e.g. by an atomic xor with 0.
> Sounds attractive but didn't perform all that well.
Intel posted an i40e software solution that traps queue tail/head writes.
But I'm not sure it's good enough.
https://lore.kernel.org/kvm/20191206082232.GH31791@joy-OptiPlex-7040/
>
>
>> Finally, for those HW vendors that do support DPT in HW, a mapping of a bit ->
>> page isn't really an option, since no one wants to do a byte wide
>> read-modify-write across the PCI bus, but rather map a whole byte to page is
>> likely more desirable - the HW can just do non-posted writes to the dirty page
>> table. If byte wise, then the QEMU/vDPA layer has to either fix-up the mapping
>> (from byte->bit) or have the capability to handle the granularity diffs.
>>
>> Thoughts?
>>
>> Rob Miller
>> rob.miller@broadcom.com
>> (919)721-3339
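On the granularity question, the byte->bit fix-up could look roughly like the sketch below. It assumes the hardware writes any non-zero byte per dirty page and that the log uses one bit per page with the least significant bit first within each byte; both are assumptions, and `bytes_to_bitmap` and `dirty_pfns` are hypothetical helper names, not an existing QEMU or vDPA interface.

```python
def bytes_to_bitmap(byte_map: bytes) -> bytearray:
    """Pack a byte-per-page dirty table into a bit-per-page log."""
    bitmap = bytearray((len(byte_map) + 7) // 8)
    for pfn, flag in enumerate(byte_map):
        if flag:  # any non-zero byte counts as dirty (assumption)
            bitmap[pfn // 8] |= 1 << (pfn % 8)
    return bitmap


def dirty_pfns(bitmap: bytes) -> list:
    """Convert set bits back to page frame numbers, in ascending order."""
    return [index * 8 + bit
            for index, value in enumerate(bitmap)
            for bit in range(8)
            if (value >> bit) & 1]


# Pages 1, 3 and 8 were written by the device.
hw_table = bytes([0, 1, 0, 0xFF, 0, 0, 0, 0, 1])
log = bytes_to_bitmap(hw_table)
pages = dirty_pfns(log)
```

The device only ever does plain byte writes in this scheme, so the read-modify-write happens on the host side during the conversion.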
> If using an IOMMU, DPT can also be done using either PRI or dirty bit in
> a PTE. PRI is an interrupt so it can kick off a thread to set bits in
> the log I guess, but if it's the dirty bit then I don't think there's an
> interrupt. And a polling thread does not sound attractive. I guess
> we'll need a new interface to notify VDPA that QEMU is looking for dirty
> logs, and then VDPA can send them to QEMU in some way. Will probably be
> good enough to support vendor specific logging interfaces, too. I don't
> actually have hardware which supports either so actually coding it up is
> not yet practical.
Yes, both PRI and the PTE dirty bit require special hardware support. We
can extend the vDPA API to support both. For page faults, probably just an
IOMMU page fault handler.
>
> Further, at my KVM forum presentation I proposed a virtio-specific
> pagefault handling interface. If there's a wish to standardize and
> implement that, let me know and I will try to write this up in a more
> formal way.
Besides page faults, if we want virtio to be more like vhost, we also need
to formalize device state fetching, e.g. the per-vq index.
Thanks