From: Marc Zyngier <maz@kernel.org>
To: Gavin Shan <gshan@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
catalin.marinas@arm.com, bgardon@google.com, shuah@kernel.org,
andrew.jones@linux.dev, will@kernel.org, dmatlack@google.com,
pbonzini@redhat.com, zhenyzha@redhat.com, shan.gavin@gmail.com,
james.morse@arm.com, suzuki.poulose@arm.com,
alexandru.elisei@arm.com, oliver.upton@linux.dev,
kvmarm@lists.linux.dev
Subject: Re: [PATCH v4 3/6] KVM: arm64: Enable ring-based dirty memory tracking
Date: Tue, 04 Oct 2022 16:45:09 +0100
Message-ID: <86czb78uey.wl-maz@kernel.org>
In-Reply-To: <8b82ef3d-16ab-0aee-b464-8ad9b3718028@redhat.com>
On Tue, 04 Oct 2022 05:26:23 +0100,
Gavin Shan <gshan@redhat.com> wrote:
[...]
> > Why another capability? Just allowing dirty logging to be enabled
> > before we save the GIC state should be enough, shouldn't it?
> >
>
> The GIC state would be just one case where no vcpu can be used to push
> dirty page information. As you mentioned, the SMMU HTTU feature could
> possibly be another such case on arm64, and it's unclear what other
> cases exist on architectures where the dirty ring will be supported. In
> QEMU, dirty (bitmap) logging is enabled at the beginning of migration
> and the bitmap is gradually synchronized into the global dirty bitmap
> and each RAMBlock's dirty bitmap, as the following backtrace shows.
> What QEMU probably needs to do is retrieve the bitmap at point (A).
>
> Without the new capability, we would have to rely on the return values
> of the KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG ioctls to detect the
> feature. For example, -ENXIO is returned on old kernels.
Huh. Fair enough.
KVM_CAP_ALLOW_DIRTY_LOG_AND_DIRTY_RING_TOGETHER_UNTIL_THE_NEXT_TIME...
>
> migration_thread
> qemu_savevm_state_setup
> ram_save_setup
> ram_init_all
> ram_init_bitmaps
> memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION) // dirty logging enabled
> migration_bitmap_sync_precopy(rs)
> :
> migration_iteration_run // iteration 0
> qemu_savevm_state_pending
> migration_bitmap_sync_precopy
> qemu_savevm_state_iterate
> ram_save_iterate
> migration_iteration_run // iteration 1
> qemu_savevm_state_pending
> migration_bitmap_sync_precopy
> qemu_savevm_state_iterate
> ram_save_iterate
> migration_iteration_run // iteration 2
> qemu_savevm_state_pending
> migration_bitmap_sync_precopy
> qemu_savevm_state_iterate
> ram_save_iterate
> :
> migration_iteration_run // iteration N
> qemu_savevm_state_pending
> migration_bitmap_sync_precopy
> migration_completion
> qemu_savevm_state_complete_precopy
> qemu_savevm_state_complete_precopy_iterable
> ram_save_complete
> migration_bitmap_sync_precopy // A
> <send all dirty pages>
>
> Note: for post-copy and snapshot, I assume we need to save the dirty bitmap
> in the last synchronization, right after the VM is stopped.
Not only must the VM be stopped, but the devices must also be made quiescent.
> >> If all of us agree on this, I can send another kernel patch to address
> >> this. QEMU still needs more patches so that the feature can be supported.
> >
> > Yes, this will also need some work.
> >
> >>>>
> >>>> To me, this is just a relaxation of an arbitrary limitation, as the
> >>>> current assumption that only vcpus can dirty memory doesn't hold at
> >>>> all.
> >>>
> >>> The initial dirty ring proposal had a per-vm ring, but after we
> >>> investigated x86 we found that all legal dirty paths have a vcpu
> >>> context (except one outlier in kvmgt, which was fixed within itself),
> >>> so we dropped the per-vm ring.
> >>>
> >>> One thing to mention is that DMAs should not count in this case,
> >>> because they happen from the device's perspective: IOMMU or SMMU dirty
> >>> tracking should be reported to userspace through the device driver
> >>> (e.g. vfio with VFIO_IOMMU_DIRTY_PAGES), not through KVM interfaces.
> >>> That even includes emulated DMA like vhost (VHOST_SET_LOG_BASE).
> >>>
> >>
> >> Thanks to Peter for mentioning the per-vm ring's history. As I said above,
> >> let's use the bitmap instead if all of us agree.
> >>
> >> If I'm correct, Marc may be talking about the SMMU, which is emulated in
> >> the host instead of QEMU. In this case, the DMA target pages are similar
> >> to the pages backing the vgic/its tables. Both sets of pages are
> >> invisible to QEMU.
> >
> > No, I'm talking about an actual HW SMMU using the HTTU feature, which
> > sets the Dirty bit in the PTEs. And people have been working on sharing
> > SMMU and CPU page tables for some time, which would give us the one true
> > source of dirty pages.
> >
> > In this configuration, the dirty ring mechanism will be pretty useless.
> >
>
> Ok, I don't know the details. Marc, is the dirty bitmap helpful in this case?
Yes, the dirty bitmap is useful if the source of dirty bits is
obtained from the page tables. The cost of collecting/resetting the
bits is pretty high though.
M.
--
Without deviation from the norm, progress is not possible.
Thread overview: 28+ messages
2022-09-27 0:54 [PATCH v4 0/6] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-09-27 0:54 ` [PATCH v4 1/6] KVM: x86: Introduce KVM_REQ_RING_SOFT_FULL Gavin Shan
2022-09-27 10:26 ` Marc Zyngier
2022-09-27 11:31 ` Gavin Shan
2022-09-27 16:00 ` Peter Xu
2022-09-27 0:54 ` [PATCH v4 2/6] KVM: x86: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Gavin Shan
2022-09-27 16:00 ` Peter Xu
2022-09-27 0:54 ` [PATCH v4 3/6] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-09-27 16:02 ` Peter Xu
2022-09-27 17:32 ` Marc Zyngier
2022-09-27 18:21 ` Peter Xu
2022-09-27 23:47 ` Gavin Shan
2022-09-28 8:25 ` Marc Zyngier
2022-09-28 14:52 ` Peter Xu
2022-09-29 9:50 ` Gavin Shan
2022-09-29 11:31 ` Gavin Shan
2022-09-29 14:44 ` Marc Zyngier
2022-09-29 14:32 ` Peter Xu
2022-09-30 9:28 ` Marc Zyngier
2022-09-29 14:42 ` Marc Zyngier
2022-10-04 4:26 ` Gavin Shan
2022-10-04 13:26 ` Peter Xu
2022-10-04 15:45 ` Marc Zyngier [this message]
2022-09-29 14:34 ` Marc Zyngier
2022-09-27 0:54 ` [PATCH v4 4/6] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Gavin Shan
2022-09-27 0:54 ` [PATCH v4 5/6] KVM: selftests: Clear dirty ring states between two modes " Gavin Shan
2022-09-27 0:54 ` [PATCH v4 6/6] KVM: selftests: Automate choosing dirty ring size " Gavin Shan
2022-09-27 10:30 ` [PATCH v4 0/6] KVM: arm64: Enable ring-based dirty memory tracking Marc Zyngier