From: Marc Zyngier <maz@kernel.org>
To: Keqian Zhu <zhukeqian1@huawei.com>
Cc: linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
Catalin Marinas <catalin.marinas@arm.com>,
James Morse <james.morse@arm.com>, Will Deacon <will@kernel.org>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Julien Thierry <julien.thierry.kdev@gmail.com>,
Mark Brown <broonie@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Alexios Zavras <alexios.zavras@intel.com>,
wanghaibin.wang@huawei.com, zhengxiang9@huawei.com
Subject: Re: [RFC PATCH 0/7] kvm: arm64: Support stage2 hardware DBM
Date: Mon, 25 May 2020 16:44:53 +0100
Message-ID: <4b8a939172395bf38e581634abecf925@kernel.org>
In-Reply-To: <20200525112406.28224-1-zhukeqian1@huawei.com>
On 2020-05-25 12:23, Keqian Zhu wrote:
> This patch series add support for stage2 hardware DBM, and it is only
> used for dirty log for now.
>
> It works well under some migration test cases, including VM with 4K
> pages or 2M THP. I checked the SHA256 hash digest of all memory and
> they keep same for source VM and destination VM, which means no dirty
> pages is missed under hardware DBM.
>
> However, there are some known issues not solved.
>
> 1. Some mechanisms that rely on "write permission fault" become invalid,
>    such as kvm_set_pfn_dirty and "mmap page sharing".
>
>    kvm_set_pfn_dirty is called in user_mem_abort when guest issues write
>    fault. This guarantees physical page will not be dropped directly when
>    host kernel recycle memory. After using hardware dirty management, we
>    have no chance to call kvm_set_pfn_dirty.
Then you will end up with memory corruption under memory pressure.
This also breaks things like CoW, which we depend on.
>
>    For "mmap page sharing" mechanism, host kernel will allocate a new
>    physical page when guest writes a page that is shared with other page
>    table entries. After using hardware dirty management, we have no chance
>    to do this too.
>
>    I need to do some survey on how stage1 hardware DBM solve these problems.
>    It helps if anyone can figure it out.
>
> 2. Page Table Modification Races: Though I have found and solved some data
>    races when kernel changes page table entries, I still doubt that there
>    are data races I am not aware of. It's great if anyone can figure
>    them out.
>
> 3. Performance: Under Kunpeng 920 platform, for every 64GB memory, KVM
>    consumes about 40ms to traverse all PTEs to collect dirty log. It will
>    cause unbearable downtime for migration if memory size is too big. I will
>    try to solve this problem in Patch v1.
This, in my opinion, is why Stage-2 DBM is fairly useless.

From a performance perspective, this is the worst possible
situation. You end up continuously scanning page tables, at
an arbitrary rate, without a way to evaluate the fault rate.
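
To put the quoted figure in perspective, here is a rough sketch of how the scan cost grows with guest memory. It assumes the cost is linear in memory size, which the thread does not state (the only data point given is about 40ms per 64GB on Kunpeng 920), and the PTE count assumes everything is mapped with 4K pages. The function names are made up for the illustration.

```python
# Data point quoted in the thread: about 40 ms to walk the PTEs for 64 GB.
MS_PER_64GB = 40.0
GB = 1 << 30


def scan_time_ms(mem_bytes, ms_per_64gb=MS_PER_64GB):
    """Estimated time for one full PTE walk.

    Assumption: the cost scales linearly with the amount of guest memory.
    """
    return ms_per_64gb * mem_bytes / (64 * GB)


def ptes_4k(mem_bytes):
    """Number of last-level entries, assuming 4K pages everywhere."""
    return mem_bytes // 4096


if __name__ == "__main__":
    for size_gb in (64, 256, 1024):
        mem = size_gb * GB
        print(f"{size_gb:5d} GB: {ptes_4k(mem):>11d} PTEs, "
              f"~{scan_time_ms(mem):.0f} ms per scan")
```

Under these assumptions every dirty log sync pays the full walk, whether one page was written or all of them.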

One thing S2-DBM would be useful for is SVA, where a device
write would mark the S2 PTs dirty as they are shared between
CPU and SMMU. Another thing is SPE, which is essentially a DMA
agent using the CPU's PTs.

But on its own, and just to log the dirty pages, S2-DBM is
pretty rubbish. I wish arm64 had something like Intel's PML,
which looks far more interesting for the purpose of tracking
accesses.

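
A toy Python model of the difference between scanning for dirty bits and reading a log of dirtied pages may help here. It is only a sketch: `ScanTracker` and `LogTracker` are made-up names, and real PML uses a fixed-size hardware buffer with a VM exit when it fills, which is omitted. Each `collect()` returns the dirty pages together with the number of entries that had to be visited to find them.

```python
class ScanTracker:
    """Dirty bits live in the page table; collecting means visiting every entry."""

    def __init__(self, num_pages):
        self.dirty = [False] * num_pages

    def write(self, page):
        # Hardware sets the bit silently: no fault, no notification.
        self.dirty[page] = True

    def collect(self):
        visited = len(self.dirty)  # every entry is inspected
        pages = [i for i, d in enumerate(self.dirty) if d]
        self.dirty = [False] * visited
        return pages, visited


class LogTracker:
    """The first write to a page appends it to a log; collecting reads the log."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.seen = set()
        self.log = []

    def write(self, page):
        if page not in self.seen:  # only the clean -> dirty transition is logged
            self.seen.add(page)
            self.log.append(page)

    def collect(self):
        visited = len(self.log)  # only logged entries are inspected
        pages = sorted(self.log)
        self.seen.clear()
        self.log = []
        return pages, visited


if __name__ == "__main__":
    scan, log = ScanTracker(1000), LogTracker(1000)
    for page in (3, 7, 3):
        scan.write(page)
        log.write(page)
    print(scan.collect())
    print(log.collect())
```

In this model the scan cost depends on the size of the page table, while the log cost depends on how many pages were actually dirtied.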
Thanks,
M.
--
Jazz is not dead. It just smells funny...