From: Leonardo Bras
To: Wei-Lin Chang
Cc: Leonardo Bras, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, Marc Zyngier,
    Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose, Zenghui Yu,
    Catalin Marinas, Will Deacon, Gavin Shan
Subject: Re: [PATCH 1/2] KVM: arm64: Replace memslot_is_logging() with kvm_slot_dirty_track_enabled()
Date: Tue, 9 Jun 2026 17:31:01 +0100
References: <20260605153248.2412064-1-weilin.chang@arm.com> <20260605153248.2412064-2-weilin.chang@arm.com>

On Mon, Jun 08, 2026 at 04:55:45PM +0100, Leonardo Bras wrote:
> Hi Wei Lin,
>
> On Fri, Jun 05, 2026 at 04:32:47PM +0100, Wei-Lin Chang wrote:
> > When checking whether a memslot has dirty logging enabled, the
> > KVM_MEM_LOG_DIRTY_PAGES flag is the source of truth. Previously we were
> > using memslot_is_logging() which only tests dirty bitmap and did not
> > consider dirty ring.
> > This was not detected because
> > KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP was introduced together with KVM
> > arm64 dirty ring, and users need to enable it to ensure dirty
> > information is not lost for the case of VGIC LPI/ITS table changes.
> >
> > Fix this by using kvm_slot_dirty_track_enabled() instead which checks
> > KVM_MEM_LOG_DIRTY_PAGES.
> >
> > Note that memslot_is_logging() also treats a memslot as not logging if
> > KVM_MEM_READONLY is set, hence a memslot with both dirty logging and
> > read only would be seen as not logging for memslot_is_logging(), but
> > logging for kvm_slot_dirty_track_enabled(). This allows a read only
> > mapping of size > PAGE_SIZE to be built when memslot_is_logging() is
> > used, leading to a better read performance compared to
> > kvm_slot_dirty_track_enabled(). However memslots that have both
> > KVM_MEM_LOG_DIRTY_PAGES and KVM_MEM_READONLY set do not really make
> > sense as dirty logging is essentially nop for a read only memslot, so
> > this shouldn't affect real workloads much.
> >
> It worries me a bit that we are ignoring the KVM_MEM_READONLY flag...
> I have not yet gone through the whole s2_mmu code but IIUC we can have
> scenarios on which a memslot can be read-only and have dirty-logging
> enabled.
> If a memslot is not faulted yet, IIUC it is marked as read-only
> (so it can be mapped on write fault), and we can have dirty-logging
> enabled for it as well (as the VMM has no idea).
>

Ignore above bit, I confused memslot with block/page entry.

Looking a bit more, my viewpoint is that:
- Due to dirty_ring, checking memslot.dirty_bitmap should be done only to
  detect the existence of a dirty_bitmap, not the migration process.
- This changes how detection works, in regard to read-only blocks:
  memslot_is_logging()           -> Checks dirty-bitmap + read-only memslot
  kvm_slot_dirty_track_enabled() -> Checks only memslot flag
- As a simpler change, we could have:

~~~
-	return memslot->dirty_bitmap && !(memslot->flags & KVM_MEM_READONLY);
+	return kvm_slot_dirty_track_enabled(memslot) && !(memslot->flags & KVM_MEM_READONLY);
~~~

Both are checking memslot->flags, so the compiler will probably optimize
it to:

~~~
	return (memslot->flags & 3) == 1;
~~~

My main worry was that the current patch changes the behavior of skipping
read-only memslots. So going through the users, we can see:

> >
> > Fixes: 9cb1096f8590 ("KVM: arm64: Enable ring-based dirty memory tracking")
> > Signed-off-by: Wei-Lin Chang
> > ---
> > It took me a long investigation to acquire the context needed to
> > understand this change, however the reason for this problem not being
> > detected is an educated guess. Please let me know if this is wrong or
> > if there are other issues, thanks!
> >
> >  arch/arm64/kvm/mmu.c | 11 +++--------
> >  1 file changed, 3 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 4da9281312eb..06c46124d3e7 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -161,11 +161,6 @@ static int kvm_mmu_split_huge_pages(struct kvm *kvm, phys_addr_t addr,
> >  	return ret;
> >  }
> >
> > -static bool memslot_is_logging(struct kvm_memory_slot *memslot)
> > -{
> > -	return memslot->dirty_bitmap && !(memslot->flags & KVM_MEM_READONLY);
> > -}
> > -
> >  /**
> >   * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> >   * @kvm: pointer to kvm structure.
> > @@ -1748,7 +1743,7 @@ static short kvm_s2_resolve_vma_size(const struct kvm_s2_fault_desc *s2fd,
> >  {
> >  	short vma_shift;
> >
> > -	if (memslot_is_logging(s2fd->memslot)) {
> > +	if (kvm_slot_dirty_track_enabled(s2fd->memslot)) {
> >  		s2vi->max_map_size = PAGE_SIZE;
> >  		vma_shift = PAGE_SHIFT;
> >  	} else {

In the case where dirty_track is enabled on a read-only slot, it will
resolve to a smaller vma_size, so the fault granule will be smaller here.

This could be bad for performance, so maybe we could add a check for a
read-only block here:

~~~
-	if (memslot_is_logging(s2fd->memslot)) {
+	if (kvm_slot_dirty_track_enabled(s2fd->memslot) &&
+	    !memslot_is_readonly(s2fd->memslot)) {
~~~

> > @@ -1953,7 +1948,7 @@ static int kvm_s2_fault_compute_prot(const struct kvm_s2_fault_desc *s2fd,
> >  	*prot = KVM_PGTABLE_PROT_R;
> >
> >  	if (s2vi->map_writable && (s2vi->device ||
> > -				   !memslot_is_logging(s2fd->memslot) ||
> > +				   !kvm_slot_dirty_track_enabled(s2fd->memslot) ||
> >  				   kvm_is_write_fault(s2fd->vcpu)))
> >  		*prot |= KVM_PGTABLE_PROT_W;
> >

In the same scenario (dirty_track enabled on a read-only memslot):
This one should be safe, as kvm_is_write_fault() will check if the memslot
is readonly and return false in this case. But then it will have to
actually call kvm_is_write_fault(), whereas the previous version would not
even call it in that scenario. Not sure how that would impact performance,
though.

> > @@ -2084,7 +2079,7 @@ static int user_mem_abort(const struct kvm_s2_fault_desc *s2fd)
> >  	 * and a write fault needs to collapse a block entry into a table.
> >  	 */
> >  	memcache = get_mmu_memcache(s2fd->vcpu);
> > -	if (!perm_fault || (memslot_is_logging(s2fd->memslot) &&
> > +	if (!perm_fault || (kvm_slot_dirty_track_enabled(s2fd->memslot) &&
> >  			    kvm_is_write_fault(s2fd->vcpu))) {
> >  		ret = topup_mmu_memcache(s2fd->vcpu, memcache);
> >  		if (ret)

Same thing: if the memslot is tracking and is readonly, topup_*() would be
called with the new patch, but not with the old behavior.
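
To make the behavior difference concrete, here is a small userspace sketch
(not kernel code) of the predicates discussed above. The struct slot type
and the helper names old_memslot_is_logging(), dirty_track_enabled(),
track_enabled_and_writable() and its _mask variant are hypothetical
stand-ins for illustration; only the two flag values come from the KVM UAPI
header. It also assumes a ring-only setup leaves dirty_bitmap NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in include/uapi/linux/kvm.h. */
#define KVM_MEM_LOG_DIRTY_PAGES (1U << 0)
#define KVM_MEM_READONLY        (1U << 1)

/* Hypothetical, simplified stand-in for struct kvm_memory_slot. */
struct slot {
	unsigned int flags;
	unsigned long *dirty_bitmap;	/* assumed NULL when only the ring is used */
};

/* Old arm64 helper: requires a dirty bitmap and a writable slot. */
static bool old_memslot_is_logging(const struct slot *s)
{
	return s->dirty_bitmap && !(s->flags & KVM_MEM_READONLY);
}

/* What the patch switches to: only the memslot flag is checked. */
static bool dirty_track_enabled(const struct slot *s)
{
	return s->flags & KVM_MEM_LOG_DIRTY_PAGES;
}

/* The alternative suggested in this reply: flag set and slot writable. */
static bool track_enabled_and_writable(const struct slot *s)
{
	return dirty_track_enabled(s) && !(s->flags & KVM_MEM_READONLY);
}

/* Same check written as a single mask-and-compare, i.e. (flags & 3) == 1. */
static bool track_enabled_and_writable_mask(const struct slot *s)
{
	return (s->flags & (KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY)) ==
	       KVM_MEM_LOG_DIRTY_PAGES;
}

/* The two-test form and the mask form agree for all four flag combinations. */
static void check_mask_equivalence(void)
{
	for (unsigned int flags = 0; flags < 4; flags++) {
		struct slot s = { .flags = flags, .dirty_bitmap = NULL };

		assert(track_enabled_and_writable(&s) ==
		       track_enabled_and_writable_mask(&s));
	}
}
```

With both flags set, the old helper and the combined check say "not
logging" while the flag-only check says "logging". With the flag set and no
bitmap (ring only), the old helper misses the slot but both flag-based
checks catch it.
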

All of that depends on how the VMM uses dirty_tracking: does it enable it
for all memory, or only for memory that is writable?

I could not find anything that would prevent the user from enabling
dirty_tracking on read-only memslots, so we can either ignore this
scenario, apply those patches and let those users carry the extra
overhead, or do an extra test to make sure it's doing the same thing as
before.

Thanks!
Leo