From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA9ED1FA1 for ; Tue, 26 Sep 2023 16:37:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 094E1C433C7; Tue, 26 Sep 2023 16:37:52 +0000 (UTC) Date: Tue, 26 Sep 2023 17:37:50 +0100 From: Catalin Marinas To: Shameerali Kolothum Thodi Cc: Oliver Upton , "kvmarm@lists.linux.dev" , "kvm@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" , "maz@kernel.org" , "will@kernel.org" , "james.morse@arm.com" , "suzuki.poulose@arm.com" , yuzenghui , zhukeqian , Jonathan Cameron , Linuxarm Subject: Re: [RFC PATCH v2 3/8] KVM: arm64: Add some HW_DBM related pgtable interfaces Message-ID: References: <20230825093528.1637-1-shameerali.kolothum.thodi@huawei.com> <20230825093528.1637-4-shameerali.kolothum.thodi@huawei.com> Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Tue, Sep 26, 2023 at 03:52:19PM +0000, Shameerali Kolothum Thodi wrote: > From: Catalin Marinas [mailto:catalin.marinas@arm.com] > > On Mon, Sep 25, 2023 at 08:04:39AM +0000, Shameerali Kolothum Thodi wrote: > > > From: Oliver Upton [mailto:oliver.upton@linux.dev] > > > > In both cases the 'old' translation should have DBM cleared. Even if the > > > > PTE were dirty, this is wasted work since we need to do a final scan of > > > > the stage-2 when userspace collects the dirty log. > > > > > > > > Am I missing something? > > > > > > I think we can get rid of the above mark_page_dirty(). I will test it to confirm > > > we are not missing anything here. > > > > Is this the case for the other places of mark_page_dirty() in your > > patches? If stage2_pte_writeable() is true, it must have been made > > writeable earlier by a fault and the underlying page marked as dirty. > > > > One of the other place we have mark_page_dirty() is in the stage2_unmap_walker(). > And during the testing of this series, I have tried to remove that and > found out that it actually causes memory corruption during VM migration. > > From my old debug logs: > > [ 399.288076] stage2_unmap_walker+0x270/0x284 > [ 399.288078] __kvm_pgtable_walk+0x1ec/0x2cc > [ 399.288081] __kvm_pgtable_walk+0xec/0x2cc > [ 399.288084] __kvm_pgtable_walk+0xec/0x2cc > [ 399.288086] kvm_pgtable_walk+0xcc/0x160 > [ 399.288088] kvm_pgtable_stage2_unmap+0x4c/0xbc > [ 399.288091] stage2_apply_range+0xd0/0xec > [ 399.288094] __unmap_stage2_range+0x2c/0x60 > [ 399.288096] kvm_unmap_gfn_range+0x30/0x48 > [ 399.288099] kvm_mmu_notifier_invalidate_range_start+0xe0/0x264 > [ 399.288103] __mmu_notifier_invalidate_range_start+0xa4/0x23c > [ 399.288106] change_protection+0x638/0x900 > [ 399.288109] change_prot_numa+0x64/0xfc > [ 399.288113] task_numa_work+0x2ac/0x450 > [ 399.288117] task_work_run+0x78/0xd0 > > It looks like that the unmap path gets triggered from Numa page migration code > path, so we may need it there. I think I confused things (should have looked at the code). mark_page_dirty() is actually about marking the bitmap for dirty tracking rather than the actual struct page. The latter is hidden somewhere deep down the get_user_pages() code. So yeah, I think we still need mark_page_dirty() in some places as to avoid losing the dirty information with DBM being turned on. -- Catalin From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1CB19E7E64E for ; Tue, 26 Sep 2023 16:38:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=saq8dVeAK1lgSW1PifSHgbw4WXUm8CPovGBZ9sMJ+jQ=; b=gIobB+65FKVK+t dUwETr5UfCWjUeVAaGW62F5rYM5KTcfBPiS0BC44Ka30T8hgoBf8toW+jTv9/1j2IYe8NBDuqfUF2 y0KaYcO/i2pknxpMToDk+R+sBl+++QmpM3c1z8KsgyGaFRKLliu17lvkzyuUjVy03Po4q37M5ig7X obgJeuM8sY7Zvt14gc2ausXPflJTz+QYJgv3PL4Yd/E8VeiKSBkhJlUycSedrewSGq16i5ObJqLta TdmFowYk+mFAsahlH7js+uvVfU/aeNSwYa6jxeAbjc+bkmiNEfEtXrdyMMTiBiBheaIouiu7nO0bd wr10qBJW/v1/KvH8rGdQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qlB4T-00GjEi-1Y; Tue, 26 Sep 2023 16:38:01 +0000 Received: from sin.source.kernel.org ([2604:1380:40e1:4800::1]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qlB4P-00GjE8-2U for linux-arm-kernel@lists.infradead.org; Tue, 26 Sep 2023 16:37:59 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sin.source.kernel.org (Postfix) with ESMTP id EF4DFCE1724; Tue, 26 Sep 2023 16:37:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 094E1C433C7; Tue, 26 Sep 2023 16:37:52 +0000 (UTC) Date: Tue, 26 Sep 2023 17:37:50 +0100 From: Catalin Marinas To: Shameerali Kolothum Thodi Cc: Oliver Upton , "kvmarm@lists.linux.dev" , "kvm@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" , "maz@kernel.org" , "will@kernel.org" , "james.morse@arm.com" , "suzuki.poulose@arm.com" , yuzenghui , zhukeqian , Jonathan Cameron , Linuxarm Subject: Re: [RFC PATCH v2 3/8] KVM: arm64: Add some HW_DBM related pgtable interfaces Message-ID: References: <20230825093528.1637-1-shameerali.kolothum.thodi@huawei.com> <20230825093528.1637-4-shameerali.kolothum.thodi@huawei.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230926_093757_996662_362A1A4B X-CRM114-Status: GOOD ( 26.83 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Tue, Sep 26, 2023 at 03:52:19PM +0000, Shameerali Kolothum Thodi wrote: > From: Catalin Marinas [mailto:catalin.marinas@arm.com] > > On Mon, Sep 25, 2023 at 08:04:39AM +0000, Shameerali Kolothum Thodi wrote: > > > From: Oliver Upton [mailto:oliver.upton@linux.dev] > > > > In both cases the 'old' translation should have DBM cleared. Even if the > > > > PTE were dirty, this is wasted work since we need to do a final scan of > > > > the stage-2 when userspace collects the dirty log. > > > > > > > > Am I missing something? > > > > > > I think we can get rid of the above mark_page_dirty(). I will test it to confirm > > > we are not missing anything here. > > > > Is this the case for the other places of mark_page_dirty() in your > > patches? If stage2_pte_writeable() is true, it must have been made > > writeable earlier by a fault and the underlying page marked as dirty. > > > > One of the other place we have mark_page_dirty() is in the stage2_unmap_walker(). > And during the testing of this series, I have tried to remove that and > found out that it actually causes memory corruption during VM migration. > > From my old debug logs: > > [ 399.288076] stage2_unmap_walker+0x270/0x284 > [ 399.288078] __kvm_pgtable_walk+0x1ec/0x2cc > [ 399.288081] __kvm_pgtable_walk+0xec/0x2cc > [ 399.288084] __kvm_pgtable_walk+0xec/0x2cc > [ 399.288086] kvm_pgtable_walk+0xcc/0x160 > [ 399.288088] kvm_pgtable_stage2_unmap+0x4c/0xbc > [ 399.288091] stage2_apply_range+0xd0/0xec > [ 399.288094] __unmap_stage2_range+0x2c/0x60 > [ 399.288096] kvm_unmap_gfn_range+0x30/0x48 > [ 399.288099] kvm_mmu_notifier_invalidate_range_start+0xe0/0x264 > [ 399.288103] __mmu_notifier_invalidate_range_start+0xa4/0x23c > [ 399.288106] change_protection+0x638/0x900 > [ 399.288109] change_prot_numa+0x64/0xfc > [ 399.288113] task_numa_work+0x2ac/0x450 > [ 399.288117] task_work_run+0x78/0xd0 > > It looks like that the unmap path gets triggered from Numa page migration code > path, so we may need it there. I think I confused things (should have looked at the code). mark_page_dirty() is actually about marking the bitmap for dirty tracking rather than the actual struct page. The latter is hidden somewhere deep down the get_user_pages() code. So yeah, I think we still need mark_page_dirty() in some places as to avoid losing the dirty information with DBM being turned on. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel