From: Catalin Marinas <catalin.marinas@arm.com>
To: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, maz@kernel.org,
	will@kernel.org, oliver.upton@linux.dev, james.morse@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	zhukeqian1@huawei.com, jonathan.cameron@huawei.com,
	linuxarm@huawei.com
Subject: Re: [RFC PATCH v2 3/8] KVM: arm64: Add some HW_DBM related pgtable interfaces
Date: Fri, 22 Sep 2023 16:24:11 +0100
Message-ID: <ZQ2xmzZ0H5v5wDSw@arm.com>
In-Reply-To: <20230825093528.1637-4-shameerali.kolothum.thodi@huawei.com>

On Fri, Aug 25, 2023 at 10:35:23AM +0100, Shameer Kolothum wrote:
> +static bool stage2_pte_writeable(kvm_pte_t pte)
> +{
> +	return pte & KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
> +}
> +
> +static void kvm_update_hw_dbm(const struct kvm_pgtable_visit_ctx *ctx,
> +			      kvm_pte_t new)
> +{
> +	kvm_pte_t old_pte, pte = ctx->old;
> +
> +	/* Only set DBM if page is writeable */
> +	if ((new & KVM_PTE_LEAF_ATTR_HI_S2_DBM) && !stage2_pte_writeable(pte))
> +		return;
> +
> +	/* Clear DBM walk is not shared, update */
> +	if (!kvm_pgtable_walk_shared(ctx)) {
> +		WRITE_ONCE(*ctx->ptep, new);
> +		return;
> +	}

I was wondering if this interferes with the OS dirty tracking (not the
KVM one) but I think that's ok, at least at this point, since the PTE is
already writeable and a fault would have marked the underlying page as
dirty (user_mem_abort() -> kvm_set_pfn_dirty()).

I'm not particularly fond of relying on this but I need to see how it
fits with the rest of the series. IIRC KVM doesn't go around and make
Stage 2 PTEs read-only but rather unmaps them when it changes the
permission of the corresponding Stage 1 VMM mapping.

My personal preference would be to track dirty/clean properly as we do
for stage 1 (e.g. DBM means writeable PTE) but it has some downsides
like the try_to_unmap() code having to retrieve the dirty state via
notifiers.

Anyway, assuming this works correctly, it means that dirty tracking via
DBM for live migration only covers PTEs already made dirty/writeable by
some guest write.
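
To make the gating condition from the quoted hunk concrete, here is a
minimal user-space sketch. It is not the patch's code: the type, the
bit positions and the sketch_* names are assumptions for illustration
only. It sets DBM solely on entries that are already writeable, so an
entry that was never written keeps taking the fault path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the stage 2 descriptor bits; illustration only. */
typedef uint64_t sketch_pte_t;
#define SKETCH_S2AP_W	(UINT64_C(1) << 7)	/* write permission */
#define SKETCH_DBM	(UINT64_C(1) << 51)	/* dirty bit modifier */

/*
 * Set DBM only on entries that are already writeable and return how
 * many entries were updated. Entries without write permission are
 * left untouched.
 */
static size_t sketch_enable_dbm(sketch_pte_t *ptes, size_t n)
{
	size_t updated = 0;

	assert(ptes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!(ptes[i] & SKETCH_S2AP_W))
			continue;	/* never written: not covered by DBM */
		ptes[i] |= SKETCH_DBM;
		updated++;
	}

	return updated;
}
```

The sketch uses plain stores, so it only models a walk that is not
shared with other walkers or with hardware updates.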

> @@ -952,6 +990,11 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
>  	    stage2_pte_executable(new))
>  		mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule);
>  
> +	/* Save the possible hardware dirty info */
> +	if ((ctx->level == KVM_PGTABLE_MAX_LEVELS - 1) &&
> +	    stage2_pte_writeable(ctx->old))
> +		mark_page_dirty(kvm_s2_mmu_to_kvm(pgt->mmu), ctx->addr >> PAGE_SHIFT);
> +
>  	stage2_make_pte(ctx, new);

Isn't this racy and potentially losing the dirty state? Or is the 'new'
value guaranteed to have the S2AP[1] bit? For stage 1 we normally make
the page genuinely read-only (clearing DBM) in a cmpxchg loop to
preserve the dirty state (see ptep_set_wrprotect()).
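
As a rough illustration of the cmpxchg-loop pattern referred to above,
the following user-space sketch uses C11 atomics. It is not the kernel's
ptep_set_wrprotect(): the bit layout and the sketch_* names are
hypothetical, and the model assumes that DBM marks an entry as logically
writeable while the write-permission bit doubles as the hardware dirty
indication.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration only; not kernel definitions. */
typedef uint64_t sketch_pte_t;
#define SKETCH_PTE_W		(UINT64_C(1) << 7)	/* set by "hardware" on write */
#define SKETCH_PTE_DBM		(UINT64_C(1) << 51)	/* logically writeable */
#define SKETCH_PTE_SW_DIRTY	(UINT64_C(1) << 55)	/* software dirty bit */

/*
 * Make the entry genuinely read-only without losing a dirty indication
 * that another agent may set concurrently. Returns true if the entry
 * ended up marked dirty.
 */
static bool sketch_wrprotect(_Atomic sketch_pte_t *ptep)
{
	sketch_pte_t old, new;

	assert(ptep != NULL);
	old = atomic_load(ptep);

	do {
		new = old;
		/* Fold a hardware-dirty state into the software dirty bit. */
		if ((old & SKETCH_PTE_DBM) && (old & SKETCH_PTE_W))
			new |= SKETCH_PTE_SW_DIRTY;
		/* Clear both bits so the entry is genuinely read-only. */
		new &= ~(SKETCH_PTE_DBM | SKETCH_PTE_W);
		/* On failure 'old' is refreshed and the loop recomputes 'new'. */
	} while (!atomic_compare_exchange_weak(ptep, &old, new));

	return (new & SKETCH_PTE_SW_DIRTY) != 0;
}
```

If the entry changes between the load and the exchange, the exchange
fails and the new value is recomputed from the refreshed entry, which is
the property a plain read followed by a separate write cannot provide.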

-- 
Catalin
