public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Gavin Shan <gshan@redhat.com>
To: Steven Price <steven.price@arm.com>,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	James Morse <james.morse@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Joey Gouly <joey.gouly@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Fuad Tabba <tabba@google.com>,
	linux-coco@lists.linux.dev,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	Alper Gun <alpergun@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>
Subject: Re: [PATCH v6 19/43] arm64: RME: Allow populating initial contents
Date: Thu, 30 Jan 2025 14:38:22 +1000	[thread overview]
Message-ID: <68e545c8-57a3-4567-9e96-b46066cf6cee@redhat.com> (raw)
In-Reply-To: <20241212155610.76522-20-steven.price@arm.com>

On 12/13/24 1:55 AM, Steven Price wrote:
> The VMM needs to populate the realm with some data before starting (e.g.
> a kernel and initrd). This is measured by the RMM and used as part of
> the attestation later on.
> 
> For now only 4k mappings are supported, future work may add support for
> larger mappings.
> 
> Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> Changes since v5:
>   * Refactor to use PFNs rather than tracking struct page in
>     realm_create_protected_data_page().
>   * Pull changes from a later patch (in the v5 series) for accessing
>     pages from a guest memfd.
>   * Do the populate in chunks to avoid holding locks for too long and
>     triggering RCU stall warnings.
> ---
>   arch/arm64/kvm/rme.c | 243 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 243 insertions(+)
> 
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 22f0c74455af..d4561e368cd5 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -4,6 +4,7 @@
>    */
>   
>   #include <linux/kvm_host.h>
> +#include <linux/hugetlb.h>
>   

This wouldn't be needed since the huge pages, especially hugetlb part, isn't
supported yet.

>   #include <asm/kvm_emulate.h>
>   #include <asm/kvm_mmu.h>
> @@ -545,6 +546,236 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long start, u64 size,
>   		realm_unmap_private_range(kvm, start, end);
>   }
>   
> +static int realm_create_protected_data_page(struct realm *realm,
> +					    unsigned long ipa,
> +					    kvm_pfn_t dst_pfn,
> +					    kvm_pfn_t src_pfn,
> +					    unsigned long flags)
> +{
> +	phys_addr_t dst_phys, src_phys;
> +	int ret;
> +
> +	dst_phys = __pfn_to_phys(dst_pfn);
> +	src_phys = __pfn_to_phys(src_pfn);
> +
> +	if (rmi_granule_delegate(dst_phys))
> +		return -ENXIO;
> +
> +	ret = rmi_data_create(virt_to_phys(realm->rd), dst_phys, ipa, src_phys,
> +			      flags);
> +
> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +		/* Create missing RTTs and retry */
> +		int level = RMI_RETURN_INDEX(ret);
> +
> +		ret = realm_create_rtt_levels(realm, ipa, level,
> +					      RMM_RTT_MAX_LEVEL, NULL);
> +		if (ret)
> +			goto err;
> +
> +		ret = rmi_data_create(virt_to_phys(realm->rd), dst_phys, ipa,
> +				      src_phys, flags);
> +	}
> +
> +	if (!ret)
> +		return 0;
> +
> +err:
> +	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
> +		/* Page can't be returned to NS world so is lost */
> +		get_page(pfn_to_page(dst_pfn));
> +	}
> +	return -ENXIO;
> +}
> +
> +static int fold_rtt(struct realm *realm, unsigned long addr, int level)
> +{
> +	phys_addr_t rtt_addr;
> +	int ret;
> +
> +	ret = realm_rtt_fold(realm, addr, level + 1, &rtt_addr);
> +	if (ret)
> +		return ret;
> +
> +	free_delegated_granule(rtt_addr);
> +
> +	return 0;
> +}
> +
> +static int populate_par_region(struct kvm *kvm,
> +			       phys_addr_t ipa_base,
> +			       phys_addr_t ipa_end,
> +			       u32 flags)
> +{

At the first glance, I was wandering what's meant by 'par' in the function name.
It turns to be a 2MB region and I guess it represents 'part'. I think this may
be renamed to populate_sub_region() or populate_region() directly.

> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_memory_slot *memslot;
> +	gfn_t base_gfn, end_gfn;
> +	int idx;
> +	phys_addr_t ipa;
> +	int ret = 0;
> +	unsigned long data_flags = 0;
> +
> +	base_gfn = gpa_to_gfn(ipa_base);
> +	end_gfn = gpa_to_gfn(ipa_end);
> +
> +	if (flags & KVM_ARM_RME_POPULATE_FLAGS_MEASURE)
> +		data_flags = RMI_MEASURE_CONTENT;
> +

The 'data_flags' can be sorted out by its caller kvm_populate_realm(), and passed
to populate_par_region(). In that way, we needn't to figure out 'data_flags' in
every call to populate_par_region().

> +	idx = srcu_read_lock(&kvm->srcu);
> +	memslot = gfn_to_memslot(kvm, base_gfn);
> +	if (!memslot) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	/* We require the region to be contained within a single memslot */
> +	if (memslot->base_gfn + memslot->npages < end_gfn) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	if (!kvm_slot_can_be_private(memslot)) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	write_lock(&kvm->mmu_lock);
> +
> +	ipa = ipa_base;
> +	while (ipa < ipa_end) {
> +		struct vm_area_struct *vma;
> +		unsigned long map_size;
> +		unsigned int vma_shift;
> +		unsigned long offset;
> +		unsigned long hva;
> +		struct page *page;
> +		bool writeable;
> +		kvm_pfn_t pfn;
> +		int level, i;
> +
> +		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
> +		vma = vma_lookup(current->mm, hva);
> +		if (!vma) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		/* FIXME: Currently we only support 4k sized mappings */
> +		vma_shift = PAGE_SHIFT;
> +
> +		map_size = 1 << vma_shift;
> +
> +		ipa = ALIGN_DOWN(ipa, map_size);
> +

The blank lines in above 5 lines can be dropped :)

> +		switch (map_size) {
> +		case RMM_L2_BLOCK_SIZE:
> +			level = 2;
> +			break;
> +		case PAGE_SIZE:
> +			level = 3;
> +			break;
> +		default:
> +			WARN_ONCE(1, "Unsupported vma_shift %d", vma_shift);
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		pfn = __kvm_faultin_pfn(memslot, gpa_to_gfn(ipa), FOLL_WRITE,
> +					&writeable, &page);
> +
> +		if (is_error_pfn(pfn)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (level < RMM_RTT_MAX_LEVEL) {
> +			/*
> +			 * A temporary RTT is needed during the map, precreate
> +			 * it, however if there is an error (e.g. missing
> +			 * parent tables) this will be handled in the
> +			 * realm_create_protected_data_page() call.
> +			 */
> +			realm_create_rtt_levels(realm, ipa, level,
> +						RMM_RTT_MAX_LEVEL, NULL);
> +		}
> +

This block of code to create the temporary RTT can be removed. With it removed,
we're going to rely on realm_create_protected_data_page() to create the needed
RTT in its failing path. If the temporary RTT has been existing, the function
call to realm_create_rtt_levels() doesn't nothing except multiple RMI calls are
issued. RMI calls aren't cheap and it causes performance lost if you agree.

> +		for (offset = 0, i = 0; offset < map_size && !ret;
> +		     offset += PAGE_SIZE, i++) {
> +			phys_addr_t page_ipa = ipa + offset;
> +			kvm_pfn_t priv_pfn;
> +			struct page *gmem_page;
> +			int order;
> +
> +			ret = kvm_gmem_get_pfn(kvm, memslot,
> +					       page_ipa >> PAGE_SHIFT,
> +					       &priv_pfn, &gmem_page, &order);
> +			if (ret)
> +				break;
> +
> +			ret = realm_create_protected_data_page(realm, page_ipa,
> +							       priv_pfn,
> +							       pfn + i,
> +							       data_flags);
> +		}
> +
> +		kvm_release_faultin_page(kvm, page, false, false);
> +
> +		if (ret)
> +			break;
> +
> +		if (level == 2)
> +			fold_rtt(realm, ipa, level);
> +
> +		ipa += map_size;
> +	}
> +
> +	write_unlock(&kvm->mmu_lock);
> +
> +out:
> +	srcu_read_unlock(&kvm->srcu, idx);
> +	return ret;
> +}
> +
> +static int kvm_populate_realm(struct kvm *kvm,
> +			      struct kvm_cap_arm_rme_populate_realm_args *args)
> +{
> +	phys_addr_t ipa_base, ipa_end;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> +		return -EINVAL;
> +
> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
> +		return -EINVAL;
> +
> +	if (args->flags & ~RMI_MEASURE_CONTENT)
> +		return -EINVAL;
> +
> +	ipa_base = args->populate_ipa_base;
> +	ipa_end = ipa_base + args->populate_ipa_size;
> +
> +	if (ipa_end < ipa_base)
> +		return -EINVAL;
> +
> +	/*
> +	 * Perform the populate in parts to ensure locks are not held for too
> +	 * long
> +	 */
> +	while (ipa_base < ipa_end) {
> +		phys_addr_t end = min(ipa_end, ipa_base + SZ_2M);
> +
> +		int ret = populate_par_region(kvm, ipa_base, end,
> +					      args->flags);
> +
> +		if (ret)
> +			return ret;
> +

cond_resched() seems nice to have here so that those pending tasks can
be run immediately.

> +		ipa_base = end;
> +	}
> +
> +	return 0;
> +}
> +
>   int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>   			unsigned long start,
>   			unsigned long end,
> @@ -794,6 +1025,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>   		r = kvm_init_ipa_range_realm(kvm, &args);
>   		break;
>   	}
> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
> +		struct kvm_cap_arm_rme_populate_realm_args args;
> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
> +
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +
> +		r = kvm_populate_realm(kvm, &args);
> +		break;
> +	}
>   	default:
>   		r = -EINVAL;
>   		break;

Thanks,
Gavin


  reply	other threads:[~2025-01-30  4:38 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12 15:55 [PATCH v6 00/43] arm64: Support for Arm CCA in KVM Steven Price
2024-12-12 15:55 ` [PATCH v6 01/43] KVM: Prepare for handling only shared mappings in mmu_notifier events Steven Price
2024-12-12 15:55 ` [PATCH v6 02/43] kvm: arm64: Include kvm_emulate.h in kvm/arm_psci.h Steven Price
2024-12-12 15:55 ` [PATCH v6 03/43] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
2024-12-16 16:45   ` Suzuki K Poulose
2024-12-12 15:55 ` [PATCH v6 04/43] arm64: RME: Add SMC definitions for calling the RMM Steven Price
2024-12-16 17:51   ` Suzuki K Poulose
2024-12-12 15:55 ` [PATCH v6 05/43] arm64: RME: Add wrappers for RMI calls Steven Price
2024-12-12 15:55 ` [PATCH v6 06/43] arm64: RME: Check for RME support at KVM init Steven Price
2024-12-19  5:44   ` Aneesh Kumar K.V
2025-01-28  5:47     ` Gavin Shan
2025-01-29  3:57   ` Gavin Shan
2025-01-29 15:24     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 07/43] arm64: RME: Define the user ABI Steven Price
2025-01-28 23:51   ` Gavin Shan
2025-01-29 16:31     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 08/43] arm64: RME: ioctls to create and configure realms Steven Price
2025-01-29  0:43   ` Gavin Shan
2025-01-30 11:57     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 09/43] kvm: arm64: Expose debug HW register numbers for Realm Steven Price
2024-12-12 15:55 ` [PATCH v6 10/43] arm64: kvm: Allow passing machine type in KVM creation Steven Price
2025-01-29  4:07   ` Gavin Shan
2025-01-30 14:14     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 11/43] arm64: RME: RTT tear down Steven Price
2025-01-29 23:06   ` Gavin Shan
2024-12-12 15:55 ` [PATCH v6 12/43] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
2024-12-19  5:37   ` Aneesh Kumar K.V
2025-01-29  4:50   ` Gavin Shan
2025-01-30 16:40     ` Steven Price
2025-02-03 11:18       ` Suzuki K Poulose
2025-02-03 16:34         ` Steven Price
2024-12-12 15:55 ` [PATCH v6 13/43] KVM: arm64: vgic: Provide helper for number of list registers Steven Price
2024-12-12 15:55 ` [PATCH v6 14/43] arm64: RME: Support for the VGIC in realms Steven Price
2024-12-12 15:55 ` [PATCH v6 15/43] KVM: arm64: Support timers in realm RECs Steven Price
2024-12-12 15:55 ` [PATCH v6 16/43] arm64: RME: Allow VMM to set RIPAS Steven Price
2025-01-29 23:25   ` Gavin Shan
2025-01-30 16:56     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 17/43] arm64: RME: Handle realm enter/exit Steven Price
2025-01-29 23:41   ` Gavin Shan
2025-02-03 16:34     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 18/43] KVM: arm64: Handle realm MMIO emulation Steven Price
2024-12-12 15:55 ` [PATCH v6 19/43] arm64: RME: Allow populating initial contents Steven Price
2025-01-30  4:38   ` Gavin Shan [this message]
2025-02-03 16:52     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 20/43] arm64: RME: Runtime faulting of memory Steven Price
2025-01-30  5:22   ` Gavin Shan
2025-02-06  4:33     ` Aneesh Kumar K.V
2025-02-07 17:04     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 21/43] KVM: arm64: Handle realm VCPU load Steven Price
2024-12-12 15:55 ` [PATCH v6 22/43] KVM: arm64: Validate register access for a Realm VM Steven Price
2025-02-02  1:22   ` Gavin Shan
2025-02-07 17:04     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 23/43] KVM: arm64: Handle Realm PSCI requests Steven Price
2025-02-02  2:06   ` Gavin Shan
2024-12-12 15:55 ` [PATCH v6 24/43] KVM: arm64: WARN on injected undef exceptions Steven Price
2025-02-02  2:11   ` Gavin Shan
2024-12-12 15:55 ` [PATCH v6 25/43] arm64: Don't expose stolen time for realm guests Steven Price
2025-02-02  2:15   ` Gavin Shan
2025-02-07 17:05     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 26/43] arm64: rme: allow userspace to inject aborts Steven Price
2024-12-12 15:55 ` [PATCH v6 27/43] arm64: rme: support RSI_HOST_CALL Steven Price
2025-02-02  6:41   ` Gavin Shan
2025-02-07 17:05     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 28/43] arm64: rme: Allow checking SVE on VM instance Steven Price
2025-02-02  6:00   ` Gavin Shan
2025-02-07 17:05     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 29/43] arm64: RME: Always use 4k pages for realms Steven Price
2025-02-02  6:52   ` Gavin Shan
2025-02-07 17:05     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 30/43] arm64: rme: Prevent Device mappings for Realms Steven Price
2025-02-02  7:12   ` Gavin Shan
2025-02-07 17:08     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 31/43] arm_pmu: Provide a mechanism for disabling the physical IRQ Steven Price
2024-12-12 15:55 ` [PATCH v6 32/43] arm64: rme: Enable PMU support with a realm guest Steven Price
2024-12-12 16:54   ` Joey Gouly
2025-02-10 11:58     ` Steven Price
2024-12-12 15:55 ` [PATCH v6 33/43] kvm: rme: Hide KVM_CAP_READONLY_MEM for realm guests Steven Price
2024-12-12 15:55 ` [PATCH v6 34/43] arm64: RME: Propagate number of breakpoints and watchpoints to userspace Steven Price
2025-02-02 23:15   ` Gavin Shan
2025-02-02 23:17     ` Gavin Shan
2024-12-12 15:56 ` [PATCH v6 35/43] arm64: RME: Set breakpoint parameters through SET_ONE_REG Steven Price
2024-12-12 15:56 ` [PATCH v6 36/43] arm64: RME: Initialize PMCR.N with number counter supported by RMM Steven Price
2024-12-12 15:56 ` [PATCH v6 37/43] arm64: RME: Propagate max SVE vector length from RMM Steven Price
2024-12-12 15:56 ` [PATCH v6 38/43] arm64: RME: Configure max SVE vector length for a Realm Steven Price
2025-02-02 23:26   ` Gavin Shan
2024-12-12 15:56 ` [PATCH v6 39/43] arm64: RME: Provide register list for unfinalized RME RECs Steven Price
2024-12-12 15:56 ` [PATCH v6 40/43] arm64: RME: Provide accurate register list Steven Price
2024-12-12 15:56 ` [PATCH v6 41/43] arm64: kvm: Expose support for private memory Steven Price
2024-12-12 15:56 ` [PATCH v6 42/43] KVM: arm64: Expose KVM_ARM_VCPU_REC to user space Steven Price
2024-12-12 15:56 ` [PATCH v6 43/43] KVM: arm64: Allow activating realms Steven Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68e545c8-57a3-4567-9e96-b46066cf6cee@redhat.com \
    --to=gshan@redhat.com \
    --cc=alexandru.elisei@arm.com \
    --cc=alpergun@google.com \
    --cc=aneesh.kumar@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@arm.com \
    --cc=gankulkarni@os.amperecomputing.com \
    --cc=james.morse@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=sdonthineni@nvidia.com \
    --cc=steven.price@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox