Re: [PATCH v5 18/43] arm64: RME: Handle realm enter/exit

linux-coco.lists.linux.dev archive mirror
 help / color / mirror / Atom feed

From: Suzuki K Poulose <suzuki.poulose@arm.com>
To: Steven Price <steven.price@arm.com>,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	James Morse <james.morse@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Joey Gouly <joey.gouly@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Fuad Tabba <tabba@google.com>,
	linux-coco@lists.linux.dev,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
	Gavin Shan <gshan@redhat.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	Alper Gun <alpergun@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>
Subject: Re: [PATCH v5 18/43] arm64: RME: Handle realm enter/exit
Date: Fri, 29 Nov 2024 13:45:25 +0000	[thread overview]
Message-ID: <791e8c32-83fb-442c-9664-4b5f2f9c09bf@arm.com> (raw)
In-Reply-To: <7d1d4893-f798-4e44-aad0-1d0071e30b05@arm.com>

Hi Steven

On 29/11/2024 12:18, Steven Price wrote:
> Hi Suzuki,
> 
> Sorry for the very slow response to this. Coming back to this I'm having
> doubts, see below.
> 
> On 17/10/2024 14:00, Suzuki K Poulose wrote:
>> On 04/10/2024 16:27, Steven Price wrote:
>>> Entering a realm is done using a SMC call to the RMM. On exit the
>>> exit-codes need to be handled slightly differently to the normal KVM
>>> path so define our own functions for realm enter/exit and hook them
>>> in if the guest is a realm guest.
>>>
>>> Signed-off-by: Steven Price <steven.price@arm.com>
> ...
>>> diff --git a/arch/arm64/kvm/rme-exit.c b/arch/arm64/kvm/rme-exit.c
>>> new file mode 100644
>>> index 000000000000..e96ea308212c
>>> --- /dev/null
>>> +++ b/arch/arm64/kvm/rme-exit.c
> ...
>>> +static int rec_exit_ripas_change(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct realm *realm = &kvm->arch.realm;
>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>> +    unsigned long base = rec->run->exit.ripas_base;
>>> +    unsigned long top = rec->run->exit.ripas_top;
>>> +    unsigned long ripas = rec->run->exit.ripas_value;
>>> +    unsigned long top_ipa;
>>> +    int ret;
>>> +
>>> +    if (!realm_is_addr_protected(realm, base) ||
>>> +        !realm_is_addr_protected(realm, top - 1)) {
>>> +        kvm_err("Invalid RIPAS_CHANGE for %#lx - %#lx, ripas: %#lx\n",
>>> +            base, top, ripas);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_cache,
>>> +                   kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
>>
>> I think we also need to filter the request for RIPAS_RAM, by consulting
>> if the "range" is backed by a memslot or not. If they are not, we should
>> reject the request with a response flag set in run.enter.flags.
> 
> It's an interesting API question. At the moment there is no requirement
> to have an active memslot to set the RIPAS - this is true both during
> the setup by the VMM and at run time.
> 
> In theory a VMM can create/destroy memslots while the guest is running.
> So absense of a memslot doesn't actually imply that the RIPAS change

Agreed. Whether an IPA range may be used as RAM is a decision that the
VMM must make. So, we could give the VMM a chance to respond to this
request before we (KVM) make the RTT changes.

> should be rejected. Obviously with realms this is tricky because when
> destroying a memslot that's in use KVM would rip those pages out from
> the guest and it would require guest cooperation to restore those pages
> (transition to RIPAS_EMPTY and back to RIPAS_RAM). But it's not
> something that has been prohibited so far.

True, and it shouldn't be prohibited. If the Host wants to take away a
memslot it must be able to do that. But if it wants to do that in
good faith with the Realm, there must have been some communication
(e.g., virtio-mem ?) between the Host and the Realm and as long as the
Realm knows not to trust the contents on that region it could be 
recovered without a transition to EMPTY.

e.g. From RIPAS_DESTROYED => RIPAS_RAM with RSI_SET_IPA_STATE(... 
CHANGE_DESTROYED).


> 
> On the other hand this is a clear way for a (malicious/buggy) guest to
> use a fair bit of RAM by transitioning to RIPAS_RAM (sparse) pages not
> in a memslot and forcing KVM to allocate the RTT pages to delegate to
> the RMM. But we do exit to the VMM, so this is solvable in the VMM (by
> killing a misbehaving guest). The number of pages this would consume per
> exit is also fairly small.

Correct. If the VMM has no intention to provide memory at a given IPA
range, KVM shouldn't report RSI_ACCEPT to the Realm and the Realm later
gets a stage2 fault that cannot be serviced by KVM.

> 
> So my instinct is that we shouldn't impose that requirement.

I think we may be able to fix this by letting the VMM ACCEPT or REJECT
a given RIPAS_RAM transition request. That way, KVM isn't playing by
the rules set by the VMM and whether the VMM wants to trick the Realm
or play by the rules is upto it.


> 
> Any thoughts?
> 
>> As for EMPTY requests, if the guest wants to explicitly mark any range
>> as EMPTY, it doesn't matter, as long as it is within the protected IPA.
>> (even though they may be EMPTY in the first place).
>>
>>> +    write_lock(&kvm->mmu_lock);
>>> +    ret = realm_set_ipa_state(vcpu, base, top, ripas, &top_ipa);
>>> +    write_unlock(&kvm->mmu_lock);
>>> +
>>> +    WARN(ret && ret != -ENOMEM,
>>> +         "Unable to satisfy RIPAS_CHANGE for %#lx - %#lx, ripas:
>>> %#lx\n",
>>> +         base, top, ripas);
>>> +
>>> +    /* Exit to VMM to complete the change */
>>> +    kvm_prepare_memory_fault_exit(vcpu, base, top_ipa - base, false,
>>> false,
>>> +                      ripas == RMI_RAM);
>>
>> Again this may only be need if the range is backed by a memslot ?
>> Otherwise the VMM has nothing to do.
> 
> Assuming the above, then the VMM would be the one to kill a misbehaving
> guest, so would need a notification.

May be we could reverse the order of operations by delaying the 
realm_set_ipa_state() to occur on VMMs request from the memory_fault_exit.


Suzuki

> 
> Thanks,
> Steve
> 
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void update_arch_timer_irq_lines(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>> +
>>> +    __vcpu_sys_reg(vcpu, CNTV_CTL_EL0) = rec->run->exit.cntv_ctl;
>>> +    __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0) = rec->run->exit.cntv_cval;
>>> +    __vcpu_sys_reg(vcpu, CNTP_CTL_EL0) = rec->run->exit.cntp_ctl;
>>> +    __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0) = rec->run->exit.cntp_cval;
>>> +
>>> +    kvm_realm_timers_update(vcpu);
>>> +}
>>> +
>>> +/*
>>> + * Return > 0 to return to guest, < 0 on error, 0 (and set
>>> exit_reason) on
>>> + * proper exit to userspace.
>>> + */
>>> +int handle_rec_exit(struct kvm_vcpu *vcpu, int rec_run_ret)
>>> +{
>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>> +    u8 esr_ec = ESR_ELx_EC(rec->run->exit.esr);
>>> +    unsigned long status, index;
>>> +
>>> +    status = RMI_RETURN_STATUS(rec_run_ret);
>>> +    index = RMI_RETURN_INDEX(rec_run_ret);
>>> +
>>> +    /*
>>> +     * If a PSCI_SYSTEM_OFF request raced with a vcpu executing, we
>>> might
>>> +     * see the following status code and index indicating an attempt
>>> to run
>>> +     * a REC when the RD state is SYSTEM_OFF.  In this case, we just
>>> need to
>>> +     * return to user space which can deal with the system event or
>>> will try
>>> +     * to run the KVM VCPU again, at which point we will no longer
>>> attempt
>>> +     * to enter the Realm because we will have a sleep request
>>> pending on
>>> +     * the VCPU as a result of KVM's PSCI handling.
>>> +     */
>>> +    if (status == RMI_ERROR_REALM && index == 1) {
>>> +        vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (rec_run_ret)
>>> +        return -ENXIO;
>>> +
>>> +    vcpu->arch.fault.esr_el2 = rec->run->exit.esr;
>>> +    vcpu->arch.fault.far_el2 = rec->run->exit.far;
>>> +    vcpu->arch.fault.hpfar_el2 = rec->run->exit.hpfar;
>>> +
>>> +    update_arch_timer_irq_lines(vcpu);
>>> +
>>> +    /* Reset the emulation flags for the next run of the REC */
>>> +    rec->run->enter.flags = 0;
>>> +
>>> +    switch (rec->run->exit.exit_reason) {
>>> +    case RMI_EXIT_SYNC:
>>> +        return rec_exit_handlers[esr_ec](vcpu);
>>> +    case RMI_EXIT_IRQ:
>>> +    case RMI_EXIT_FIQ:
>>> +        return 1;
>>> +    case RMI_EXIT_PSCI:
>>> +        return rec_exit_psci(vcpu);
>>> +    case RMI_EXIT_RIPAS_CHANGE:
>>> +        return rec_exit_ripas_change(vcpu);
>>> +    }
>>> +
>>> +    kvm_pr_unimpl("Unsupported exit reason: %u\n",
>>> +              rec->run->exit.exit_reason);
>>> +    vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>>> +    return 0;
>>> +}
>>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>>> index 1fa9991d708b..4c0751231810 100644
>>> --- a/arch/arm64/kvm/rme.c
>>> +++ b/arch/arm64/kvm/rme.c
>>> @@ -899,6 +899,25 @@ void kvm_destroy_realm(struct kvm *kvm)
>>>        kvm_free_stage2_pgd(&kvm->arch.mmu);
>>>    }
>>>    +int kvm_rec_enter(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>> +
>>> +    switch (rec->run->exit.exit_reason) {
>>> +    case RMI_EXIT_HOST_CALL:
>>> +    case RMI_EXIT_PSCI:
>>> +        for (int i = 0; i < REC_RUN_GPRS; i++)
>>> +            rec->run->enter.gprs[i] = vcpu_get_reg(vcpu, i);
>>> +        break;
>>> +    }
>>
>> As mentioned in the patch following (MMIO emulation support), we may be
>> able to do this unconditionally for all REC entries, to cover ourselves
>> from missing out other cases. The RMM is in charge of taking the
>> appropriate action anyways to copy the results back.
>>
>> Suzuki
>>
>>> +
>>> +    if (kvm_realm_state(vcpu->kvm) != REALM_STATE_ACTIVE)
>>> +        return -EINVAL;
>>> +
>>> +    return rmi_rec_enter(virt_to_phys(rec->rec_page),
>>> +                 virt_to_phys(rec->run));
>>> +}
>>> +
>>>    static void free_rec_aux(struct page **aux_pages,
>>>                 unsigned int num_aux)
>>>    {
>

next prev parent reply	other threads:[~2024-11-29 13:45 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-04 15:27 [PATCH v5 00/43] arm64: Support for Arm CCA in KVM Steven Price
2024-10-04 15:27 ` [PATCH v5 01/43] KVM: Prepare for handling only shared mappings in mmu_notifier events Steven Price
2024-10-04 15:27 ` [PATCH v5 02/43] kvm: arm64: pgtable: Track the number of pages in the entry level Steven Price
2024-10-23  4:03   ` Gavin Shan
2024-10-23 14:35     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 03/43] kvm: arm64: Include kvm_emulate.h in kvm/arm_psci.h Steven Price
2024-10-04 15:27 ` [PATCH v5 04/43] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
2024-10-24 14:17   ` Aneesh Kumar K.V
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 05/43] arm64: RME: Add SMC definitions for calling the RMM Steven Price
2024-10-07  8:54   ` Suzuki K Poulose
2024-10-25  6:37   ` Gavin Shan
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 06/43] arm64: RME: Add wrappers for RMI calls Steven Price
2024-10-25  7:03   ` Gavin Shan
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 07/43] arm64: RME: Check for RME support at KVM init Steven Price
2024-10-07 10:34   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 08/43] arm64: RME: Define the user ABI Steven Price
2024-10-04 15:27 ` [PATCH v5 09/43] arm64: RME: ioctls to create and configure realms Steven Price
2024-10-08 16:31   ` Suzuki K Poulose
2024-10-30  7:55   ` Aneesh Kumar K.V
2024-11-01 16:22     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 10/43] kvm: arm64: Expose debug HW register numbers for Realm Steven Price
2024-10-04 15:27 ` [PATCH v5 11/43] arm64: kvm: Allow passing machine type in KVM creation Steven Price
2024-10-04 15:27 ` [PATCH v5 12/43] arm64: RME: Keep a spare page delegated to the RMM Steven Price
2024-10-04 15:27 ` [PATCH v5 13/43] arm64: RME: RTT tear down Steven Price
2024-10-15 11:25   ` Suzuki K Poulose
2024-11-01 16:35     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 14/43] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
2024-10-15 12:48   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 15/43] arm64: RME: Support for the VGIC in realms Steven Price
2024-10-15 13:02   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 16/43] KVM: arm64: Support timers in realm RECs Steven Price
2024-10-04 15:27 ` [PATCH v5 17/43] arm64: RME: Allow VMM to set RIPAS Steven Price
2024-10-16  8:46   ` Suzuki K Poulose
2024-10-30  7:52     ` Aneesh Kumar K.V
2024-10-04 15:27 ` [PATCH v5 18/43] arm64: RME: Handle realm enter/exit Steven Price
2024-10-17 13:00   ` Suzuki K Poulose
2024-11-29 12:18     ` Steven Price
2024-11-29 13:45       ` Suzuki K Poulose [this message]
2024-11-29 14:55         ` Steven Price
2024-10-04 15:27 ` [PATCH v5 19/43] KVM: arm64: Handle realm MMIO emulation Steven Price
2024-10-07  4:31   ` Aneesh Kumar K.V
2024-10-07 10:22     ` Steven Price
2024-10-17 11:59   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 20/43] arm64: RME: Allow populating initial contents Steven Price
2024-10-04 15:27 ` [PATCH v5 21/43] arm64: RME: Runtime faulting of memory Steven Price
2024-10-22  5:36   ` Aneesh Kumar K.V
2024-10-23  5:50   ` Aneesh Kumar K.V
2024-10-24 13:51   ` Suzuki K Poulose
2024-10-24 14:30   ` Aneesh Kumar K.V
2024-10-04 15:27 ` [PATCH v5 22/43] KVM: arm64: Handle realm VCPU load Steven Price
2024-10-04 15:27 ` [PATCH v5 23/43] KVM: arm64: Validate register access for a Realm VM Steven Price
2024-10-17 15:32   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 24/43] KVM: arm64: Handle Realm PSCI requests Steven Price
2024-10-04 15:27 ` [PATCH v5 25/43] KVM: arm64: WARN on injected undef exceptions Steven Price
2024-10-04 15:27 ` [PATCH v5 26/43] arm64: Don't expose stolen time for realm guests Steven Price
2024-10-18 13:17   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 27/43] arm64: rme: allow userspace to inject aborts Steven Price
2024-10-04 15:27 ` [PATCH v5 28/43] arm64: rme: support RSI_HOST_CALL Steven Price
2024-10-04 15:27 ` [PATCH v5 29/43] arm64: rme: Allow checking SVE on VM instance Steven Price
2024-10-04 15:27 ` [PATCH v5 30/43] arm64: RME: Always use 4k pages for realms Steven Price
2024-10-04 15:27 ` [PATCH v5 31/43] arm64: rme: Prevent Device mappings for Realms Steven Price
2024-10-18 13:30   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 32/43] arm_pmu: Provide a mechanism for disabling the physical IRQ Steven Price
2024-10-04 15:27 ` [PATCH v5 33/43] arm64: rme: Enable PMU support with a realm guest Steven Price
2024-10-04 15:27 ` [PATCH v5 34/43] kvm: rme: Hide KVM_CAP_READONLY_MEM for realm guests Steven Price
2024-10-04 15:27 ` [PATCH v5 35/43] arm64: RME: Propagate number of breakpoints and watchpoints to userspace Steven Price
2024-10-04 15:27 ` [PATCH v5 36/43] arm64: RME: Set breakpoint parameters through SET_ONE_REG Steven Price
2024-10-04 15:27 ` [PATCH v5 37/43] arm64: RME: Initialize PMCR.N with number counter supported by RMM Steven Price
2024-10-04 15:27 ` [PATCH v5 38/43] arm64: RME: Propagate max SVE vector length from RMM Steven Price
2024-10-04 15:28 ` [PATCH v5 39/43] arm64: RME: Configure max SVE vector length for a Realm Steven Price
2024-10-04 15:28 ` [PATCH v5 40/43] arm64: RME: Provide register list for unfinalized RME RECs Steven Price
2024-10-04 15:28 ` [PATCH v5 41/43] arm64: RME: Provide accurate register list Steven Price
2024-10-04 15:28 ` [PATCH v5 42/43] arm64: kvm: Expose support for private memory Steven Price
2024-10-09  7:03   ` kernel test robot
2024-10-04 15:28 ` [PATCH v5 43/43] KVM: arm64: Allow activating realms Steven Price
2024-12-02  5:10 ` [PATCH v5 00/43] arm64: Support for Arm CCA in KVM Itaru Kitayama
2024-12-02  8:54   ` Steven Price
2024-12-02 10:26     ` Jean-Philippe Brucker
2024-12-02 10:42       ` Itaru Kitayama

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=791e8c32-83fb-442c-9664-4b5f2f9c09bf@arm.com \
    --to=suzuki.poulose@arm.com \
    --cc=alexandru.elisei@arm.com \
    --cc=alpergun@google.com \
    --cc=aneesh.kumar@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@arm.com \
    --cc=gankulkarni@os.amperecomputing.com \
    --cc=gshan@redhat.com \
    --cc=james.morse@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=sdonthineni@nvidia.com \
    --cc=steven.price@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).