From: Steven Price <steven.price@arm.com>
To: Suzuki K Poulose <suzuki.poulose@arm.com>,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	James Morse <james.morse@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Joey Gouly <joey.gouly@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Fuad Tabba <tabba@google.com>,
	linux-coco@lists.linux.dev,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
	Gavin Shan <gshan@redhat.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	Alper Gun <alpergun@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>
Subject: Re: [PATCH v5 18/43] arm64: RME: Handle realm enter/exit
Date: Fri, 29 Nov 2024 14:55:36 +0000	[thread overview]
Message-ID: <86f36cfd-65b6-4b71-9272-d17e5e41b57c@arm.com> (raw)
In-Reply-To: <791e8c32-83fb-442c-9664-4b5f2f9c09bf@arm.com>

On 29/11/2024 13:45, Suzuki K Poulose wrote:
> Hi Steven
> 
> On 29/11/2024 12:18, Steven Price wrote:
>> Hi Suzuki,
>>
>> Sorry for the very slow response. Coming back to this I'm having
>> doubts, see below.
>>
>> On 17/10/2024 14:00, Suzuki K Poulose wrote:
>>> On 04/10/2024 16:27, Steven Price wrote:
>>>> Entering a realm is done using a SMC call to the RMM. On exit the
>>>> exit-codes need to be handled slightly differently to the normal KVM
>>>> path so define our own functions for realm enter/exit and hook them
>>>> in if the guest is a realm guest.
>>>>
>>>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ...
>>>> diff --git a/arch/arm64/kvm/rme-exit.c b/arch/arm64/kvm/rme-exit.c
>>>> new file mode 100644
>>>> index 000000000000..e96ea308212c
>>>> --- /dev/null
>>>> +++ b/arch/arm64/kvm/rme-exit.c
>> ...
>>>> +static int rec_exit_ripas_change(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    struct kvm *kvm = vcpu->kvm;
>>>> +    struct realm *realm = &kvm->arch.realm;
>>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>>> +    unsigned long base = rec->run->exit.ripas_base;
>>>> +    unsigned long top = rec->run->exit.ripas_top;
>>>> +    unsigned long ripas = rec->run->exit.ripas_value;
>>>> +    unsigned long top_ipa;
>>>> +    int ret;
>>>> +
>>>> +    if (!realm_is_addr_protected(realm, base) ||
>>>> +        !realm_is_addr_protected(realm, top - 1)) {
>>>> +        kvm_err("Invalid RIPAS_CHANGE for %#lx - %#lx, ripas: %#lx\n",
>>>> +            base, top, ripas);
>>>> +        return -EINVAL;
>>>> +    }
>>>> +
>>>> +    kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_cache,
>>>> +                   kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
>>>
>>> I think we also need to filter the request for RIPAS_RAM, by checking
>>> whether the "range" is backed by a memslot or not. If it is not, we
>>> should reject the request with a response flag set in run.enter.flags.
>>
>> It's an interesting API question. At the moment there is no requirement
>> to have an active memslot to set the RIPAS - this is true both during
>> the setup by the VMM and at run time.
>>
>> In theory a VMM can create/destroy memslots while the guest is running.
>> So absence of a memslot doesn't actually imply that the RIPAS change
> 
> Agreed. Whether an IPA range may be used as RAM is a decision that the
> VMM must make. So, we could give the VMM a chance to respond to this
> request before we (KVM) make the RTT changes.
> 
>> should be rejected. Obviously with realms this is tricky because when
>> destroying a memslot that's in use KVM would rip those pages out from
>> the guest and it would require guest cooperation to restore those pages
>> (transition to RIPAS_EMPTY and back to RIPAS_RAM). But it's not
>> something that has been prohibited so far.
> 
> True, and it shouldn't be prohibited. If the Host wants to take away a
> memslot it must be able to do that. But if it wants to do that in
> good faith with the Realm, there must have been some communication
> (e.g., virtio-mem ?) between the Host and the Realm, and as long as the
> Realm knows not to trust the contents of that region, it could be
> recovered without a transition to EMPTY.
> 
> e.g. From RIPAS_DESTROYED => RIPAS_RAM with RSI_SET_IPA_STATE(...
> CHANGE_DESTROYED).

Indeed - I always forget RSI_SET_IPA_STATE has two modes these days.

>>
>> On the other hand this is a clear way for a (malicious/buggy) guest to
>> use up a fair bit of RAM by transitioning (sparse) pages not in a
>> memslot to RIPAS_RAM, forcing KVM to allocate RTT pages to delegate to
>> the RMM. But we do exit to the VMM, so this is solvable in the VMM (by
>> killing a misbehaving guest). The number of pages this would consume
>> per exit is also fairly small.
> 
> Correct. If the VMM has no intention of providing memory at a given IPA
> range, KVM shouldn't report RSI_ACCEPT to the Realm; otherwise the Realm
> later gets a stage-2 fault that cannot be serviced by KVM.
> 
>>
>> So my instinct is that we shouldn't impose that requirement.
> 
> I think we may be able to fix this by letting the VMM ACCEPT or REJECT
> a given RIPAS_RAM transition request. That way, KVM is simply playing
> by the rules set by the VMM, and whether the VMM wants to trick the
> Realm or play fair is up to it.

Sounds good to me.

>>
>> Any thoughts?
>>
>>> As for EMPTY requests, it doesn't matter if the guest wants to
>>> explicitly mark any range as EMPTY, as long as it is within the
>>> protected IPA space (even if the range was EMPTY in the first place).
>>>
>>>> +    write_lock(&kvm->mmu_lock);
>>>> +    ret = realm_set_ipa_state(vcpu, base, top, ripas, &top_ipa);
>>>> +    write_unlock(&kvm->mmu_lock);
>>>> +
>>>> +    WARN(ret && ret != -ENOMEM,
>>>> +         "Unable to satisfy RIPAS_CHANGE for %#lx - %#lx, ripas: %#lx\n",
>>>> +         base, top, ripas);
>>>> +
>>>> +    /* Exit to VMM to complete the change */
>>>> +    kvm_prepare_memory_fault_exit(vcpu, base, top_ipa - base, false, false,
>>>> +                      ripas == RMI_RAM);
>>>
>>> Again this may only be need if the range is backed by a memslot ?
>>> Otherwise the VMM has nothing to do.
>>
>> Assuming the above, then the VMM would be the one to kill a misbehaving
>> guest, so would need a notification.
> 
> Maybe we could reverse the order of operations by delaying the
> realm_set_ipa_state() to occur on the VMM's request from the
> memory_fault_exit.

Ah, good point - moving the RIPAS state set to the entry path makes a
lot of sense. The only negative is that we push the loop that handles
partial RIPAS changes into the KVM entry path - but I don't think that's
a major problem.

Thanks,
Steve

> 
> Suzuki
> 
>>
>> Thanks,
>> Steve
>>
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static void update_arch_timer_irq_lines(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>>> +
>>>> +    __vcpu_sys_reg(vcpu, CNTV_CTL_EL0) = rec->run->exit.cntv_ctl;
>>>> +    __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0) = rec->run->exit.cntv_cval;
>>>> +    __vcpu_sys_reg(vcpu, CNTP_CTL_EL0) = rec->run->exit.cntp_ctl;
>>>> +    __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0) = rec->run->exit.cntp_cval;
>>>> +
>>>> +    kvm_realm_timers_update(vcpu);
>>>> +}
>>>> +
>>>> +/*
>>>> + * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
>>>> + * proper exit to userspace.
>>>> + */
>>>> +int handle_rec_exit(struct kvm_vcpu *vcpu, int rec_run_ret)
>>>> +{
>>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>>> +    u8 esr_ec = ESR_ELx_EC(rec->run->exit.esr);
>>>> +    unsigned long status, index;
>>>> +
>>>> +    status = RMI_RETURN_STATUS(rec_run_ret);
>>>> +    index = RMI_RETURN_INDEX(rec_run_ret);
>>>> +
>>>> +    /*
>>>> +     * If a PSCI_SYSTEM_OFF request raced with a vcpu executing, we might
>>>> +     * see the following status code and index indicating an attempt to
>>>> +     * run a REC when the RD state is SYSTEM_OFF.  In this case, we just
>>>> +     * need to return to user space which can deal with the system event
>>>> +     * or will try to run the KVM VCPU again, at which point we will no
>>>> +     * longer attempt to enter the Realm because we will have a sleep
>>>> +     * request pending on the VCPU as a result of KVM's PSCI handling.
>>>> +     */
>>>> +    if (status == RMI_ERROR_REALM && index == 1) {
>>>> +        vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    if (rec_run_ret)
>>>> +        return -ENXIO;
>>>> +
>>>> +    vcpu->arch.fault.esr_el2 = rec->run->exit.esr;
>>>> +    vcpu->arch.fault.far_el2 = rec->run->exit.far;
>>>> +    vcpu->arch.fault.hpfar_el2 = rec->run->exit.hpfar;
>>>> +
>>>> +    update_arch_timer_irq_lines(vcpu);
>>>> +
>>>> +    /* Reset the emulation flags for the next run of the REC */
>>>> +    rec->run->enter.flags = 0;
>>>> +
>>>> +    switch (rec->run->exit.exit_reason) {
>>>> +    case RMI_EXIT_SYNC:
>>>> +        return rec_exit_handlers[esr_ec](vcpu);
>>>> +    case RMI_EXIT_IRQ:
>>>> +    case RMI_EXIT_FIQ:
>>>> +        return 1;
>>>> +    case RMI_EXIT_PSCI:
>>>> +        return rec_exit_psci(vcpu);
>>>> +    case RMI_EXIT_RIPAS_CHANGE:
>>>> +        return rec_exit_ripas_change(vcpu);
>>>> +    }
>>>> +
>>>> +    kvm_pr_unimpl("Unsupported exit reason: %u\n",
>>>> +              rec->run->exit.exit_reason);
>>>> +    vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>>>> +    return 0;
>>>> +}
>>>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>>>> index 1fa9991d708b..4c0751231810 100644
>>>> --- a/arch/arm64/kvm/rme.c
>>>> +++ b/arch/arm64/kvm/rme.c
>>>> @@ -899,6 +899,25 @@ void kvm_destroy_realm(struct kvm *kvm)
>>>>        kvm_free_stage2_pgd(&kvm->arch.mmu);
>>>>    }
>>>>
>>>> +int kvm_rec_enter(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    struct realm_rec *rec = &vcpu->arch.rec;
>>>> +
>>>> +    switch (rec->run->exit.exit_reason) {
>>>> +    case RMI_EXIT_HOST_CALL:
>>>> +    case RMI_EXIT_PSCI:
>>>> +        for (int i = 0; i < REC_RUN_GPRS; i++)
>>>> +            rec->run->enter.gprs[i] = vcpu_get_reg(vcpu, i);
>>>> +        break;
>>>> +    }
>>>
>>> As mentioned in the following patch (MMIO emulation support), we may be
>>> able to do this unconditionally for all REC entries, to cover ourselves
>>> against missing other cases. The RMM is in charge of taking the
>>> appropriate action anyway to copy the results back.
>>>
>>> Suzuki
>>>
>>>> +
>>>> +    if (kvm_realm_state(vcpu->kvm) != REALM_STATE_ACTIVE)
>>>> +        return -EINVAL;
>>>> +
>>>> +    return rmi_rec_enter(virt_to_phys(rec->rec_page),
>>>> +                 virt_to_phys(rec->run));
>>>> +}
>>>> +
>>>>    static void free_rec_aux(struct page **aux_pages,
>>>>                 unsigned int num_aux)
>>>>    {
>>
> 


Thread overview: 82+ messages
2024-10-04 15:27 [PATCH v5 00/43] arm64: Support for Arm CCA in KVM Steven Price
2024-10-04 15:27 ` [PATCH v5 01/43] KVM: Prepare for handling only shared mappings in mmu_notifier events Steven Price
2024-10-04 15:27 ` [PATCH v5 02/43] kvm: arm64: pgtable: Track the number of pages in the entry level Steven Price
2024-10-23  4:03   ` Gavin Shan
2024-10-23 14:35     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 03/43] kvm: arm64: Include kvm_emulate.h in kvm/arm_psci.h Steven Price
2024-10-04 15:27 ` [PATCH v5 04/43] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
2024-10-24 14:17   ` Aneesh Kumar K.V
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 05/43] arm64: RME: Add SMC definitions for calling the RMM Steven Price
2024-10-07  8:54   ` Suzuki K Poulose
2024-10-25  6:37   ` Gavin Shan
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 06/43] arm64: RME: Add wrappers for RMI calls Steven Price
2024-10-25  7:03   ` Gavin Shan
2024-10-25 13:24     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 07/43] arm64: RME: Check for RME support at KVM init Steven Price
2024-10-07 10:34   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 08/43] arm64: RME: Define the user ABI Steven Price
2024-10-04 15:27 ` [PATCH v5 09/43] arm64: RME: ioctls to create and configure realms Steven Price
2024-10-08 16:31   ` Suzuki K Poulose
2024-10-30  7:55   ` Aneesh Kumar K.V
2024-11-01 16:22     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 10/43] kvm: arm64: Expose debug HW register numbers for Realm Steven Price
2024-10-04 15:27 ` [PATCH v5 11/43] arm64: kvm: Allow passing machine type in KVM creation Steven Price
2024-10-04 15:27 ` [PATCH v5 12/43] arm64: RME: Keep a spare page delegated to the RMM Steven Price
2024-10-04 15:27 ` [PATCH v5 13/43] arm64: RME: RTT tear down Steven Price
2024-10-15 11:25   ` Suzuki K Poulose
2024-11-01 16:35     ` Steven Price
2024-10-04 15:27 ` [PATCH v5 14/43] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
2024-10-15 12:48   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 15/43] arm64: RME: Support for the VGIC in realms Steven Price
2024-10-15 13:02   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 16/43] KVM: arm64: Support timers in realm RECs Steven Price
2024-10-04 15:27 ` [PATCH v5 17/43] arm64: RME: Allow VMM to set RIPAS Steven Price
2024-10-16  8:46   ` Suzuki K Poulose
2024-10-30  7:52     ` Aneesh Kumar K.V
2024-10-04 15:27 ` [PATCH v5 18/43] arm64: RME: Handle realm enter/exit Steven Price
2024-10-17 13:00   ` Suzuki K Poulose
2024-11-29 12:18     ` Steven Price
2024-11-29 13:45       ` Suzuki K Poulose
2024-11-29 14:55         ` Steven Price [this message]
2024-10-04 15:27 ` [PATCH v5 19/43] KVM: arm64: Handle realm MMIO emulation Steven Price
2024-10-07  4:31   ` Aneesh Kumar K.V
2024-10-07 10:22     ` Steven Price
2024-10-17 11:59   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 20/43] arm64: RME: Allow populating initial contents Steven Price
2024-10-04 15:27 ` [PATCH v5 21/43] arm64: RME: Runtime faulting of memory Steven Price
2024-10-22  5:36   ` Aneesh Kumar K.V
2024-10-23  5:50   ` Aneesh Kumar K.V
2024-10-24 13:51   ` Suzuki K Poulose
2024-10-24 14:30   ` Aneesh Kumar K.V
2024-10-04 15:27 ` [PATCH v5 22/43] KVM: arm64: Handle realm VCPU load Steven Price
2024-10-04 15:27 ` [PATCH v5 23/43] KVM: arm64: Validate register access for a Realm VM Steven Price
2024-10-17 15:32   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 24/43] KVM: arm64: Handle Realm PSCI requests Steven Price
2024-10-04 15:27 ` [PATCH v5 25/43] KVM: arm64: WARN on injected undef exceptions Steven Price
2024-10-04 15:27 ` [PATCH v5 26/43] arm64: Don't expose stolen time for realm guests Steven Price
2024-10-18 13:17   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 27/43] arm64: rme: allow userspace to inject aborts Steven Price
2024-10-04 15:27 ` [PATCH v5 28/43] arm64: rme: support RSI_HOST_CALL Steven Price
2024-10-04 15:27 ` [PATCH v5 29/43] arm64: rme: Allow checking SVE on VM instance Steven Price
2024-10-04 15:27 ` [PATCH v5 30/43] arm64: RME: Always use 4k pages for realms Steven Price
2024-10-04 15:27 ` [PATCH v5 31/43] arm64: rme: Prevent Device mappings for Realms Steven Price
2024-10-18 13:30   ` Suzuki K Poulose
2024-10-04 15:27 ` [PATCH v5 32/43] arm_pmu: Provide a mechanism for disabling the physical IRQ Steven Price
2024-10-04 15:27 ` [PATCH v5 33/43] arm64: rme: Enable PMU support with a realm guest Steven Price
2024-10-04 15:27 ` [PATCH v5 34/43] kvm: rme: Hide KVM_CAP_READONLY_MEM for realm guests Steven Price
2024-10-04 15:27 ` [PATCH v5 35/43] arm64: RME: Propagate number of breakpoints and watchpoints to userspace Steven Price
2024-10-04 15:27 ` [PATCH v5 36/43] arm64: RME: Set breakpoint parameters through SET_ONE_REG Steven Price
2024-10-04 15:27 ` [PATCH v5 37/43] arm64: RME: Initialize PMCR.N with number counter supported by RMM Steven Price
2024-10-04 15:27 ` [PATCH v5 38/43] arm64: RME: Propagate max SVE vector length from RMM Steven Price
2024-10-04 15:28 ` [PATCH v5 39/43] arm64: RME: Configure max SVE vector length for a Realm Steven Price
2024-10-04 15:28 ` [PATCH v5 40/43] arm64: RME: Provide register list for unfinalized RME RECs Steven Price
2024-10-04 15:28 ` [PATCH v5 41/43] arm64: RME: Provide accurate register list Steven Price
2024-10-04 15:28 ` [PATCH v5 42/43] arm64: kvm: Expose support for private memory Steven Price
2024-10-09  7:03   ` kernel test robot
2024-10-04 15:28 ` [PATCH v5 43/43] KVM: arm64: Allow activating realms Steven Price
2024-12-02  5:10 ` [PATCH v5 00/43] arm64: Support for Arm CCA in KVM Itaru Kitayama
2024-12-02  8:54   ` Steven Price
2024-12-02 10:26     ` Jean-Philippe Brucker
2024-12-02 10:42       ` Itaru Kitayama
