linux-s390.vger.kernel.org archive mirror
From: "Christoph Schlameuss" <schlameuss@linux.ibm.com>
To: "Janosch Frank" <frankja@linux.ibm.com>,
	"Christoph Schlameuss" <schlameuss@linux.ibm.com>,
	<kvm@vger.kernel.org>
Cc: <linux-s390@vger.kernel.org>,
	"Heiko Carstens" <hca@linux.ibm.com>,
	"Vasily Gorbik" <gor@linux.ibm.com>,
	"Alexander Gordeev" <agordeev@linux.ibm.com>,
	"Christian Borntraeger" <borntraeger@linux.ibm.com>,
	"Claudio Imbrenda" <imbrenda@linux.ibm.com>,
	"Nico Boehr" <nrb@linux.ibm.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Sven Schnelle" <svens@linux.ibm.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>
Subject: Re: [PATCH RFC v2 07/11] KVM: s390: Shadow VSIE SCA in guest-1
Date: Tue, 18 Nov 2025 10:27:51 +0100	[thread overview]
Message-ID: <DEBPPQ1YH6TY.3W3PLCBFCYOAG@linux.ibm.com> (raw)
In-Reply-To: <c92235d2-cee0-40c8-9a86-1334aaba4875@linux.ibm.com>

On Mon Nov 17, 2025 at 4:22 PM CET, Janosch Frank wrote:
> On 11/10/25 18:16, Christoph Schlameuss wrote:
>> Restructure kvm_s390_handle_vsie() to create a guest-1 shadow of the SCA
>> if guest-2 attempts to enter SIE with an SCA. If the SCA is used the
>> vsie_pages are stored in a new vsie_sca struct instead of the arch vsie
>> struct.
>> 
>> When the VSIE-Interpretation-Extension Facility is active (minimum z17)
>> the shadow SCA (ssca_block) will be created and shadows of all CPUs
>> defined in the configuration are created.
>> SCAOL/H in the VSIE control block are overwritten with references to the
>> shadow SCA.
>> 
>> The shadow SCA contains the addresses of the original guest-3 SCA as
>> well as the original VSIE control blocks. With these addresses the
>> machine can directly monitor the intervention bits within the original
>> SCA entries, enabling it to handle SENSE_RUNNING and EXTERNAL_CALL sigp
>> instructions without exiting VSIE.
>> 
>> The original SCA will be pinned in guest-2 memory and only unpinned
>> before reuse. This means some pages might still be pinned even after the
>> guest-3 VM no longer exists.
>> 
>> The ssca_blocks are also kept in a radix tree so that existing
>> ssca_blocks can be reused efficiently; the radix tree and the array of
>> references to the ssca_blocks are held in the vsie_sca struct.
>> The use of vsie_scas is tracked using a ref_count.
>> 
>> Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
>
> [...]
>
>> +/*
>> + * Try to find a currently unused vsie_sca from the vsie struct.
>> + *
>> + * Called with ssca_lock held.
>> + */
>> +static struct vsie_sca *get_free_existing_vsie_sca(struct kvm *kvm)
>> +{
>> +	struct vsie_sca *sca;
>> +	int i, ref_count;
>> +
>> +	for (i = 0; i < kvm->arch.vsie.sca_count; i++) {
>> +		sca = kvm->arch.vsie.scas[kvm->arch.vsie.sca_next];
>> +		kvm->arch.vsie.sca_next++;
>> +		kvm->arch.vsie.sca_next %= kvm->arch.vsie.sca_count;
>> +		ref_count = atomic_inc_return(&sca->ref_count);
>> +		WARN_ON_ONCE(ref_count < 1);
>> +		if (ref_count == 1)
>> +			return sca;
>> +		atomic_dec(&sca->ref_count);
>> +	}
>> +	return ERR_PTR(-EFAULT);
>
> ENOENT?
>

Ack.

>> +}
>> +
>> +static void destroy_vsie_sca(struct kvm *kvm, struct vsie_sca *sca)
>> +{
>> +	radix_tree_delete(&kvm->arch.vsie.osca_to_sca, sca->sca_gpa);
>> +	if (sca->ssca)
>> +		free_pages_exact(sca->ssca, sca->page_count);
>> +	sca->ssca = NULL;
>> +	free_page((unsigned long)sca);
>> +}
>> +
>> +static void put_vsie_sca(struct vsie_sca *sca)
>> +{
>> +	if (!sca)
>> +		return;
>> +
>> +	WARN_ON_ONCE(atomic_dec_return(&sca->ref_count) < 0);
>> +}
>> +
>> +/*
>> + * Pin and get an existing or new guest system control area.
>> + *
>> + * May sleep.
>> + */
>> +static struct vsie_sca *get_vsie_sca(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page,
>> +				     gpa_t sca_addr)
>> +{
>> +	struct vsie_sca *sca, *sca_new = NULL;
>> +	struct kvm *kvm = vcpu->kvm;
>> +	unsigned int max_sca;
>> +	int rc;
>> +
>> +	rc = validate_scao(vcpu, vsie_page->scb_o, vsie_page->sca_gpa);
>> +	if (rc)
>> +		return ERR_PTR(rc);
>
> This is wild.
> validate_scao() returns 0/1 (once you fix the bool) and the rest of the 
> function below returns -ERRNO. I think validate_scao() should return 
> -EINVAL since the scao is clearly invalid if the function doesn't return 0.
>

Yes, -EINVAL will be much more logical. I will also revisit the rest of the
return codes.

>> +
>> +	/* get existing sca */
>> +	down_read(&kvm->arch.vsie.ssca_lock);
>> +	sca = get_existing_vsie_sca(kvm, sca_addr);
>> +	up_read(&kvm->arch.vsie.ssca_lock);
>> +	if (sca)
>> +		return sca;
>> +
>> +	/*
>> +	 * Allocate new ssca, it will likely be needed below.
>> +	 * We want at least #online_vcpus shadows, so every VCPU can execute the
>> +	 * VSIE in parallel. (Worst case all single core VMs.)
>> +	 */
>> +	max_sca = MIN(atomic_read(&kvm->online_vcpus), KVM_S390_MAX_VSIE_VCPUS);
>> +	if (kvm->arch.vsie.sca_count < max_sca) {
>> +		BUILD_BUG_ON(sizeof(struct vsie_sca) > PAGE_SIZE);
>> +		sca_new = (void *)__get_free_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
>
> sca and sca_new are not FW structs, they are not scas that you can hand 
> over to FW. As such they should not be exclusively named sca. Name them 
> vsie_sca or some other name to make it clear that we're working with a 
> KVM struct.
>

Will update the sca variables to vsie_sca.

> Is there a need for sca_new to be page allocated?
> vsie_page's size is close to a page and it is similar to sie_page, so
> that makes sense. But vsie_sca is only a couple of DWORDs until we reach
> the "pages" member, and we could dynamically allocate vsie_sca based on
> the actual maximum number of pages, since pages is at the end of the struct.
>

I would rather inline the ssca member (struct ssca_block) into struct
vsie_sca and then allocate the whole thing at once using
alloc_pages_exact(sizeof(*vsie_sca_new)). That comes out to a few words over
2 pages.
But I would rather allocate the full space needed to hold the ssca for an
esca, so we do not have to check and reallocate when reusing the allocation
to shadow another original sca, or even when upgrading from bsca to esca.
Especially once the change to start new guest VMs out with an esca is
upstream, the likelihood of esca usage will only go up.

>> +		if (!sca_new)
>> +			return ERR_PTR(-ENOMEM);
>> +
>> +		if (use_vsie_sigpif(vcpu->kvm)) {
>> +			BUILD_BUG_ON(offsetof(struct ssca_block, cpu) != 64);
>> +			sca_new->ssca = alloc_pages_exact(sizeof(*sca_new->ssca),
>> +							  GFP_KERNEL_ACCOUNT | __GFP_ZERO);
>> +			if (!sca_new->ssca) {
>> +				free_page((unsigned long)sca_new);
>> +				sca_new = NULL;
>> +				return ERR_PTR(-ENOMEM);
>> +			}
>> +		}
>> +	}
>> +
>> +	/* enter write lock and recheck to make sure ssca has not been created by other cpu */
>> +	down_write(&kvm->arch.vsie.ssca_lock);
>> +	sca = get_existing_vsie_sca(kvm, sca_addr);
>> +	if (sca)
>> +		goto out;
>> +
>> +	/* check again under write lock if we are still under our sca_count limit */
>> +	if (sca_new && kvm->arch.vsie.sca_count < max_sca) {
>> +		/* make use of vsie_sca just created */
>> +		sca = sca_new;
>> +		sca_new = NULL;
>> +
>> +		kvm->arch.vsie.scas[kvm->arch.vsie.sca_count] = sca;
>> +	} else {
>> +		/* reuse previously created vsie_sca allocation for different osca */
>> +		sca = get_free_existing_vsie_sca(kvm);
>> +		/* with nr_vcpus scas one must be free */
>> +		if (IS_ERR(sca))
>> +			goto out;
>> +
>> +		unpin_sca(kvm, sca);
>> +		radix_tree_delete(&kvm->arch.vsie.osca_to_sca, sca->sca_gpa);
>> +		memset(sca, 0, sizeof(struct vsie_sca));
>> +	}
>> +
>> +	/* use ECB of shadow scb to determine SCA type */
>> +	if (sie_uses_esca(vsie_page->scb_o))
>> +		__set_bit(VSIE_SCA_ESCA, &sca->flags);
>> +	sca->sca_gpa = sca_addr;
>> +	sca->pages[vsie_page->scb_o->icpua] = vsie_page;
>> +
>> +	if (sca->sca_gpa != 0) {
>> +		/*
>> +		 * The pinned original sca will only be unpinned lazily to limit the
>> +		 * required amount of pins/unpins on each vsie entry/exit.
>> +		 * The unpin is done in the reuse vsie_sca allocation path above and
>> +		 * kvm_s390_vsie_destroy().
>> +		 */
>> +		rc = pin_sca(kvm, vsie_page, sca);
>> +		if (rc) {
>> +			sca = ERR_PTR(rc);
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	atomic_set(&sca->ref_count, 1);
>> +	radix_tree_insert(&kvm->arch.vsie.osca_to_sca, sca->sca_gpa, sca);
>> +
>> +out:
>> +	up_write(&kvm->arch.vsie.ssca_lock);
>> +	if (sca_new)
>> +		destroy_vsie_sca(kvm, sca_new);
>> +	return sca;
>> +}


Thread overview: 33+ messages
2025-11-10 17:16 [PATCH RFC v2 00/11] KVM: s390: Add VSIE SIGP Interpretation (vsie_sigpif) Christoph Schlameuss
2025-11-10 17:16 ` [PATCH RFC v2 01/11] KVM: s390: Add SCAO read and write helpers Christoph Schlameuss
2025-11-11 13:45   ` Claudio Imbrenda
2025-11-11 14:37     ` Christoph Schlameuss
2025-11-11 14:55       ` Claudio Imbrenda
2025-11-10 17:16 ` [PATCH RFC v2 02/11] KVM: s390: Remove double 64bscao feature check Christoph Schlameuss
2025-11-10 21:32   ` Eric Farman
2025-11-11  8:13   ` Hendrik Brueckner
2025-11-11 13:20   ` Janosch Frank
2025-11-10 17:16 ` [PATCH RFC v2 03/11] KVM: s390: Move scao validation into a function Christoph Schlameuss
2025-11-10 21:30   ` Eric Farman
2025-11-11  8:48     ` Christoph Schlameuss
2025-11-10 17:16 ` [PATCH RFC v2 04/11] KVM: s390: Add vsie_sigpif detection Christoph Schlameuss
2025-11-10 17:16 ` [PATCH RFC v2 05/11] KVM: s390: Add ssca_block and ssca_entry structs for vsie_ie Christoph Schlameuss
2025-11-10 17:16 ` [PATCH RFC v2 06/11] KVM: s390: Add helper to pin multiple guest pages Christoph Schlameuss
2025-11-13 15:24   ` Janosch Frank
2025-11-10 17:16 ` [PATCH RFC v2 07/11] KVM: s390: Shadow VSIE SCA in guest-1 Christoph Schlameuss
2025-11-14 14:09   ` Janosch Frank
2025-11-17 15:39     ` Christoph Schlameuss
2025-11-17 15:22   ` Janosch Frank
2025-11-18  9:27     ` Christoph Schlameuss [this message]
2025-11-18 16:04   ` Janosch Frank
2025-11-21 15:10     ` Christoph Schlameuss
2025-11-20 11:15   ` Janosch Frank
2025-11-10 17:16 ` [PATCH RFC v2 08/11] KVM: s390: Allow guest-3 cpu add and remove with vsie sigpif Christoph Schlameuss
2025-11-11 15:47   ` Janosch Frank
2025-11-11 16:34     ` Christoph Schlameuss
2025-11-10 17:16 ` [PATCH RFC v2 09/11] KVM: s390: Allow guest-3 switch to extended sca " Christoph Schlameuss
2025-11-11 14:18   ` Janosch Frank
2025-11-10 17:16 ` [PATCH RFC v2 10/11] KVM: s390: Add VSIE shadow configuration Christoph Schlameuss
2025-11-20 11:02   ` Janosch Frank
2025-11-10 17:16 ` [PATCH RFC v2 11/11] KVM: s390: Add VSIE shadow stat counters Christoph Schlameuss
2025-11-20 11:07   ` Janosch Frank
