public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Huth <thuth@redhat.com>
To: David Hildenbrand <david@redhat.com>, kvm@vger.kernel.org
Cc: Janosch Frank <frankja@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Sebastian Mitterle <smitterl@redhat.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	linux-s390@vger.kernel.org
Subject: Re: [kvm-unit-tests PATCH v2 2/2] s390x: firq: floating interrupt test
Date: Fri, 3 Dec 2021 11:55:31 +0100	[thread overview]
Message-ID: <11f0ff2f-2bae-0f1b-753f-b0e9dc24b345@redhat.com> (raw)
In-Reply-To: <20211202123553.96412-3-david@redhat.com>

On 02/12/2021 13.35, David Hildenbrand wrote:
> We had a KVM BUG fixed by kernel commit a3e03bc1368c ("KVM: s390: index
> kvm->arch.idle_mask by vcpu_idx"), whereby a floating interrupt might get
> stuck forever because a CPU in the wait state would not get woken up.
> 
> The issue can be triggered when CPUs are created in a nonlinear fashion,
> such that the CPU address ("core-id") and the KVM cpu id don't match.
> 
> So let's start with a floating interrupt test that will trigger a
> floating interrupt (via SCLP) to be delivered to a CPU in the wait state.

Thank you very much for tackling this! Some remarks below...

> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   lib/s390x/sclp.c    |  11 ++--
>   lib/s390x/sclp.h    |   1 +
>   s390x/Makefile      |   1 +
>   s390x/firq.c        | 122 ++++++++++++++++++++++++++++++++++++++++++++
>   s390x/unittests.cfg |  10 ++++
>   5 files changed, 142 insertions(+), 3 deletions(-)
>   create mode 100644 s390x/firq.c
> 
> diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
> index 0272249..33985eb 100644
> --- a/lib/s390x/sclp.c
> +++ b/lib/s390x/sclp.c
> @@ -60,9 +60,7 @@ void sclp_setup_int(void)
>   void sclp_handle_ext(void)
>   {
>   	ctl_clear_bit(0, CTL0_SERVICE_SIGNAL);
> -	spin_lock(&sclp_lock);
> -	sclp_busy = false;
> -	spin_unlock(&sclp_lock);
> +	sclp_clear_busy();
>   }
>   
>   void sclp_wait_busy(void)
> @@ -89,6 +87,13 @@ void sclp_mark_busy(void)
>   	}
>   }
>   
> +void sclp_clear_busy(void)
> +{
> +	spin_lock(&sclp_lock);
> +	sclp_busy = false;
> +	spin_unlock(&sclp_lock);
> +}
> +
>   static void sclp_read_scp_info(ReadInfo *ri, int length)
>   {
>   	unsigned int commands[] = { SCLP_CMDW_READ_SCP_INFO_FORCED,
> diff --git a/lib/s390x/sclp.h b/lib/s390x/sclp.h
> index 61e9cf5..fead007 100644
> --- a/lib/s390x/sclp.h
> +++ b/lib/s390x/sclp.h
> @@ -318,6 +318,7 @@ void sclp_setup_int(void);
>   void sclp_handle_ext(void);
>   void sclp_wait_busy(void);
>   void sclp_mark_busy(void);
> +void sclp_clear_busy(void);
>   void sclp_console_setup(void);
>   void sclp_print(const char *str);
>   void sclp_read_info(void);
> diff --git a/s390x/Makefile b/s390x/Makefile
> index f95f2e6..1e567c1 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -25,6 +25,7 @@ tests += $(TEST_DIR)/uv-host.elf
>   tests += $(TEST_DIR)/edat.elf
>   tests += $(TEST_DIR)/mvpg-sie.elf
>   tests += $(TEST_DIR)/spec_ex-sie.elf
> +tests += $(TEST_DIR)/firq.elf
>   
>   tests_binary = $(patsubst %.elf,%.bin,$(tests))
>   ifneq ($(HOST_KEY_DOCUMENT),)
> diff --git a/s390x/firq.c b/s390x/firq.c
> new file mode 100644
> index 0000000..1f87718
> --- /dev/null
> +++ b/s390x/firq.c
> @@ -0,0 +1,122 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Floating interrupt tests.
> + *
> + * Copyright 2021 Red Hat Inc
> + *
> + * Authors:
> + *    David Hildenbrand <david@redhat.com>
> + */
> +#include <libcflat.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/interrupt.h>
> +#include <asm/page.h>
> +#include <asm-generic/barrier.h>
> +
> +#include <sclp.h>
> +#include <smp.h>
> +#include <alloc_page.h>
> +
> +static void wait_for_sclp_int(void)
> +{
> +	/* Enable SCLP interrupts on this CPU only. */
> +	ctl_set_bit(0, CTL0_SERVICE_SIGNAL);
> +
> +	/* Enable external interrupts and go to the wait state. */
> +	wait_for_interrupt(PSW_MASK_EXT);
> +}

What happens if the CPU got an interrupt? Should there be a "while (true)" 
at the end of the function to avoid that the CPU ends up crashing at the end 
of the function?

> +/*
> + * Some KVM versions might mix CPUs when looking for a floating IRQ target,
> + * accidentially detecting a stopped CPU as waiting and resulting in the actually
> + * waiting CPU not getting woken up for the interrupt.
> + */
> +static void test_wait_state_delivery(void)
> +{
> +	struct psw psw;
> +	SCCBHeader *h;
> +	int ret;
> +
> +	report_prefix_push("wait state delivery");
> +
> +	if (smp_query_num_cpus() < 3) {
> +		report_skip("need at least 3 CPUs for this test");
> +		goto out;
> +	}
> +
> +	if (stap()) {
> +		report_skip("need to start on CPU #0");
> +		goto out;
> +	}

I think I'd rather turn this into an assert() instead ... no strong opinion 
about it, though.

> +
> +	/*
> +	 * We want CPU #2 to be stopped. This should be the case at this
> +	 * point, however, we want to sense if it even exists as well.
> +	 */
> +	ret = smp_cpu_stop(2);
> +	if (ret) {
> +		report_skip("CPU #2 not found");

Since you already queried for the availablity of at least 3 CPUs above, I 
think you could turn this into a report_fail() instead?

> +		goto out;
> +	}
> +
> +	/*
> +	 * We're going to perform an SCLP service call but expect
> +	 * the interrupt on CPU #1 while it is in the wait state.
> +	 */
> +	sclp_mark_busy();
> +
> +	/* Start CPU #1 and let it wait for the interrupt. */
> +	psw.mask = extract_psw_mask();
> +	psw.addr = (unsigned long)wait_for_sclp_int;
> +	ret = smp_cpu_setup(1, psw);
> +	if (ret) {
> +		sclp_clear_busy();
> +		report_skip("cpu #1 not found");
> +		goto out;
> +	}
> +
> +	/*
> +	 * We'd have to jump trough some hoops to sense e.g., via SIGP
> +	 * CONDITIONAL EMERGENCY SIGNAL if CPU #1 is already in the
> +	 * wait state.
> +	 *
> +	 * Although not completely reliable, use SIGP SENSE RUNNING STATUS
> +	 * until not reported as running -- after all, our SCLP processing
> +	 * will take some time as well and smp_cpu_setup() returns when we're
> +	 * either already in wait_for_sclp_int() or just about to execute it.
> +	 */
> +	while(smp_sense_running_status(1));
> +
> +	h = alloc_page();
> +	h->length = 4096;
> +	ret = servc(SCLP_CMDW_READ_CPU_INFO, __pa(h));
> +	if (ret) {
> +		sclp_clear_busy();
> +		report_fail("SCLP_CMDW_READ_CPU_INFO failed");
> +		goto out_destroy;
> +	}
> +
> +	/*
> +	 * Wait until the interrupt gets delivered on CPU #1, marking the
> +	 * SCLP requests as done.
> +	 */
> +	sclp_wait_busy();
> +
> +	report(true, "sclp interrupt delivered");
> +
> +out_destroy:
> +	free_page(h);
> +	smp_cpu_destroy(1);
> +out:
> +	report_prefix_pop();
> +}

Anyway, code looks fine for me, either with my comments addressed or not:

Reviewed-by: Thomas Huth <thuth@redhat.com>


  parent reply	other threads:[~2021-12-03 10:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-02 12:35 [kvm-unit-tests PATCH v2 0/2] s390x: firq: floating interrupt test David Hildenbrand
2021-12-02 12:35 ` [kvm-unit-tests PATCH v2 1/2] s390x: make smp_cpu_setup() return 0 on success David Hildenbrand
2021-12-02 12:35 ` [kvm-unit-tests PATCH v2 2/2] s390x: firq: floating interrupt test David Hildenbrand
2021-12-02 12:45   ` Claudio Imbrenda
2021-12-03 10:55   ` Thomas Huth [this message]
2021-12-03 11:18     ` Claudio Imbrenda
2021-12-03 11:22       ` Thomas Huth
2021-12-03 18:23       ` David Hildenbrand
2021-12-06  7:12         ` Thomas Huth
2021-12-06  8:15           ` David Hildenbrand
2021-12-06 11:09             ` Claudio Imbrenda
2021-12-06 13:35 ` [kvm-unit-tests PATCH v2 0/2] " Claudio Imbrenda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=11f0ff2f-2bae-0f1b-753f-b0e9dc24b345@redhat.com \
    --to=thuth@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=pasic@linux.ibm.com \
    --cc=smitterl@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox