kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v4 0/3] riscv: sbi: Add support to test HSM extension
@ 2024-09-15 18:34 James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly James Raphael Tiovalen
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: James Raphael Tiovalen @ 2024-09-15 18:34 UTC (permalink / raw)
  To: kvm, kvm-riscv; +Cc: andrew.jones, atishp, cade.richard, James Raphael Tiovalen

This patch series adds support for testing all 4 functions of the HSM
extension as defined in the RISC-V SBI specification. The first 2
patches adds some helper routines to prepare for the HSM test, while
the third patch adds the actual test for the HSM extension.

v4:
- Addressed all of Andrew's comments.
- Included the 2 patches from Andrew's branch that refactored some
  functions.
- Added timers to all of the waiting activities in the HSM tests.

v3:
- Addressed all of Andrew's comments.
- Split the report_prefix_pop patch into its own series.
- Added a new environment variable to specify the maximum number of
  CPUs supported by the SBI implementation.

v2:
- Addressed all of Andrew's comments.
- Added a new patch to add helper routines to clear multiple prefixes.
- Reworked the approach to test the HSM extension by using cpumask and
  on-cpus.

Andrew Jones (1):
  riscv: Rewrite hartid_to_cpu in assembly

James Raphael Tiovalen (2):
  riscv: sbi: Provide entry point for HSM tests
  riscv: sbi: Add tests for HSM extension

 riscv/Makefile          |   3 +-
 lib/riscv/asm-offsets.c |   5 +
 lib/riscv/setup.c       |  10 -
 riscv/sbi-tests.h       |  10 +
 riscv/sbi.h             |  10 +
 riscv/cstart.S          |  23 ++
 riscv/sbi-asm.S         |  71 +++++
 riscv/sbi.c             | 566 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 687 insertions(+), 11 deletions(-)
 create mode 100644 riscv/sbi-tests.h
 create mode 100644 riscv/sbi.h
 create mode 100644 riscv/sbi-asm.S

--
2.43.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly
  2024-09-15 18:34 [kvm-unit-tests PATCH v4 0/3] riscv: sbi: Add support to test HSM extension James Raphael Tiovalen
@ 2024-09-15 18:34 ` James Raphael Tiovalen
  2024-09-16  7:19   ` Andrew Jones
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 2/3] riscv: sbi: Provide entry point for HSM tests James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension James Raphael Tiovalen
  2 siblings, 1 reply; 7+ messages in thread
From: James Raphael Tiovalen @ 2024-09-15 18:34 UTC (permalink / raw)
  To: kvm, kvm-riscv; +Cc: andrew.jones, atishp, cade.richard

From: Andrew Jones <andrew.jones@linux.dev>

Some SBI HSM tests run without a stack being setup so they can't
run C code. Those tests still need to know the corresponding cpuid
for the hartid on which they are running. Give those tests
hartid_to_cpu() by reimplementing it in assembly.

Signed-off-by: Andrew Jones <andrew.jones@linux.dev>
---
 lib/riscv/asm-offsets.c |  5 +++++
 lib/riscv/setup.c       | 10 ----------
 riscv/cstart.S          | 23 +++++++++++++++++++++++
 3 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c
index a2a32438..6c511c14 100644
--- a/lib/riscv/asm-offsets.c
+++ b/lib/riscv/asm-offsets.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include <kbuild.h>
 #include <elf.h>
+#include <asm/processor.h>
 #include <asm/ptrace.h>
 #include <asm/smp.h>
 
@@ -58,5 +59,9 @@ int main(void)
 	OFFSET(SECONDARY_FUNC, secondary_data, func);
 	DEFINE(SECONDARY_DATA_SIZE, sizeof(struct secondary_data));
 
+	OFFSET(THREAD_INFO_CPU, thread_info, cpu);
+	OFFSET(THREAD_INFO_HARTID, thread_info, hartid);
+	DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info));
+
 	return 0;
 }
diff --git a/lib/riscv/setup.c b/lib/riscv/setup.c
index 495db041..f347ad63 100644
--- a/lib/riscv/setup.c
+++ b/lib/riscv/setup.c
@@ -43,16 +43,6 @@ uint64_t timebase_frequency;
 
 static struct mem_region riscv_mem_regions[NR_MEM_REGIONS + 1];
 
-int hartid_to_cpu(unsigned long hartid)
-{
-	int cpu;
-
-	for_each_present_cpu(cpu)
-		if (cpus[cpu].hartid == hartid)
-			return cpu;
-	return -1;
-}
-
 static void cpu_set_fdt(int fdtnode __unused, u64 regval, void *info __unused)
 {
 	int cpu = nr_cpus++;
diff --git a/riscv/cstart.S b/riscv/cstart.S
index 8f269997..6784d5e1 100644
--- a/riscv/cstart.S
+++ b/riscv/cstart.S
@@ -109,6 +109,29 @@ halt:
 1:	wfi
 	j	1b
 
+/*
+ * hartid_to_cpu
+ *   a0 is a hartid on entry
+ * returns the corresponding cpuid in a0
+ */
+.balign 4
+.global hartid_to_cpu
+hartid_to_cpu:
+	la	t0, cpus
+	la	t1, nr_cpus
+	lw	t1, 0(t1)
+	li	t2, 0
+1:	bne	t2, t1, 2f
+	li	a0, -1
+	ret
+2:	REG_L	t3, THREAD_INFO_HARTID(t0)
+	bne	a0, t3, 3f
+	lw	a0, THREAD_INFO_CPU(t0)
+	ret
+3:	addi	t0, t0, THREAD_INFO_SIZE
+	addi	t2, t2, 1
+	j	1b
+
 .balign 4
 .global secondary_entry
 secondary_entry:
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [kvm-unit-tests PATCH v4 2/3] riscv: sbi: Provide entry point for HSM tests
  2024-09-15 18:34 [kvm-unit-tests PATCH v4 0/3] riscv: sbi: Add support to test HSM extension James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly James Raphael Tiovalen
@ 2024-09-15 18:34 ` James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension James Raphael Tiovalen
  2 siblings, 0 replies; 7+ messages in thread
From: James Raphael Tiovalen @ 2024-09-15 18:34 UTC (permalink / raw)
  To: kvm, kvm-riscv; +Cc: andrew.jones, atishp, cade.richard, James Raphael Tiovalen

The HSM tests will need to test HSM start and resumption from HSM
suspend. Provide an entry point written in assembly that doesn't
use a stack for this. Results of the test are written to global
per-hart arrays to be checked by the main SBI HSM test function. The
started/resumed hart does its checks and then just loops until it
gets a signal from the main SBI HSM test function to invoke HSM stop.

Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Co-developed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Andrew Jones <andrew.jones@linux.dev>
---
 riscv/Makefile    |  3 +-
 riscv/sbi-tests.h | 10 +++++++
 riscv/sbi-asm.S   | 71 +++++++++++++++++++++++++++++++++++++++++++++++
 riscv/sbi.c       |  5 ++++
 4 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 riscv/sbi-tests.h
 create mode 100644 riscv/sbi-asm.S

diff --git a/riscv/Makefile b/riscv/Makefile
index 2ee7c5bb..4676d262 100644
--- a/riscv/Makefile
+++ b/riscv/Makefile
@@ -43,6 +43,7 @@ cflatobjs += lib/riscv/timer.o
 ifeq ($(ARCH),riscv32)
 cflatobjs += lib/ldiv32.o
 endif
+cflatobjs += riscv/sbi-asm.o
 
 ########################################
 
@@ -80,7 +81,7 @@ CFLAGS += -mcmodel=medany
 CFLAGS += -std=gnu99
 CFLAGS += -ffreestanding
 CFLAGS += -O2
-CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib
+CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv
 
 asm-offsets = lib/riscv/asm-offsets.h
 include $(SRCDIR)/scripts/asm-offsets.mak
diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h
new file mode 100644
index 00000000..f5cc8635
--- /dev/null
+++ b/riscv/sbi-tests.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _RISCV_SBI_TESTS_H_
+#define _RISCV_SBI_TESTS_H_
+
+#define SBI_HSM_TEST_DONE	(1 << 0)
+#define SBI_HSM_TEST_HARTID_A1	(1 << 1)
+#define SBI_HSM_TEST_SATP	(1 << 2)
+#define SBI_HSM_TEST_SIE	(1 << 3)
+
+#endif /* _RISCV_SBI_TESTS_H_ */
diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S
new file mode 100644
index 00000000..f165f9da
--- /dev/null
+++ b/riscv/sbi-asm.S
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Helper assembly code routines for RISC-V SBI extension tests.
+ *
+ * Copyright (C) 2024, James Raphael Tiovalen <jamestiotio@gmail.com>
+ */
+#define __ASSEMBLY__
+#include <asm/csr.h>
+
+#include "sbi-tests.h"
+
+.section .text
+
+/*
+ * sbi_hsm_check
+ *   a0 and a1 are set by SBI HSM start/suspend
+ *   s1 is the address of the results array
+ * Doesn't return.
+ *
+ * This function is only called from HSM start and on resumption
+ * from HSM suspend which means we can do whatever we like with
+ * all registers. So, to avoid complicated register agreements with
+ * other assembly functions called, we just always use the saved
+ * registers for anything that should be maintained across calls.
+ */
+#define RESULTS_ARRAY	s1
+#define RESULTS_MAP	s2
+#define CPU_INDEX	s3
+.balign 4
+sbi_hsm_check:
+	li	RESULTS_MAP, 0
+	bne	a0, a1, 1f
+	ori	RESULTS_MAP, RESULTS_MAP, SBI_HSM_TEST_HARTID_A1
+1:	csrr	t0, CSR_SATP
+	bnez	t0, 2f
+	ori	RESULTS_MAP, RESULTS_MAP, SBI_HSM_TEST_SATP
+2:	csrr	t0, CSR_SSTATUS
+	andi	t0, t0, SR_SIE
+	bnez	t0, 3f
+	ori	RESULTS_MAP, RESULTS_MAP, SBI_HSM_TEST_SIE
+3:	call	hartid_to_cpu
+	mv	CPU_INDEX, a0
+	li	t0, -1
+	bne	CPU_INDEX, t0, 5f
+4:	pause
+	j	4b
+5:	ori	RESULTS_MAP, RESULTS_MAP, SBI_HSM_TEST_DONE
+	add	t0, RESULTS_ARRAY, CPU_INDEX
+	sb	RESULTS_MAP, 0(t0)
+	la	t1, sbi_hsm_stop_hart
+	add	t1, t1, CPU_INDEX
+6:	lb	t0, 0(t1)
+	pause
+	beqz	t0, 6b
+	li	a7, 0x48534d	/* SBI_EXT_HSM */
+	li	a6, 1		/* SBI_EXT_HSM_HART_STOP */
+	ecall
+7:	pause
+	j	7b
+
+.balign 4
+.global sbi_hsm_check_hart_start
+sbi_hsm_check_hart_start:
+	la	RESULTS_ARRAY, sbi_hsm_hart_start_checks
+	j	sbi_hsm_check
+
+.balign 4
+.global sbi_hsm_check_non_retentive_suspend
+sbi_hsm_check_non_retentive_suspend:
+	la	RESULTS_ARRAY, sbi_hsm_non_retentive_hart_suspend_checks
+	j	sbi_hsm_check
diff --git a/riscv/sbi.c b/riscv/sbi.c
index a7abc08c..d4dfd48e 100644
--- a/riscv/sbi.c
+++ b/riscv/sbi.c
@@ -18,6 +18,7 @@
 #include <asm/mmu.h>
 #include <asm/processor.h>
 #include <asm/sbi.h>
+#include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
 
@@ -288,6 +289,10 @@ static void check_time(void)
 	report_prefix_popn(2);
 }
 
+unsigned char sbi_hsm_stop_hart[NR_CPUS];
+unsigned char sbi_hsm_hart_start_checks[NR_CPUS];
+unsigned char sbi_hsm_non_retentive_hart_suspend_checks[NR_CPUS];
+
 #define DBCN_WRITE_TEST_STRING		"DBCN_WRITE_TEST_STRING\n"
 #define DBCN_WRITE_BYTE_TEST_BYTE	((u8)'a')
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension
  2024-09-15 18:34 [kvm-unit-tests PATCH v4 0/3] riscv: sbi: Add support to test HSM extension James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly James Raphael Tiovalen
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 2/3] riscv: sbi: Provide entry point for HSM tests James Raphael Tiovalen
@ 2024-09-15 18:34 ` James Raphael Tiovalen
  2024-09-16  9:23   ` Andrew Jones
  2024-09-16 10:49   ` Andrew Jones
  2 siblings, 2 replies; 7+ messages in thread
From: James Raphael Tiovalen @ 2024-09-15 18:34 UTC (permalink / raw)
  To: kvm, kvm-riscv; +Cc: andrew.jones, atishp, cade.richard, James Raphael Tiovalen

Add some tests for all of the HSM extension functions. These tests
ensure that the HSM extension functions follow the behavior as described
in the SBI specification.

Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
---
 riscv/sbi.h |  10 +
 riscv/sbi.c | 561 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 571 insertions(+)
 create mode 100644 riscv/sbi.h

diff --git a/riscv/sbi.h b/riscv/sbi.h
new file mode 100644
index 00000000..e8625cb1
--- /dev/null
+++ b/riscv/sbi.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _RISCV_SBI_H_
+#define _RISCV_SBI_H_
+
+#define SBI_HSM_TEST_DONE	(1 << 0)
+#define SBI_HSM_TEST_SATP	(1 << 1)
+#define SBI_HSM_TEST_SIE	(1 << 2)
+#define SBI_HSM_TEST_HARTID_A1	(1 << 3)
+
+#endif /* _RISCV_SBI_H_ */
diff --git a/riscv/sbi.c b/riscv/sbi.c
index d4dfd48e..fab0091b 100644
--- a/riscv/sbi.c
+++ b/riscv/sbi.c
@@ -6,6 +6,8 @@
  */
 #include <libcflat.h>
 #include <alloc_page.h>
+#include <cpumask.h>
+#include <on-cpus.h>
 #include <stdlib.h>
 #include <string.h>
 #include <limits.h>
@@ -16,11 +18,13 @@
 #include <asm/delay.h>
 #include <asm/io.h>
 #include <asm/mmu.h>
+#include <asm/page.h>
 #include <asm/processor.h>
 #include <asm/sbi.h>
 #include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
+#include <sbi.h>
 
 #define	HIGH_ADDR_BOUNDARY	((phys_addr_t)1 << 32)
 
@@ -47,6 +51,11 @@ static struct sbiret sbi_dbcn_write_byte(uint8_t byte)
 	return sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, byte, 0, 0, 0, 0, 0);
 }
 
+static struct sbiret sbi_hart_suspend(uint32_t suspend_type, unsigned long resume_addr, unsigned long opaque)
+{
+	return sbi_ecall(SBI_EXT_HSM, SBI_EXT_HSM_HART_SUSPEND, suspend_type, resume_addr, opaque, 0, 0, 0);
+}
+
 static void split_phys_addr(phys_addr_t paddr, unsigned long *hi, unsigned long *lo)
 {
 	*lo = (unsigned long)paddr;
@@ -434,6 +443,557 @@ static void check_dbcn(void)
 	report_prefix_popn(2);
 }
 
+unsigned char sbi_hsm_stop_hart[NR_CPUS];
+unsigned char sbi_hsm_hart_start_checks[NR_CPUS];
+unsigned char sbi_hsm_non_retentive_hart_suspend_checks[NR_CPUS];
+cpumask_t sbi_hsm_started_hart_checks;
+cpumask_t sbi_hsm_invalid_hartid_checks;
+static bool hsm_timer_fired;
+extern void sbi_hsm_check_hart_start(void);
+extern void sbi_hsm_check_non_retentive_suspend(void);
+
+static void hsm_timer_irq_handler(struct pt_regs *regs)
+{
+	hsm_timer_fired = true;
+	timer_stop();
+}
+
+static void hsm_timer_setup(void)
+{
+	install_irq_handler(IRQ_S_TIMER, hsm_timer_irq_handler);
+	local_irq_enable();
+	timer_irq_enable();
+}
+
+static void hsm_timer_teardown(void)
+{
+	timer_irq_disable();
+	local_irq_disable();
+	install_irq_handler(IRQ_S_TIMER, NULL);
+}
+
+static void hart_empty_fn(void *data) {}
+
+static void hart_execute(void *data)
+{
+	struct sbiret ret;
+	unsigned long hartid = current_thread_info()->hartid;
+	int me = smp_processor_id();
+
+	ret = sbi_hart_start(hartid, virt_to_phys(&hart_empty_fn), 0);
+
+	if (ret.error == SBI_ERR_ALREADY_AVAILABLE)
+		cpumask_set_cpu(me, &sbi_hsm_started_hart_checks);
+
+	ret = sbi_hart_start(ULONG_MAX, virt_to_phys(&hart_empty_fn), 0);
+
+	if (ret.error == SBI_ERR_INVALID_PARAM)
+		cpumask_set_cpu(me, &sbi_hsm_invalid_hartid_checks);
+}
+
+static void hart_retentive_suspend(void *data)
+{
+	unsigned long hartid = current_thread_info()->hartid;
+	struct sbiret ret = sbi_hart_suspend(SBI_EXT_HSM_HART_SUSPEND_RETENTIVE, 0, 0);
+
+	if (ret.error)
+		report_fail("failed to retentive suspend hart %ld (error=%ld)", hartid, ret.error);
+}
+
+static void hart_non_retentive_suspend(void *data)
+{
+	unsigned long hartid = current_thread_info()->hartid;
+
+	/* Set opaque as hartid so that we can check a0 == a1, ensuring that a0 is hartid and a1 is opaque */
+	struct sbiret ret = sbi_hart_suspend(SBI_EXT_HSM_HART_SUSPEND_NON_RETENTIVE,
+					     virt_to_phys(&sbi_hsm_check_non_retentive_suspend), hartid);
+
+	report_fail("failed to non-retentive suspend hart %ld (error=%ld)", hartid, ret.error);
+}
+
+static void hart_wait_on_status(unsigned long hartid, enum sbi_ext_hsm_sid status)
+{
+	struct sbiret ret = sbi_hart_get_status(hartid);
+
+	while (!ret.error && ret.value == status && !hsm_timer_fired) {
+		cpu_relax();
+		ret = sbi_hart_get_status(hartid);
+	}
+
+	if (hsm_timer_fired)
+		report_info("timer fired while waiting on status %u for hart %ld", status, hartid);
+	else if (ret.error)
+		report_fail("got %ld while waiting on status %u for hart %ld\n", ret.error, status, hartid);
+}
+
+static void check_hsm(void)
+{
+	struct sbiret ret;
+	unsigned long hartid;
+	cpumask_t secondary_cpus_mask, hsm_start, hsm_stop, hsm_suspend, hsm_resume, hsm_check;
+	int cpu, me = smp_processor_id();
+	int max_cpu = getenv("SBI_HSM_MAX_CPU") ? strtol(getenv("SBI_HSM_MAX_CPU"), NULL, 0) : INT_MAX;
+	unsigned long hsm_timer_duration = getenv("SBI_HSM_TIMER_DURATION")
+					 ? strtol(getenv("SBI_HSM_TIMER_DURATION"), NULL, 0) : 200000;
+
+	max_cpu = MIN(max_cpu, nr_cpus - 1);
+
+	cpumask_copy(&secondary_cpus_mask, &cpu_present_mask);
+	cpumask_clear_cpu(me, &secondary_cpus_mask);
+	for_each_cpu(cpu, &secondary_cpus_mask)
+		if (cpu > max_cpu)
+			cpumask_clear_cpu(cpu, &secondary_cpus_mask);
+
+	report_prefix_push("hsm");
+
+	if (!sbi_probe(SBI_EXT_HSM)) {
+		report_skip("hsm extension not available");
+		report_prefix_pop();
+		return;
+	}
+
+	report_prefix_push("hart_get_status");
+
+	hartid = current_thread_info()->hartid;
+	ret = sbi_hart_get_status(hartid);
+
+	if (ret.error) {
+		report_fail("failed to get status of current hart (error=%ld)", ret.error);
+		report_prefix_popn(2);
+		return;
+	} else if (ret.value != SBI_EXT_HSM_STARTED) {
+		report_fail("current hart is not started (ret.value=%ld)", ret.value);
+		report_prefix_popn(2);
+		return;
+	}
+
+	report_pass("status of current hart is started");
+
+	report_prefix_pop();
+
+	if (max_cpu + 1 < 2) {
+		report_skip("no other cpus to run the remaining hsm tests on");
+		report_prefix_pop();
+		return;
+	}
+
+	/* This is necessary since we do not choose which cpu the boot hart will run on */
+	if (me > max_cpu)
+		max_cpu++;
+
+	report_prefix_push("hart_start");
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		/* Set opaque as hartid so that we can check a0 == a1, ensuring that a0 is hartid and a1 is opaque */
+		ret = sbi_hart_start(hartid, virt_to_phys(&sbi_hsm_check_hart_start), hartid);
+		if (ret.error) {
+			report_fail("failed to start test hart %ld (error=%ld)", hartid, ret.error);
+			report_prefix_popn(2);
+			return;
+		}
+	}
+
+	cpumask_clear(&hsm_start);
+	hsm_timer_setup();
+	timer_start(hsm_timer_duration);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STOPPED);
+		if (hsm_timer_fired)
+			break;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_START_PENDING);
+		if (hsm_timer_fired)
+			break;
+		ret = sbi_hart_get_status(hartid);
+		if (ret.error)
+			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+		else if (ret.value != SBI_EXT_HSM_STARTED)
+			report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
+		else
+			cpumask_set_cpu(cpu, &hsm_start);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts started");
+		report_prefix_popn(2);
+		return;
+	}
+
+	report(cpumask_weight(&hsm_start) == max_cpu, "all secondary harts started");
+
+	cpumask_clear(&hsm_check);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+
+		while (!(READ_ONCE(sbi_hsm_hart_start_checks[cpu]) & SBI_HSM_TEST_DONE) && !hsm_timer_fired)
+			cpu_relax();
+
+		if (hsm_timer_fired)
+			break;
+
+		if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_SATP))
+			report_info("satp is not zero for test hart %ld", hartid);
+		else if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_SIE))
+			report_info("sstatus.SIE is not zero for test hart %ld", hartid);
+		else if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_HARTID_A1))
+			report_info("either a0 or a1 is not hartid for test hart %ld", hartid);
+		else
+			cpumask_set_cpu(cpu, &hsm_check);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts are done with checks");
+		report_prefix_popn(2);
+		return;
+	}
+
+	timer_stop();
+
+	report(cpumask_weight(&hsm_check) == max_cpu,
+	       "all secondary harts have expected register values after hart start");
+
+	report_prefix_pop();
+
+	report_prefix_push("hart_stop");
+
+	memset(sbi_hsm_stop_hart, 1, sizeof(sbi_hsm_stop_hart));
+
+	cpumask_clear(&hsm_stop);
+	hsm_timer_fired = false;
+	timer_start(hsm_timer_duration);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
+		if (hsm_timer_fired)
+			break;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STOP_PENDING);
+		if (hsm_timer_fired)
+			break;
+		ret = sbi_hart_get_status(hartid);
+		if (ret.error)
+			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+		else if (ret.value != SBI_EXT_HSM_STOPPED)
+			report_info("hart %ld status is not 'stopped' (ret.value=%ld)", hartid, ret.value);
+		else
+			cpumask_set_cpu(cpu, &hsm_stop);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts stopped");
+		report_prefix_popn(2);
+		return;
+	}
+
+	timer_stop();
+
+	report(cpumask_weight(&hsm_stop) == max_cpu, "all secondary harts stopped");
+
+	/* Reset the stop flags so that we can reuse them after suspension tests */
+	memset(sbi_hsm_stop_hart, 0, sizeof(sbi_hsm_stop_hart));
+
+	report_prefix_pop();
+
+	report_prefix_push("hart_start");
+
+	on_cpumask_async(&secondary_cpus_mask, hart_execute, NULL);
+
+	cpumask_clear(&hsm_start);
+	hsm_timer_fired = false;
+	timer_start(hsm_timer_duration);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STOPPED);
+		if (hsm_timer_fired)
+			break;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_START_PENDING);
+		if (hsm_timer_fired)
+			break;
+		ret = sbi_hart_get_status(hartid);
+		if (ret.error)
+			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+		else if (ret.value != SBI_EXT_HSM_STARTED)
+			report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
+		else
+			cpumask_set_cpu(cpu, &hsm_start);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts started");
+		report_prefix_popn(2);
+		return;
+	}
+
+	report(cpumask_weight(&hsm_start) == max_cpu, "all secondary harts started");
+
+	while (cpumask_weight(&cpu_idle_mask) != max_cpu && !hsm_timer_fired)
+		cpu_relax();
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts started");
+		report_prefix_popn(2);
+		return;
+	}
+
+	timer_stop();
+
+	report(cpumask_weight(&cpu_idle_mask) == max_cpu,
+	       "all secondary harts successfully executed code after start");
+	report(cpumask_weight(&cpu_online_mask) == max_cpu + 1, "all secondary harts online");
+	report(cpumask_weight(&sbi_hsm_started_hart_checks) == max_cpu,
+	       "all secondary harts are already started");
+	report(cpumask_weight(&sbi_hsm_invalid_hartid_checks) == max_cpu,
+	       "all secondary harts refuse to start with invalid hartid");
+
+	report_prefix_pop();
+
+	report_prefix_push("hart_suspend");
+
+	if (!sbi_probe(SBI_EXT_IPI)) {
+		hsm_timer_teardown();
+		report_skip("skipping suspension tests since ipi extension is unavailable");
+		report_prefix_popn(2);
+		return;
+	}
+
+	on_cpumask_async(&secondary_cpus_mask, hart_retentive_suspend, NULL);
+
+	cpumask_clear(&hsm_suspend);
+	hsm_timer_fired = false;
+	timer_start(hsm_timer_duration);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
+		if (hsm_timer_fired)
+			break;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPEND_PENDING);
+		if (hsm_timer_fired)
+			break;
+		ret = sbi_hart_get_status(hartid);
+		if (ret.error)
+			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+		else if (ret.value != SBI_EXT_HSM_SUSPENDED)
+			report_info("hart %ld status is not 'suspended' (ret.value=%ld)", hartid, ret.value);
+		else
+			cpumask_set_cpu(cpu, &hsm_suspend);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts retentive suspended");
+		report_prefix_popn(2);
+		return;
+	}
+
+	timer_stop();
+
+	report(cpumask_weight(&hsm_suspend) == max_cpu, "all secondary harts retentive suspended");
+
+	ret = sbi_send_ipi_cpumask(&secondary_cpus_mask);
+
+	if (!ret.error) {
+		cpumask_clear(&hsm_resume);
+		hsm_timer_fired = false;
+		timer_start(hsm_timer_duration);
+
+		for_each_cpu(cpu, &secondary_cpus_mask) {
+			hartid = cpus[cpu].hartid;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPENDED);
+			if (hsm_timer_fired)
+				break;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_RESUME_PENDING);
+			if (hsm_timer_fired)
+				break;
+			ret = sbi_hart_get_status(hartid);
+			if (ret.error)
+				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+			else if (ret.value != SBI_EXT_HSM_STARTED)
+				report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
+			else
+				cpumask_set_cpu(cpu, &hsm_resume);
+		}
+
+		if (hsm_timer_fired) {
+			hsm_timer_teardown();
+			report_fail("hsm timer fired before all secondary harts retentive resumed");
+			report_prefix_popn(2);
+			return;
+		}
+
+		report(cpumask_weight(&hsm_resume) == max_cpu, "all secondary harts retentive resumed");
+
+		while (cpumask_weight(&cpu_idle_mask) != max_cpu && !hsm_timer_fired)
+			cpu_relax();
+
+		if (hsm_timer_fired) {
+			hsm_timer_teardown();
+			report_fail("hsm timer fired before all secondary harts retentive resumed");
+			report_prefix_popn(2);
+			return;
+		}
+
+		timer_stop();
+
+		report(cpumask_weight(&cpu_idle_mask) == max_cpu,
+		       "all secondary harts successfully executed code after retentive suspend");
+		report(cpumask_weight(&cpu_online_mask) == max_cpu + 1,
+		       "all secondary harts online");
+	}
+
+	on_cpumask_async(&secondary_cpus_mask, hart_non_retentive_suspend, NULL);
+
+	cpumask_clear(&hsm_suspend);
+	hsm_timer_fired = false;
+	timer_start(hsm_timer_duration);
+
+	for_each_cpu(cpu, &secondary_cpus_mask) {
+		hartid = cpus[cpu].hartid;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
+		if (hsm_timer_fired)
+			break;
+		hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPEND_PENDING);
+		if (hsm_timer_fired)
+			break;
+		ret = sbi_hart_get_status(hartid);
+		if (ret.error)
+			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+		else if (ret.value != SBI_EXT_HSM_SUSPENDED)
+			report_info("hart %ld status is not 'suspended' (ret.value=%ld)", hartid, ret.value);
+		else
+			cpumask_set_cpu(cpu, &hsm_suspend);
+	}
+
+	if (hsm_timer_fired) {
+		hsm_timer_teardown();
+		report_fail("hsm timer fired before all secondary harts non-retentive suspended");
+		report_prefix_popn(2);
+		return;
+	}
+
+	timer_stop();
+
+	report(cpumask_weight(&hsm_suspend) == max_cpu, "all secondary harts non-retentive suspended");
+
+	ret = sbi_send_ipi_cpumask(&secondary_cpus_mask);
+
+	if (!ret.error) {
+		cpumask_clear(&hsm_resume);
+		hsm_timer_fired = false;
+		timer_start(hsm_timer_duration);
+
+		for_each_cpu(cpu, &secondary_cpus_mask) {
+			hartid = cpus[cpu].hartid;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPENDED);
+			if (hsm_timer_fired)
+				break;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_RESUME_PENDING);
+			if (hsm_timer_fired)
+				break;
+			ret = sbi_hart_get_status(hartid);
+			if (ret.error)
+				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+			else if (ret.value != SBI_EXT_HSM_STARTED)
+				report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
+			else
+				cpumask_set_cpu(cpu, &hsm_resume);
+		}
+
+		if (hsm_timer_fired) {
+			hsm_timer_teardown();
+			report_fail("hsm timer fired before all secondary harts non-retentive resumed");
+			report_prefix_popn(2);
+			return;
+		}
+
+		report(cpumask_weight(&hsm_resume) == max_cpu, "all secondary harts non-retentive resumed");
+
+		cpumask_clear(&hsm_check);
+
+		for_each_cpu(cpu, &secondary_cpus_mask) {
+			hartid = cpus[cpu].hartid;
+
+			while (!((READ_ONCE(sbi_hsm_non_retentive_hart_suspend_checks[cpu])) & SBI_HSM_TEST_DONE)
+			       && !hsm_timer_fired)
+				cpu_relax();
+
+			if (hsm_timer_fired)
+				break;
+
+			if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_SATP))
+				report_info("satp is not zero for test hart %ld", hartid);
+			else if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_SIE))
+				report_info("sstatus.SIE is not zero for test hart %ld", hartid);
+			else if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_HARTID_A1))
+				report_info("either a0 or a1 is not hartid for test hart %ld", hartid);
+			else
+				cpumask_set_cpu(cpu, &hsm_check);
+		}
+
+		if (hsm_timer_fired) {
+			hsm_timer_teardown();
+			report_fail("hsm timer fired before all secondary harts are done with checks");
+			report_prefix_popn(2);
+			return;
+		}
+
+		timer_stop();
+
+		report(cpumask_weight(&hsm_check) == max_cpu,
+		       "all secondary harts have expected register values after non-retentive resume");
+
+		report_prefix_pop();
+
+		report_prefix_push("hart_stop");
+
+		memset(sbi_hsm_stop_hart, 1, sizeof(sbi_hsm_stop_hart));
+
+		cpumask_clear(&hsm_stop);
+		hsm_timer_fired = false;
+		timer_start(hsm_timer_duration);
+
+		for_each_cpu(cpu, &secondary_cpus_mask) {
+			hartid = cpus[cpu].hartid;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
+			if (hsm_timer_fired)
+				break;
+			hart_wait_on_status(hartid, SBI_EXT_HSM_STOP_PENDING);
+			if (hsm_timer_fired)
+				break;
+			ret = sbi_hart_get_status(hartid);
+			if (ret.error)
+				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
+			else if (ret.value != SBI_EXT_HSM_STOPPED)
+				report_info("hart %ld status is not 'stopped' (ret.value=%ld)", hartid, ret.value);
+			else
+				cpumask_set_cpu(cpu, &hsm_stop);
+		}
+
+		if (hsm_timer_fired) {
+			hsm_timer_teardown();
+			report_fail("hsm timer fired before all secondary harts stopped after resumption");
+			report_prefix_popn(2);
+			return;
+		}
+
+		timer_stop();
+
+		report(cpumask_weight(&hsm_stop) == max_cpu, "all secondary harts stopped after resumption");
+	}
+
+	hsm_timer_teardown();
+	report_prefix_popn(2);
+}
+
 int main(int argc, char **argv)
 {
 	if (argc > 1 && !strcmp(argv[1], "-h")) {
@@ -444,6 +1004,7 @@ int main(int argc, char **argv)
 	report_prefix_push("sbi");
 	check_base();
 	check_time();
+	check_hsm();
 	check_dbcn();
 
 	return report_summary();
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly James Raphael Tiovalen
@ 2024-09-16  7:19   ` Andrew Jones
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Jones @ 2024-09-16  7:19 UTC (permalink / raw)
  To: James Raphael Tiovalen; +Cc: kvm, kvm-riscv, atishp, cade.richard

On Mon, Sep 16, 2024 at 02:34:57AM GMT, James Raphael Tiovalen wrote:
> From: Andrew Jones <andrew.jones@linux.dev>
> 
> Some SBI HSM tests run without a stack being setup so they can't
> run C code. Those tests still need to know the corresponding cpuid
> for the hartid on which they are running. Give those tests
> hartid_to_cpu() by reimplementing it in assembly.
> 
> Signed-off-by: Andrew Jones <andrew.jones@linux.dev>

This should also include your sign-off since you're posting it and it's
for your series. Passing unmodified patches on that you're comfortable
passing on corresponds to (c) of [1]

[1] https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#developer-s-certificate-of-origin-1-1

> ---
>  lib/riscv/asm-offsets.c |  5 +++++
>  lib/riscv/setup.c       | 10 ----------
>  riscv/cstart.S          | 23 +++++++++++++++++++++++
>  3 files changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c
> index a2a32438..6c511c14 100644
> --- a/lib/riscv/asm-offsets.c
> +++ b/lib/riscv/asm-offsets.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  #include <kbuild.h>
>  #include <elf.h>
> +#include <asm/processor.h>
>  #include <asm/ptrace.h>
>  #include <asm/smp.h>
>  
> @@ -58,5 +59,9 @@ int main(void)
>  	OFFSET(SECONDARY_FUNC, secondary_data, func);
>  	DEFINE(SECONDARY_DATA_SIZE, sizeof(struct secondary_data));
>  
> +	OFFSET(THREAD_INFO_CPU, thread_info, cpu);
> +	OFFSET(THREAD_INFO_HARTID, thread_info, hartid);
> +	DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info));
> +
>  	return 0;
>  }
> diff --git a/lib/riscv/setup.c b/lib/riscv/setup.c
> index 495db041..f347ad63 100644
> --- a/lib/riscv/setup.c
> +++ b/lib/riscv/setup.c
> @@ -43,16 +43,6 @@ uint64_t timebase_frequency;
>  
>  static struct mem_region riscv_mem_regions[NR_MEM_REGIONS + 1];
>  
> -int hartid_to_cpu(unsigned long hartid)
> -{
> -	int cpu;
> -
> -	for_each_present_cpu(cpu)
> -		if (cpus[cpu].hartid == hartid)
> -			return cpu;
> -	return -1;
> -}
> -
>  static void cpu_set_fdt(int fdtnode __unused, u64 regval, void *info __unused)
>  {
>  	int cpu = nr_cpus++;
> diff --git a/riscv/cstart.S b/riscv/cstart.S
> index 8f269997..6784d5e1 100644
> --- a/riscv/cstart.S
> +++ b/riscv/cstart.S
> @@ -109,6 +109,29 @@ halt:
>  1:	wfi
>  	j	1b
>  
> +/*
> + * hartid_to_cpu
> + *   a0 is a hartid on entry
> + * returns the corresponding cpuid in a0

I should have capitalized the 'r' in 'returns' and also written
"or -1 if no thread-info struct with 'hartid' is found."

> + */
> +.balign 4
> +.global hartid_to_cpu
> +hartid_to_cpu:
> +	la	t0, cpus
> +	la	t1, nr_cpus
> +	lw	t1, 0(t1)
> +	li	t2, 0
> +1:	bne	t2, t1, 2f
> +	li	a0, -1
> +	ret
> +2:	REG_L	t3, THREAD_INFO_HARTID(t0)
> +	bne	a0, t3, 3f
> +	lw	a0, THREAD_INFO_CPU(t0)
> +	ret
> +3:	addi	t0, t0, THREAD_INFO_SIZE
> +	addi	t2, t2, 1
> +	j	1b
> +
>  .balign 4
>  .global secondary_entry
>  secondary_entry:
> -- 
> 2.43.0
>

I'll fixup the comment and add your sign-off when applying.

Thanks,
drew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension James Raphael Tiovalen
@ 2024-09-16  9:23   ` Andrew Jones
  2024-09-16 10:49   ` Andrew Jones
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Jones @ 2024-09-16  9:23 UTC (permalink / raw)
  To: James Raphael Tiovalen; +Cc: kvm, kvm-riscv, atishp, cade.richard

On Mon, Sep 16, 2024 at 02:34:59AM GMT, James Raphael Tiovalen wrote:
> Add some tests for all of the HSM extension functions. These tests
> ensure that the HSM extension functions follow the behavior as described
> in the SBI specification.
> 
> Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
> ---
>  riscv/sbi.h |  10 +
>  riscv/sbi.c | 561 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 571 insertions(+)
>  create mode 100644 riscv/sbi.h
> 
> diff --git a/riscv/sbi.h b/riscv/sbi.h
> new file mode 100644
> index 00000000..e8625cb1
> --- /dev/null
> +++ b/riscv/sbi.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef _RISCV_SBI_H_
> +#define _RISCV_SBI_H_
> +
> +#define SBI_HSM_TEST_DONE	(1 << 0)
> +#define SBI_HSM_TEST_SATP	(1 << 1)
> +#define SBI_HSM_TEST_SIE	(1 << 2)
> +#define SBI_HSM_TEST_HARTID_A1	(1 << 3)
> +
> +#endif /* _RISCV_SBI_H_ */

The last patch creates riscv/sbi-tests.h for this, so we don't want this
file.

> diff --git a/riscv/sbi.c b/riscv/sbi.c
> index d4dfd48e..fab0091b 100644
> --- a/riscv/sbi.c
> +++ b/riscv/sbi.c
> @@ -6,6 +6,8 @@
>   */
>  #include <libcflat.h>
>  #include <alloc_page.h>
> +#include <cpumask.h>
> +#include <on-cpus.h>
>  #include <stdlib.h>
>  #include <string.h>
>  #include <limits.h>
> @@ -16,11 +18,13 @@
>  #include <asm/delay.h>
>  #include <asm/io.h>
>  #include <asm/mmu.h>
> +#include <asm/page.h>
>  #include <asm/processor.h>
>  #include <asm/sbi.h>
>  #include <asm/setup.h>
>  #include <asm/smp.h>
>  #include <asm/timer.h>
> +#include <sbi.h>

This should be

 #include "sbi-tests.h"

and have a blank line between the last <> include and it. See how it's
included in sbi-asm.S.

(I renamed it to sbi-tests.h and use "" since I want it to be clear that
it isn't part of the library. Headers included like <sbi.h> would normally
be in lib.)

>  
>  #define	HIGH_ADDR_BOUNDARY	((phys_addr_t)1 << 32)
>  
> @@ -47,6 +51,11 @@ static struct sbiret sbi_dbcn_write_byte(uint8_t byte)
>  	return sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, byte, 0, 0, 0, 0, 0);
>  }
>  
> +static struct sbiret sbi_hart_suspend(uint32_t suspend_type, unsigned long resume_addr, unsigned long opaque)
> +{
> +	return sbi_ecall(SBI_EXT_HSM, SBI_EXT_HSM_HART_SUSPEND, suspend_type, resume_addr, opaque, 0, 0, 0);
> +}
> +
>  static void split_phys_addr(phys_addr_t paddr, unsigned long *hi, unsigned long *lo)
>  {
>  	*lo = (unsigned long)paddr;
> @@ -434,6 +443,557 @@ static void check_dbcn(void)
>  	report_prefix_popn(2);
>  }
>  
> +unsigned char sbi_hsm_stop_hart[NR_CPUS];
> +unsigned char sbi_hsm_hart_start_checks[NR_CPUS];
> +unsigned char sbi_hsm_non_retentive_hart_suspend_checks[NR_CPUS];

These were added in the first patch of this series. So they must be
getting added again somehow?

> +cpumask_t sbi_hsm_started_hart_checks;
> +cpumask_t sbi_hsm_invalid_hartid_checks;
> +static bool hsm_timer_fired;
> +extern void sbi_hsm_check_hart_start(void);
> +extern void sbi_hsm_check_non_retentive_suspend(void);
> +
> +static void hsm_timer_irq_handler(struct pt_regs *regs)
> +{
> +	hsm_timer_fired = true;
> +	timer_stop();
> +}
> +
> +static void hsm_timer_setup(void)
> +{
> +	install_irq_handler(IRQ_S_TIMER, hsm_timer_irq_handler);
> +	local_irq_enable();
> +	timer_irq_enable();
> +}
> +
> +static void hsm_timer_teardown(void)
> +{
> +	timer_irq_disable();
> +	local_irq_disable();
> +	install_irq_handler(IRQ_S_TIMER, NULL);
> +}
> +
> +static void hart_empty_fn(void *data) {}
> +
> +static void hart_execute(void *data)
> +{
> +	struct sbiret ret;
> +	unsigned long hartid = current_thread_info()->hartid;
> +	int me = smp_processor_id();
> +
> +	ret = sbi_hart_start(hartid, virt_to_phys(&hart_empty_fn), 0);
> +
> +	if (ret.error == SBI_ERR_ALREADY_AVAILABLE)
> +		cpumask_set_cpu(me, &sbi_hsm_started_hart_checks);
> +
> +	ret = sbi_hart_start(ULONG_MAX, virt_to_phys(&hart_empty_fn), 0);
> +
> +	if (ret.error == SBI_ERR_INVALID_PARAM)
> +		cpumask_set_cpu(me, &sbi_hsm_invalid_hartid_checks);

This is a good tests, but I'm not sure why we give the 'me' hart credit
for it since we weren't using the 'me' hart's hartid in the test. The
invalid hartid test should just be an independent test.

> +}
> +
> +static void hart_retentive_suspend(void *data)
> +{
> +	unsigned long hartid = current_thread_info()->hartid;
> +	struct sbiret ret = sbi_hart_suspend(SBI_EXT_HSM_HART_SUSPEND_RETENTIVE, 0, 0);
> +
> +	if (ret.error)
> +		report_fail("failed to retentive suspend hart %ld (error=%ld)", hartid, ret.error);
> +}
> +
> +static void hart_non_retentive_suspend(void *data)
> +{
> +	unsigned long hartid = current_thread_info()->hartid;
> +
> +	/* Set opaque as hartid so that we can check a0 == a1, ensuring that a0 is hartid and a1 is opaque */
> +	struct sbiret ret = sbi_hart_suspend(SBI_EXT_HSM_HART_SUSPEND_NON_RETENTIVE,
> +					     virt_to_phys(&sbi_hsm_check_non_retentive_suspend), hartid);
> +
> +	report_fail("failed to non-retentive suspend hart %ld (error=%ld)", hartid, ret.error);
> +}
> +
> +static void hart_wait_on_status(unsigned long hartid, enum sbi_ext_hsm_sid status)
> +{
> +	struct sbiret ret = sbi_hart_get_status(hartid);
> +
> +	while (!ret.error && ret.value == status && !hsm_timer_fired) {
> +		cpu_relax();
> +		ret = sbi_hart_get_status(hartid);
> +	}
> +
> +	if (hsm_timer_fired)
> +		report_info("timer fired while waiting on status %u for hart %ld", status, hartid);
> +	else if (ret.error)
> +		report_fail("got %ld while waiting on status %u for hart %ld\n", ret.error, status, hartid);
> +}
> +
> +static void check_hsm(void)
> +{
> +	struct sbiret ret;
> +	unsigned long hartid;
> +	cpumask_t secondary_cpus_mask, hsm_start, hsm_stop, hsm_suspend, hsm_resume, hsm_check;
> +	int cpu, me = smp_processor_id();
> +	int max_cpu = getenv("SBI_HSM_MAX_CPU") ? strtol(getenv("SBI_HSM_MAX_CPU"), NULL, 0) : INT_MAX;

I think it'll be better if this is named SBI_MAX_CPUS and is the maximum
_number_ of cpus that the SBI implementation supports.

And max_cpus should default to nr_cpus, so we should have

 int max_cpus = getenv("SBI_MAX_CPUS") ? strtol(getenv("SBI_MAX_CPUS"), NULL, 0) : nr_cpus;


> +	unsigned long hsm_timer_duration = getenv("SBI_HSM_TIMER_DURATION")
> +					 ? strtol(getenv("SBI_HSM_TIMER_DURATION"), NULL, 0) : 200000;
> +
> +	max_cpu = MIN(max_cpu, nr_cpus - 1);

And here we would have

	max_cpus = MIN(max_cpus, nr_cpus);

and if we still need 'max_cpu' (the highest cpu id), like we do below,
then we'll get it with 'max_cpus - 1'.

> +
> +	cpumask_copy(&secondary_cpus_mask, &cpu_present_mask);
> +	cpumask_clear_cpu(me, &secondary_cpus_mask);
> +	for_each_cpu(cpu, &secondary_cpus_mask)
> +		if (cpu > max_cpu)

 if (cpu > max_cpus - 1)

> +			cpumask_clear_cpu(cpu, &secondary_cpus_mask);
> +
> +	report_prefix_push("hsm");
> +
> +	if (!sbi_probe(SBI_EXT_HSM)) {
> +		report_skip("hsm extension not available");
> +		report_prefix_pop();
> +		return;
> +	}
> +
> +	report_prefix_push("hart_get_status");
> +
> +	hartid = current_thread_info()->hartid;
> +	ret = sbi_hart_get_status(hartid);
> +
> +	if (ret.error) {
> +		report_fail("failed to get status of current hart (error=%ld)", ret.error);
> +		report_prefix_popn(2);
> +		return;
> +	} else if (ret.value != SBI_EXT_HSM_STARTED) {
> +		report_fail("current hart is not started (ret.value=%ld)", ret.value);
> +		report_prefix_popn(2);
> +		return;
> +	}
> +
> +	report_pass("status of current hart is started");
> +
> +	report_prefix_pop();
> +
> +	if (max_cpu + 1 < 2) {

 if (max_cpus < 2)

> +		report_skip("no other cpus to run the remaining hsm tests on");
> +		report_prefix_pop();
> +		return;
> +	}
> +
> +	/* This is necessary since we do not choose which cpu the boot hart will run on */
> +	if (me > max_cpu)
> +		max_cpu++;

Hmm, I see an issue here. If a DT has its CPU nodes in an order where
nodes earlier in the list include hartids which are not activated by the
SBI implementation due to the SBI implementation's limits on the number
of harts it can activate (because the SBI implementation chose to activate
harts in a particular hartid order rather than by DT cpu node order), then
simply clamping nr_cpus down to SBI max_cpus will not work. We need to
instead run the cpu_present_mask through a filter to remove all the cpus
corresponding to hartids which the SBI implementation has not activated.
This is something we should do in setup(), so I just wrote a patch for it
now [1]. With that patch the 

  for_each_cpu(cpu, &secondary_cpus_mask)
     if (cpu > max_cpu)
          cpumask_clear_cpu(cpu, &secondary_cpus_mask);

loop above should be removed. The

    cpumask_copy(&secondary_cpus_mask, &cpu_present_mask);
    cpumask_clear_cpu(me, &secondary_cpus_mask);

is sufficient for getting a valid mask of cpus that SBI manages. And of
course no need for this

 if (me > max_cpu)
     max_cpu++;

which was insufficient anyway since 'me' doesn't necessarily have to be
just one more than the expected max cpuid.

[1] https://gitlab.com/jones-drew/kvm-unit-tests/-/commit/5fc2c49c6ee37ab725dc0f7a0f6858c17705604f

> +
> +	report_prefix_push("hart_start");
> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +		/* Set opaque as hartid so that we can check a0 == a1, ensuring that a0 is hartid and a1 is opaque */
> +		ret = sbi_hart_start(hartid, virt_to_phys(&sbi_hsm_check_hart_start), hartid);
> +		if (ret.error) {
> +			report_fail("failed to start test hart %ld (error=%ld)", hartid, ret.error);
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all tests if one hart fails to start?

> +		}

We should move the sbi_hart_start call into the loop below so we can time
each start/wait individually.

> +	}
> +
> +	cpumask_clear(&hsm_start);
> +	hsm_timer_setup();
> +	timer_start(hsm_timer_duration);

Duration should be small, so we don't want to put a timer_start outside
the loop. It should instead be...

> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;

...here and...

> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STOPPED);

...here should be a timer_stop().

> +		if (hsm_timer_fired)
> +			break;

Other timer_start here

> +		hart_wait_on_status(hartid, SBI_EXT_HSM_START_PENDING);

and stop here. Actually we can put timer_start/stop in
hart_wait_on_status().

> +		if (hsm_timer_fired)
> +			break;
> +		ret = sbi_hart_get_status(hartid);
> +		if (ret.error)
> +			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +		else if (ret.value != SBI_EXT_HSM_STARTED)
> +			report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_start);
> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts started");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	report(cpumask_weight(&hsm_start) == max_cpu, "all secondary harts started");
> +
> +	cpumask_clear(&hsm_check);
> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +

timer_start

> +		while (!(READ_ONCE(sbi_hsm_hart_start_checks[cpu]) & SBI_HSM_TEST_DONE) && !hsm_timer_fired)
> +			cpu_relax();

timer_stop

> +
> +		if (hsm_timer_fired)
> +			break;
> +
> +		if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_SATP))
> +			report_info("satp is not zero for test hart %ld", hartid);
> +		else if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_SIE))
> +			report_info("sstatus.SIE is not zero for test hart %ld", hartid);
> +		else if (!(sbi_hsm_hart_start_checks[cpu] & SBI_HSM_TEST_HARTID_A1))
> +			report_info("either a0 or a1 is not hartid for test hart %ld", hartid);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_check);

The body of this loop can also be merged with the loop above. There should
be only one loop that does, start, wait, check for each hart allowing us
to time the wait for each hart individually. One the hart the fails to
start within duration will fail its start test and we try each to see if
others fail or not.

After the loop completes we can output all the

 report(cpumask_weight(...

checks.

> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts are done with checks");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	timer_stop();
> +
> +	report(cpumask_weight(&hsm_check) == max_cpu,
> +	       "all secondary harts have expected register values after hart start");
> +
> +	report_prefix_pop();
> +
> +	report_prefix_push("hart_stop");
> +
> +	memset(sbi_hsm_stop_hart, 1, sizeof(sbi_hsm_stop_hart));
> +
> +	cpumask_clear(&hsm_stop);
> +	hsm_timer_fired = false;
> +	timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status.

> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
> +		if (hsm_timer_fired)
> +			break;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STOP_PENDING);
> +		if (hsm_timer_fired)
> +			break;
> +		ret = sbi_hart_get_status(hartid);
> +		if (ret.error)
> +			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +		else if (ret.value != SBI_EXT_HSM_STOPPED)
> +			report_info("hart %ld status is not 'stopped' (ret.value=%ld)", hartid, ret.value);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_stop);
> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts stopped");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	timer_stop();
> +
> +	report(cpumask_weight(&hsm_stop) == max_cpu, "all secondary harts stopped");
> +
> +	/* Reset the stop flags so that we can reuse them after suspension tests */
> +	memset(sbi_hsm_stop_hart, 0, sizeof(sbi_hsm_stop_hart));
> +
> +	report_prefix_pop();
> +
> +	report_prefix_push("hart_start");
> +
> +	on_cpumask_async(&secondary_cpus_mask, hart_execute, NULL);
> +
> +	cpumask_clear(&hsm_start);
> +	hsm_timer_fired = false;
> +	timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status

> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STOPPED);
> +		if (hsm_timer_fired)
> +			break;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_START_PENDING);
> +		if (hsm_timer_fired)
> +			break;
> +		ret = sbi_hart_get_status(hartid);
> +		if (ret.error)
> +			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +		else if (ret.value != SBI_EXT_HSM_STARTED)
> +			report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_start);
> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts started");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	report(cpumask_weight(&hsm_start) == max_cpu, "all secondary harts started");
> +

timer_start

> +	while (cpumask_weight(&cpu_idle_mask) != max_cpu && !hsm_timer_fired)
> +		cpu_relax();

timer_stop

> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts started");
> +		report_prefix_popn(2);
> +		return;
> +	}
> +
> +	timer_stop();
> +
> +	report(cpumask_weight(&cpu_idle_mask) == max_cpu,
> +	       "all secondary harts successfully executed code after start");
> +	report(cpumask_weight(&cpu_online_mask) == max_cpu + 1, "all secondary harts online");
> +	report(cpumask_weight(&sbi_hsm_started_hart_checks) == max_cpu,
> +	       "all secondary harts are already started");
> +	report(cpumask_weight(&sbi_hsm_invalid_hartid_checks) == max_cpu,
> +	       "all secondary harts refuse to start with invalid hartid");
> +
> +	report_prefix_pop();
> +
> +	report_prefix_push("hart_suspend");
> +
> +	if (!sbi_probe(SBI_EXT_IPI)) {
> +		hsm_timer_teardown();
> +		report_skip("skipping suspension tests since ipi extension is unavailable");
> +		report_prefix_popn(2);
> +		return;
> +	}
> +
> +	on_cpumask_async(&secondary_cpus_mask, hart_retentive_suspend, NULL);
> +
> +	cpumask_clear(&hsm_suspend);
> +	hsm_timer_fired = false;
> +	timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status

> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
> +		if (hsm_timer_fired)
> +			break;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPEND_PENDING);
> +		if (hsm_timer_fired)
> +			break;
> +		ret = sbi_hart_get_status(hartid);
> +		if (ret.error)
> +			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +		else if (ret.value != SBI_EXT_HSM_SUSPENDED)
> +			report_info("hart %ld status is not 'suspended' (ret.value=%ld)", hartid, ret.value);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_suspend);
> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts retentive suspended");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	timer_stop();
> +
> +	report(cpumask_weight(&hsm_suspend) == max_cpu, "all secondary harts retentive suspended");
> +
> +	ret = sbi_send_ipi_cpumask(&secondary_cpus_mask);
> +
> +	if (!ret.error) {
> +		cpumask_clear(&hsm_resume);
> +		hsm_timer_fired = false;
> +		timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status

> +
> +		for_each_cpu(cpu, &secondary_cpus_mask) {
> +			hartid = cpus[cpu].hartid;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPENDED);
> +			if (hsm_timer_fired)
> +				break;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_RESUME_PENDING);
> +			if (hsm_timer_fired)
> +				break;
> +			ret = sbi_hart_get_status(hartid);
> +			if (ret.error)
> +				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +			else if (ret.value != SBI_EXT_HSM_STARTED)
> +				report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
> +			else
> +				cpumask_set_cpu(cpu, &hsm_resume);
> +		}
> +
> +		if (hsm_timer_fired) {
> +			hsm_timer_teardown();
> +			report_fail("hsm timer fired before all secondary harts retentive resumed");
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all test if some harts fail?

> +		}
> +
> +		report(cpumask_weight(&hsm_resume) == max_cpu, "all secondary harts retentive resumed");
> +

timer_start

> +		while (cpumask_weight(&cpu_idle_mask) != max_cpu && !hsm_timer_fired)
> +			cpu_relax();

timer_stop

> +
> +		if (hsm_timer_fired) {
> +			hsm_timer_teardown();
> +			report_fail("hsm timer fired before all secondary harts retentive resumed");
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all test if some harts fail?

> +		}
> +
> +		timer_stop();
> +
> +		report(cpumask_weight(&cpu_idle_mask) == max_cpu,
> +		       "all secondary harts successfully executed code after retentive suspend");
> +		report(cpumask_weight(&cpu_online_mask) == max_cpu + 1,
> +		       "all secondary harts online");
> +	}
> +
> +	on_cpumask_async(&secondary_cpus_mask, hart_non_retentive_suspend, NULL);
> +
> +	cpumask_clear(&hsm_suspend);
> +	hsm_timer_fired = false;

move timer_start/top into hart_wait_on_status

> +	timer_start(hsm_timer_duration);
> +
> +	for_each_cpu(cpu, &secondary_cpus_mask) {
> +		hartid = cpus[cpu].hartid;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
> +		if (hsm_timer_fired)
> +			break;
> +		hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPEND_PENDING);
> +		if (hsm_timer_fired)
> +			break;
> +		ret = sbi_hart_get_status(hartid);
> +		if (ret.error)
> +			report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +		else if (ret.value != SBI_EXT_HSM_SUSPENDED)
> +			report_info("hart %ld status is not 'suspended' (ret.value=%ld)", hartid, ret.value);
> +		else
> +			cpumask_set_cpu(cpu, &hsm_suspend);
> +	}
> +
> +	if (hsm_timer_fired) {
> +		hsm_timer_teardown();
> +		report_fail("hsm timer fired before all secondary harts non-retentive suspended");
> +		report_prefix_popn(2);
> +		return;

Do we need to give up on all test if some harts fail?

> +	}
> +
> +	timer_stop();
> +
> +	report(cpumask_weight(&hsm_suspend) == max_cpu, "all secondary harts non-retentive suspended");
> +
> +	ret = sbi_send_ipi_cpumask(&secondary_cpus_mask);
> +
> +	if (!ret.error) {
> +		cpumask_clear(&hsm_resume);
> +		hsm_timer_fired = false;
> +		timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status

> +
> +		for_each_cpu(cpu, &secondary_cpus_mask) {
> +			hartid = cpus[cpu].hartid;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_SUSPENDED);
> +			if (hsm_timer_fired)
> +				break;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_RESUME_PENDING);
> +			if (hsm_timer_fired)
> +				break;
> +			ret = sbi_hart_get_status(hartid);
> +			if (ret.error)
> +				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +			else if (ret.value != SBI_EXT_HSM_STARTED)
> +				report_info("hart %ld status is not 'started' (ret.value=%ld)", hartid, ret.value);
> +			else
> +				cpumask_set_cpu(cpu, &hsm_resume);
> +		}
> +
> +		if (hsm_timer_fired) {
> +			hsm_timer_teardown();
> +			report_fail("hsm timer fired before all secondary harts non-retentive resumed");
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all test if some harts fail?

> +		}
> +
> +		report(cpumask_weight(&hsm_resume) == max_cpu, "all secondary harts non-retentive resumed");
> +
> +		cpumask_clear(&hsm_check);
> +
> +		for_each_cpu(cpu, &secondary_cpus_mask) {
> +			hartid = cpus[cpu].hartid;
> +

timer_start

> +			while (!((READ_ONCE(sbi_hsm_non_retentive_hart_suspend_checks[cpu])) & SBI_HSM_TEST_DONE)
> +			       && !hsm_timer_fired)
> +				cpu_relax();

timer_stop

> +
> +			if (hsm_timer_fired)
> +				break;
> +
> +			if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_SATP))
> +				report_info("satp is not zero for test hart %ld", hartid);
> +			else if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_SIE))
> +				report_info("sstatus.SIE is not zero for test hart %ld", hartid);
> +			else if (!(sbi_hsm_non_retentive_hart_suspend_checks[cpu] & SBI_HSM_TEST_HARTID_A1))
> +				report_info("either a0 or a1 is not hartid for test hart %ld", hartid);
> +			else
> +				cpumask_set_cpu(cpu, &hsm_check);
> +		}
> +
> +		if (hsm_timer_fired) {
> +			hsm_timer_teardown();
> +			report_fail("hsm timer fired before all secondary harts are done with checks");
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all test if some harts fail?

> +		}
> +
> +		timer_stop();
> +
> +		report(cpumask_weight(&hsm_check) == max_cpu,
> +		       "all secondary harts have expected register values after non-retentive resume");
> +
> +		report_prefix_pop();
> +
> +		report_prefix_push("hart_stop");
> +
> +		memset(sbi_hsm_stop_hart, 1, sizeof(sbi_hsm_stop_hart));
> +
> +		cpumask_clear(&hsm_stop);
> +		hsm_timer_fired = false;
> +		timer_start(hsm_timer_duration);

move timer_start/top into hart_wait_on_status

> +
> +		for_each_cpu(cpu, &secondary_cpus_mask) {
> +			hartid = cpus[cpu].hartid;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_STARTED);
> +			if (hsm_timer_fired)
> +				break;
> +			hart_wait_on_status(hartid, SBI_EXT_HSM_STOP_PENDING);
> +			if (hsm_timer_fired)
> +				break;
> +			ret = sbi_hart_get_status(hartid);
> +			if (ret.error)
> +				report_info("hart %ld get status failed (error=%ld)", hartid, ret.error);
> +			else if (ret.value != SBI_EXT_HSM_STOPPED)
> +				report_info("hart %ld status is not 'stopped' (ret.value=%ld)", hartid, ret.value);
> +			else
> +				cpumask_set_cpu(cpu, &hsm_stop);
> +		}
> +
> +		if (hsm_timer_fired) {
> +			hsm_timer_teardown();
> +			report_fail("hsm timer fired before all secondary harts stopped after resumption");
> +			report_prefix_popn(2);
> +			return;

Do we need to give up on all test if some harts fail?

> +		}
> +
> +		timer_stop();
> +
> +		report(cpumask_weight(&hsm_stop) == max_cpu, "all secondary harts stopped after resumption");
> +	}
> +
> +	hsm_timer_teardown();
> +	report_prefix_popn(2);
> +}
> +
>  int main(int argc, char **argv)
>  {
>  	if (argc > 1 && !strcmp(argv[1], "-h")) {
> @@ -444,6 +1004,7 @@ int main(int argc, char **argv)
>  	report_prefix_push("sbi");
>  	check_base();
>  	check_time();
> +	check_hsm();
>  	check_dbcn();
>  
>  	return report_summary();
> -- 
> 2.43.0
> 
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension
  2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension James Raphael Tiovalen
  2024-09-16  9:23   ` Andrew Jones
@ 2024-09-16 10:49   ` Andrew Jones
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Jones @ 2024-09-16 10:49 UTC (permalink / raw)
  To: James Raphael Tiovalen; +Cc: kvm, kvm-riscv, atishp, cade.richard

On Mon, Sep 16, 2024 at 02:34:59AM GMT, James Raphael Tiovalen wrote:
...
> +static struct sbiret sbi_hart_suspend(uint32_t suspend_type, unsigned long resume_addr, unsigned long opaque)
> +{
> +	return sbi_ecall(SBI_EXT_HSM, SBI_EXT_HSM_HART_SUSPEND, suspend_type, resume_addr, opaque, 0, 0, 0);
> +}
> +

It's good to create these wrappers per the spec, i.e. use the parameter
name types from the spec to define the wrapper, but we should also
remember that if, like in this case, we force a parameter to be of a
certain type that we also add tests for when we use input outside the
expected range by using a "raw" ecall. See the last test of the DBCN
extension where we test a byte write with a word full of data to
ensure the SBI call does the right thing. We have to call sbi_ecall()
instead of sbi_dbcn_write_byte() to do that because sbi_dbcn_write_byte()
has been written to only accept uint8_t input.

So we'll need some sort of test for suspend where we pass a value
with high bits set for suspend_type when running the test on rv64.

Thanks,
drew

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-09-16 10:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-15 18:34 [kvm-unit-tests PATCH v4 0/3] riscv: sbi: Add support to test HSM extension James Raphael Tiovalen
2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 1/3] riscv: Rewrite hartid_to_cpu in assembly James Raphael Tiovalen
2024-09-16  7:19   ` Andrew Jones
2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 2/3] riscv: sbi: Provide entry point for HSM tests James Raphael Tiovalen
2024-09-15 18:34 ` [kvm-unit-tests PATCH v4 3/3] riscv: sbi: Add tests for HSM extension James Raphael Tiovalen
2024-09-16  9:23   ` Andrew Jones
2024-09-16 10:49   ` Andrew Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).