* [PULL v2 01/47] target/riscv: Add a property to set vl to ceil(AVL/2)
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 02/47] tests/acpi: Add empty ACPI SRAT data file for RISC-V Alistair Francis
` (46 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Jason Chien, Frank Chang, Alistair Francis
From: Jason Chien <jason.chien@sifive.com>
RVV spec allows implementations to set vl with values within
[ceil(AVL/2),VLMAX] when VLMAX < AVL < 2*VLMAX. This commit adds a
property "rvv_vl_half_avl" to enable setting vl = ceil(AVL/2). This
behavior helps identify compiler issues and bugs.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-ID: <20240722175004.23666-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu_cfg.h | 1 +
target/riscv/cpu.c | 1 +
target/riscv/vector_helper.c | 2 ++
3 files changed, 4 insertions(+)
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 8b272fb826..96fe26d4ea 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -127,6 +127,7 @@ struct RISCVCPUConfig {
bool ext_smepmp;
bool rvv_ta_all_1s;
bool rvv_ma_all_1s;
+ bool rvv_vl_half_avl;
uint32_t mvendorid;
uint64_t marchid;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 4bda754b01..cc5552500a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -2661,6 +2661,7 @@ static Property riscv_cpu_properties[] = {
DEFINE_PROP_BOOL("rvv_ta_all_1s", RISCVCPU, cfg.rvv_ta_all_1s, false),
DEFINE_PROP_BOOL("rvv_ma_all_1s", RISCVCPU, cfg.rvv_ma_all_1s, false),
+ DEFINE_PROP_BOOL("rvv_vl_half_avl", RISCVCPU, cfg.rvv_vl_half_avl, false),
/*
* write_misa() is marked as experimental for now so mark
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 10a52ceb5b..072bd444b1 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -75,6 +75,8 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1,
vlmax = vext_get_vlmax(cpu->cfg.vlenb, vsew, lmul);
if (s1 <= vlmax) {
vl = s1;
+ } else if (s1 < 2 * vlmax && cpu->cfg.rvv_vl_half_avl) {
+ vl = (s1 + 1) >> 1;
} else {
vl = vlmax;
}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 02/47] tests/acpi: Add empty ACPI SRAT data file for RISC-V
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
2024-09-24 22:17 ` [PULL v2 01/47] target/riscv: Add a property to set vl to ceil(AVL/2) Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 03/47] tests/qtest/bios-tables-test.c: Enable numamem testing " Alistair Francis
` (45 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Haibo Xu, Sunil V L, Alistair Francis
From: Haibo Xu <haibo1.xu@intel.com>
As per process documented (steps 1-3) in bios-tables-test.c, add
empty AML data file for RISC-V ACPI SRAT table and add the entry
in bios-tables-test-allowed-diff.h.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <0e30216273f2f59916bc651350578d8e8bc3a75f.1723172696.git.haibo1.xu@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
tests/qtest/bios-tables-test-allowed-diff.h | 1 +
tests/data/acpi/riscv64/virt/SRAT.numamem | 0
2 files changed, 1 insertion(+)
create mode 100644 tests/data/acpi/riscv64/virt/SRAT.numamem
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..a3e01d2eb7 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/riscv64/virt/SRAT.numamem",
diff --git a/tests/data/acpi/riscv64/virt/SRAT.numamem b/tests/data/acpi/riscv64/virt/SRAT.numamem
new file mode 100644
index 0000000000..e69de29bb2
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 03/47] tests/qtest/bios-tables-test.c: Enable numamem testing for RISC-V
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
2024-09-24 22:17 ` [PULL v2 01/47] target/riscv: Add a property to set vl to ceil(AVL/2) Alistair Francis
2024-09-24 22:17 ` [PULL v2 02/47] tests/acpi: Add empty ACPI SRAT data file for RISC-V Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 04/47] tests/acpi: Add expected ACPI SRAT AML file " Alistair Francis
` (44 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Haibo Xu, Sunil V L, Alistair Francis
From: Haibo Xu <haibo1.xu@intel.com>
Add ACPI SRAT table test case for RISC-V when NUMA was enabled.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <a6f7e1a4b20ff7eb199e94ca0c8aa2e6794ce5b2.1723172696.git.haibo1.xu@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
tests/qtest/bios-tables-test.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 36e5c0adde..e79f3a03df 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -1706,6 +1706,32 @@ static void test_acpi_microvm_ioapic2_tcg(void)
free_test_data(&data);
}
+static void test_acpi_riscv64_virt_tcg_numamem(void)
+{
+ test_data data = {
+ .machine = "virt",
+ .arch = "riscv64",
+ .tcg_only = true,
+ .uefi_fl1 = "pc-bios/edk2-riscv-code.fd",
+ .uefi_fl2 = "pc-bios/edk2-riscv-vars.fd",
+ .cd = "tests/data/uefi-boot-images/bios-tables-test.riscv64.iso.qcow2",
+ .ram_start = 0x80000000ULL,
+ .scan_len = 128ULL * 1024 * 1024,
+ };
+
+ data.variant = ".numamem";
+ /*
+ * RHCT will have ISA string encoded. To reduce the effort
+ * of updating expected AML file for any new default ISA extension,
+ * use the profile rva22s64.
+ */
+ test_acpi_one(" -cpu rva22s64"
+ " -object memory-backend-ram,id=ram0,size=128M"
+ " -numa node,memdev=ram0",
+ &data);
+ free_test_data(&data);
+}
+
static void test_acpi_aarch64_virt_tcg_numamem(void)
{
test_data data = {
@@ -2466,6 +2492,8 @@ int main(int argc, char *argv[])
} else if (strcmp(arch, "riscv64") == 0) {
if (has_tcg && qtest_has_device("virtio-blk-pci")) {
qtest_add_func("acpi/virt", test_acpi_riscv64_virt_tcg);
+ qtest_add_func("acpi/virt/numamem",
+ test_acpi_riscv64_virt_tcg_numamem);
}
}
ret = g_test_run();
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 04/47] tests/acpi: Add expected ACPI SRAT AML file for RISC-V
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (2 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 03/47] tests/qtest/bios-tables-test.c: Enable numamem testing " Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 05/47] target/riscv/tcg/tcg-cpu.c: consider MISA bit choice in implied rule Alistair Francis
` (43 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Haibo Xu, Sunil V L, Alistair Francis
From: Haibo Xu <haibo1.xu@intel.com>
As per the step 5 in the process documented in bios-tables-test.c,
generate the expected ACPI SRAT AML data file for RISC-V using the
rebuild-expected-aml.sh script and update the
bios-tables-test-allowed-diff.h.
This is a new file being added for the first time. Hence, iASL diff
output is not added.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <a667480203b35508038176c8ce4722370294cc57.1723172696.git.haibo1.xu@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
tests/qtest/bios-tables-test-allowed-diff.h | 1 -
tests/data/acpi/riscv64/virt/SRAT.numamem | Bin 0 -> 108 bytes
2 files changed, 1 deletion(-)
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index a3e01d2eb7..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
/* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/riscv64/virt/SRAT.numamem",
diff --git a/tests/data/acpi/riscv64/virt/SRAT.numamem b/tests/data/acpi/riscv64/virt/SRAT.numamem
index e69de29bb2..2b6467364b 100644
Binary files a/tests/data/acpi/riscv64/virt/SRAT.numamem and b/tests/data/acpi/riscv64/virt/SRAT.numamem differ
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 05/47] target/riscv/tcg/tcg-cpu.c: consider MISA bit choice in implied rule
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (3 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 04/47] tests/acpi: Add expected ACPI SRAT AML file " Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 06/47] target/riscv: fix za64rs enabling Alistair Francis
` (42 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Gitlab issue [1] reports a misleading error when trying to run a 'rv64'
cpu with 'zfinx' and without 'f':
$ ./build/qemu-system-riscv64 -nographic -M virt -cpu rv64,zfinx=true,f=false
qemu-system-riscv64: Zfinx cannot be supported together with F extension
The user explicitly disabled F and the error message mentions a conflict
with Zfinx and F.
The problem isn't the error reporting, but the logic used when applying
the implied ZFA rule that enables RVF unconditionally, without honoring
user choice (i.e. keep F disabled).
Change cpu_enable_implied_rule() to check if the user deliberately
disabled a MISA bit. In this case we shouldn't either re-enable the bit
nor apply any implied rules related to it.
After this change the error message now shows:
$ ./build/qemu-system-riscv64 -nographic -M virt -cpu rv64,zfinx=true,f=false
qemu-system-riscv64: Zfa extension requires F extension
Disabling 'zfa':
$ ./build/qemu-system-riscv64 -nographic -M virt -cpu rv64,zfinx=true,f=false,zfa=false
qemu-system-riscv64: D extension requires F extension
And finally after disabling 'd':
$ ./build/qemu-system-riscv64 -nographic -M virt -cpu rv64,zfinx=true,f=false,zfa=false,d=false
(OpenSBI boots ...)
[1] https://gitlab.com/qemu-project/qemu/-/issues/2486
Cc: Frank Chang <frank.chang@sifive.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2486
Fixes: 047da861f9 ("target/riscv: Introduce extension implied rule helpers")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240824173338.316666-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/tcg/tcg-cpu.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index b8814ab753..dea8ab7a43 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -778,11 +778,18 @@ static void cpu_enable_implied_rule(RISCVCPU *cpu,
if (!enabled) {
/* Enable the implied MISAs. */
if (rule->implied_misa_exts) {
- riscv_cpu_set_misa_ext(env,
- env->misa_ext | rule->implied_misa_exts);
-
for (i = 0; misa_bits[i] != 0; i++) {
if (rule->implied_misa_exts & misa_bits[i]) {
+ /*
+ * If the user disabled the misa_bit do not re-enable it
+ * and do not apply any implied rules related to it.
+ */
+ if (cpu_misa_ext_is_user_set(misa_bits[i]) &&
+ !(env->misa_ext & misa_bits[i])) {
+ continue;
+ }
+
+ riscv_cpu_set_misa_ext(env, env->misa_ext | misa_bits[i]);
ir = g_hash_table_lookup(misa_ext_implied_rules,
GUINT_TO_POINTER(misa_bits[i]));
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 06/47] target/riscv: fix za64rs enabling
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (4 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 05/47] target/riscv/tcg/tcg-cpu.c: consider MISA bit choice in implied rule Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 07/47] target: riscv: Enable Bit Manip for OpenTitan Ibex CPU Alistair Francis
` (41 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Vladimir Isaev, Alistair Francis,
Daniel Henrique Barboza
From: Vladimir Isaev <vladimir.isaev@syntacore.com>
za64rs requires priv 1.12 when enabled by priv 1.11.
This fixes annoying warning:
warning: disabling za64rs extension for hart 0x00000000 because privilege spec version does not match
on priv 1.11 CPUs.
Fixes: 68c9e54beae8 ("target/riscv: do not enable all named features by default")
Signed-off-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240823063431.17474-1-vladimir.isaev@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index cc5552500a..0f8189bcf0 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -115,7 +115,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
ISA_EXT_DATA_ENTRY(zihpm, PRIV_VERSION_1_12_0, ext_zihpm),
ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_13_0, ext_zimop),
ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul),
- ISA_EXT_DATA_ENTRY(za64rs, PRIV_VERSION_1_12_0, has_priv_1_11),
+ ISA_EXT_DATA_ENTRY(za64rs, PRIV_VERSION_1_12_0, has_priv_1_12),
ISA_EXT_DATA_ENTRY(zaamo, PRIV_VERSION_1_12_0, ext_zaamo),
ISA_EXT_DATA_ENTRY(zabha, PRIV_VERSION_1_13_0, ext_zabha),
ISA_EXT_DATA_ENTRY(zacas, PRIV_VERSION_1_12_0, ext_zacas),
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 07/47] target: riscv: Enable Bit Manip for OpenTitan Ibex CPU
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (5 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 06/47] target/riscv: fix za64rs enabling Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 08/47] target/riscv/kvm: Fix the group bit setting of AIA Alistair Francis
` (40 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Alistair Francis, Daniel Henrique Barboza
From: Alistair Francis <alistair23@gmail.com>
The OpenTitan Ibex CPU now supports the the Zba, Zbb, Zbc
and Zbs bit-manipulation sub-extensions ratified in
v.1.0.0 of the RISC-V Bit- Manipulation ISA Extension, so let's enable
them in QEMU as well.
1: https://github.com/lowRISC/opentitan/pull/9748
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240823003231.3522113-1-alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 0f8189bcf0..a1ca12077f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -680,6 +680,11 @@ static void rv32_ibex_cpu_init(Object *obj)
cpu->cfg.ext_zicsr = true;
cpu->cfg.pmp = true;
cpu->cfg.ext_smepmp = true;
+
+ cpu->cfg.ext_zba = true;
+ cpu->cfg.ext_zbb = true;
+ cpu->cfg.ext_zbc = true;
+ cpu->cfg.ext_zbs = true;
}
static void rv32_imafcu_nommu_cpu_init(Object *obj)
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 08/47] target/riscv/kvm: Fix the group bit setting of AIA
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (6 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 07/47] target: riscv: Enable Bit Manip for OpenTitan Ibex CPU Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 09/47] target/riscv: Stop timer with infinite timecmp Alistair Francis
` (39 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Andrew Jones, Daniel Henrique Barboza,
Alistair Francis
From: Andrew Jones <ajones@ventanamicro.com>
Just as the hart bit setting of the AIA should be calculated as
ceil(log2(max_hart_id + 1)) the group bit setting should be
calculated as ceil(log2(max_group_id + 1)). The hart bits are
implemented by passing max_hart_id to find_last_bit() and adding
one to the result. Do the same for the group bit setting.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240821075040.498945-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/kvm/kvm-cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index f6e3156b8d..341af901c5 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1695,6 +1695,7 @@ void kvm_riscv_aia_create(MachineState *machine, uint64_t group_shift,
uint64_t max_hart_per_socket = 0;
uint64_t socket, base_hart, hart_count, socket_imsic_base, imsic_addr;
uint64_t socket_bits, hart_bits, guest_bits;
+ uint64_t max_group_id;
aia_fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_RISCV_AIA, false);
@@ -1742,7 +1743,8 @@ void kvm_riscv_aia_create(MachineState *machine, uint64_t group_shift,
if (socket_count > 1) {
- socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;
+ max_group_id = socket_count - 1;
+ socket_bits = find_last_bit(&max_group_id, BITS_PER_LONG) + 1;
ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
&socket_bits, true, NULL);
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 09/47] target/riscv: Stop timer with infinite timecmp
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (7 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 08/47] target/riscv/kvm: Fix the group bit setting of AIA Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 10/47] target/riscv/cpu.c: Add 'fcsr' register to QEMU log as a part of F extension Alistair Francis
` (38 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Andrew Jones, Alistair Francis
From: Andrew Jones <ajones@ventanamicro.com>
While the spec doesn't state it, setting timecmp to UINT64_MAX is
another way to stop a timer, as it's considered setting the next
timer event to occur at infinity. And, even if the time CSR does
eventually reach UINT64_MAX, the very next tick will bring it back to
zero, once again less than timecmp. For this reason
riscv_timer_write_timecmp() special cases UINT64_MAX. However, if a
previously set timecmp has not yet expired, then setting timecmp to
UINT64_MAX to disable / stop it would not work, as the special case
left the previous QEMU timer active, which would then still deliver
an interrupt at that previous timecmp time. Ensure the stopped timer
will not still deliver an interrupt by also deleting the QEMU timer
in the UINT64_MAX special case.
Fixes: ae0edf2188b3 ("target/riscv: No need to re-start QEMU timer when timecmp == UINT64_MAX")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240829084002.1805006-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/time_helper.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/riscv/time_helper.c b/target/riscv/time_helper.c
index 8d245bed3a..bc0d9a0c4c 100644
--- a/target/riscv/time_helper.c
+++ b/target/riscv/time_helper.c
@@ -92,6 +92,7 @@ void riscv_timer_write_timecmp(CPURISCVState *env, QEMUTimer *timer,
* equals UINT64_MAX.
*/
if (timecmp == UINT64_MAX) {
+ timer_del(timer);
return;
}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 10/47] target/riscv/cpu.c: Add 'fcsr' register to QEMU log as a part of F extension
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (8 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 09/47] target/riscv: Stop timer with infinite timecmp Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 11/47] util/util/cpuinfo-riscv.c: fix riscv64 build on musl libc Alistair Francis
` (37 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Maria Klauchek, Daniel Henrique Barboza,
Alistair Francis
From: Maria Klauchek <m.klauchek@syntacore.com>
FCSR is a part of F extension. Print it to log if FPU option is enabled.
Signed-off-by: Maria Klauchek <m.klauchek@syntacore.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240902103433.18424-1-m.klauchek@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index a1ca12077f..89bc3955ee 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -823,6 +823,12 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
}
}
if (flags & CPU_DUMP_FPU) {
+ target_ulong val = 0;
+ RISCVException res = riscv_csrrw_debug(env, CSR_FCSR, &val, 0, 0);
+ if (res == RISCV_EXCP_NONE) {
+ qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n",
+ csr_ops[CSR_FCSR].name, val);
+ }
for (i = 0; i < 32; i++) {
qemu_fprintf(f, " %-8s %016" PRIx64,
riscv_fpr_regnames[i], env->fpr[i]);
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 11/47] util/util/cpuinfo-riscv.c: fix riscv64 build on musl libc
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (9 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 10/47] target/riscv/cpu.c: Add 'fcsr' register to QEMU log as a part of F extension Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 12/47] target/riscv: Preliminary textra trigger CSR writting support Alistair Francis
` (36 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Milan P. Stanić, Alistair Francis
From: Milan P. Stanić <mps@arvanta.net>
build fails on musl libc (alpine linux) with this error:
../util/cpuinfo-riscv.c: In function 'cpuinfo_init':
../util/cpuinfo-riscv.c:63:21: error: '__NR_riscv_hwprobe' undeclared (first use in this function); did you mean 'riscv_hwprobe'?
63 | if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) == 0
| ^~~~~~~~~~~~~~~~~~
| riscv_hwprobe
../util/cpuinfo-riscv.c:63:21: note: each undeclared identifier is reported only once for each function it appears in
ninja: subcommand failed
add '#include "asm/unistd.h"' to util/cpuinfo-riscv.c fixes build
Signed-off-by: Milan P. Stanić <mps@arvanta.net>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240905150702.2484-1-mps@arvanta.net>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
util/cpuinfo-riscv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/util/cpuinfo-riscv.c b/util/cpuinfo-riscv.c
index 497ce12680..8cacc67645 100644
--- a/util/cpuinfo-riscv.c
+++ b/util/cpuinfo-riscv.c
@@ -9,6 +9,7 @@
#ifdef CONFIG_ASM_HWPROBE_H
#include <asm/hwprobe.h>
#include <sys/syscall.h>
+#include <asm/unistd.h>
#endif
unsigned cpuinfo;
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 12/47] target/riscv: Preliminary textra trigger CSR writting support
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (10 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 11/47] util/util/cpuinfo-riscv.c: fix riscv64 build on musl libc Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 13/47] target/riscv: Add textra matching condition for the triggers Alistair Francis
` (35 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Alvin Chang, Alistair Francis
From: Alvin Chang <alvinga@andestech.com>
This commit allows program to write textra trigger CSR for type 2, 3, 6
triggers. In this preliminary patch, the textra.MHVALUE and the
textra.MHSELECT fields are allowed to be configured. Other fields, such
as textra.SBYTEMASK, textra.SVALUE, and textra.SSELECT, are hardwired to
zero for now.
For textra.MHSELECT field, the only legal values are 0 (ignore) and 4
(mcontext). Writing 1~3 into textra.MHSELECT will be changed to 0, and
writing 5~7 into textra.MHSELECT will be changed to 4. This behavior is
aligned to RISC-V SPIKE simulator.
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240826024657.262553-2-alvinga@andestech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu_bits.h | 10 ++++++
target/riscv/debug.c | 69 +++++++++++++++++++++++++++++++++++++----
2 files changed, 73 insertions(+), 6 deletions(-)
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 32b068f18a..7e3f629356 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -947,6 +947,16 @@ typedef enum RISCVException {
#define JVT_BASE (~0x3F)
/* Debug Sdtrig CSR masks */
+#define TEXTRA32_MHVALUE 0xFC000000
+#define TEXTRA32_MHSELECT 0x03800000
+#define TEXTRA32_SBYTEMASK 0x000C0000
+#define TEXTRA32_SVALUE 0x0003FFFC
+#define TEXTRA32_SSELECT 0x00000003
+#define TEXTRA64_MHVALUE 0xFFF8000000000000ULL
+#define TEXTRA64_MHSELECT 0x0007000000000000ULL
+#define TEXTRA64_SBYTEMASK 0x000000F000000000ULL
+#define TEXTRA64_SVALUE 0x00000003FFFFFFFCULL
+#define TEXTRA64_SSELECT 0x0000000000000003ULL
#define MCONTEXT32 0x0000003F
#define MCONTEXT64 0x0000000000001FFFULL
#define MCONTEXT32_HCONTEXT 0x0000007F
diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index 0b5099ff9a..d6b4a06144 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -217,6 +217,66 @@ static inline void warn_always_zero_bit(target_ulong val, target_ulong mask,
}
}
+static target_ulong textra_validate(CPURISCVState *env, target_ulong tdata3)
+{
+ target_ulong mhvalue, mhselect;
+ target_ulong mhselect_new;
+ target_ulong textra;
+ const uint32_t mhselect_no_rvh[8] = { 0, 0, 0, 0, 4, 4, 4, 4 };
+
+ switch (riscv_cpu_mxl(env)) {
+ case MXL_RV32:
+ mhvalue = get_field(tdata3, TEXTRA32_MHVALUE);
+ mhselect = get_field(tdata3, TEXTRA32_MHSELECT);
+ /* Validate unimplemented (always zero) bits */
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA32_SBYTEMASK,
+ "sbytemask");
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA32_SVALUE,
+ "svalue");
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA32_SSELECT,
+ "sselect");
+ break;
+ case MXL_RV64:
+ case MXL_RV128:
+ mhvalue = get_field(tdata3, TEXTRA64_MHVALUE);
+ mhselect = get_field(tdata3, TEXTRA64_MHSELECT);
+ /* Validate unimplemented (always zero) bits */
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA64_SBYTEMASK,
+ "sbytemask");
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA64_SVALUE,
+ "svalue");
+ warn_always_zero_bit(tdata3, (target_ulong)TEXTRA64_SSELECT,
+ "sselect");
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ /* Validate mhselect. */
+ mhselect_new = mhselect_no_rvh[mhselect];
+ if (mhselect != mhselect_new) {
+ qemu_log_mask(LOG_UNIMP, "mhselect only supports 0 or 4 for now\n");
+ }
+
+ /* Write legal values into textra */
+ textra = 0;
+ switch (riscv_cpu_mxl(env)) {
+ case MXL_RV32:
+ textra = set_field(textra, TEXTRA32_MHVALUE, mhvalue);
+ textra = set_field(textra, TEXTRA32_MHSELECT, mhselect_new);
+ break;
+ case MXL_RV64:
+ case MXL_RV128:
+ textra = set_field(textra, TEXTRA64_MHVALUE, mhvalue);
+ textra = set_field(textra, TEXTRA64_MHSELECT, mhselect_new);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ return textra;
+}
+
static void do_trigger_action(CPURISCVState *env, target_ulong trigger_index)
{
trigger_action_t action = get_trigger_action(env, trigger_index);
@@ -441,8 +501,7 @@ static void type2_reg_write(CPURISCVState *env, target_ulong index,
}
break;
case TDATA3:
- qemu_log_mask(LOG_UNIMP,
- "tdata3 is not supported for type 2 trigger\n");
+ env->tdata3[index] = textra_validate(env, val);
break;
default:
g_assert_not_reached();
@@ -558,8 +617,7 @@ static void type6_reg_write(CPURISCVState *env, target_ulong index,
}
break;
case TDATA3:
- qemu_log_mask(LOG_UNIMP,
- "tdata3 is not supported for type 6 trigger\n");
+ env->tdata3[index] = textra_validate(env, val);
break;
default:
g_assert_not_reached();
@@ -741,8 +799,7 @@ static void itrigger_reg_write(CPURISCVState *env, target_ulong index,
"tdata2 is not supported for icount trigger\n");
break;
case TDATA3:
- qemu_log_mask(LOG_UNIMP,
- "tdata3 is not supported for icount trigger\n");
+ env->tdata3[index] = textra_validate(env, val);
break;
default:
g_assert_not_reached();
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 13/47] target/riscv: Add textra matching condition for the triggers
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (11 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 12/47] target/riscv: Preliminary textra trigger CSR writting support Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 14/47] exec/memtxattr: add process identifier to the transaction attributes Alistair Francis
` (34 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Alvin Chang, Alistair Francis
From: Alvin Chang <alvinga@andestech.com>
According to RISC-V Debug specification, the optional textra32 and
textra64 trigger CSRs can be used to configure additional matching
conditions for the triggers. For example, if the textra.MHSELECT field
is set to 4 (mcontext), this trigger will only match or fire if the low
bits of mcontext/hcontext equal textra.MHVALUE field.
This commit adds the aforementioned matching condition as common trigger
matching conditions. Currently, the only legal values of textra.MHSELECT
are 0 (ignore) and 4 (mcontext). When textra.MHSELECT is 0, we pass the
checking. When textra.MHSELECT is 4, we compare textra.MHVALUE with
mcontext CSR. The remaining fields, such as textra.SBYTEMASK,
textra.SVALUE, and textra.SSELECT, are hardwired to zero for now. Thus,
we skip checking them here.
Signed-off-by: Alvin Chang <alvinga@andestech.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240826024657.262553-3-alvinga@andestech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/debug.h | 3 +++
target/riscv/debug.c | 45 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/target/riscv/debug.h b/target/riscv/debug.h
index c347863578..f76b8f944a 100644
--- a/target/riscv/debug.h
+++ b/target/riscv/debug.h
@@ -131,6 +131,9 @@ enum {
#define ITRIGGER_VU BIT(25)
#define ITRIGGER_VS BIT(26)
+#define MHSELECT_IGNORE 0
+#define MHSELECT_MCONTEXT 4
+
bool tdata_available(CPURISCVState *env, int tdata_index);
target_ulong tselect_csr_read(CPURISCVState *env);
diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index d6b4a06144..c79b51af30 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -364,11 +364,54 @@ static bool trigger_priv_match(CPURISCVState *env, trigger_type_t type,
return false;
}
+static bool trigger_textra_match(CPURISCVState *env, trigger_type_t type,
+ int trigger_index)
+{
+ target_ulong textra = env->tdata3[trigger_index];
+ target_ulong mhvalue, mhselect;
+
+ if (type < TRIGGER_TYPE_AD_MATCH || type > TRIGGER_TYPE_AD_MATCH6) {
+ /* textra checking is only applicable when type is 2, 3, 4, 5, or 6 */
+ return true;
+ }
+
+ switch (riscv_cpu_mxl(env)) {
+ case MXL_RV32:
+ mhvalue = get_field(textra, TEXTRA32_MHVALUE);
+ mhselect = get_field(textra, TEXTRA32_MHSELECT);
+ break;
+ case MXL_RV64:
+ case MXL_RV128:
+ mhvalue = get_field(textra, TEXTRA64_MHVALUE);
+ mhselect = get_field(textra, TEXTRA64_MHSELECT);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ /* Check mhvalue and mhselect. */
+ switch (mhselect) {
+ case MHSELECT_IGNORE:
+ break;
+ case MHSELECT_MCONTEXT:
+ /* Match if the low bits of mcontext/hcontext equal mhvalue. */
+ if (mhvalue != env->mcontext) {
+ return false;
+ }
+ break;
+ default:
+ break;
+ }
+
+ return true;
+}
+
/* Common matching conditions for all types of the triggers. */
static bool trigger_common_match(CPURISCVState *env, trigger_type_t type,
int trigger_index)
{
- return trigger_priv_match(env, type, trigger_index);
+ return trigger_priv_match(env, type, trigger_index) &&
+ trigger_textra_match(env, type, trigger_index);
}
/* type 2 trigger */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 14/47] exec/memtxattr: add process identifier to the transaction attributes
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (12 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 13/47] target/riscv: Add textra matching condition for the triggers Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 15/47] hw/riscv: add riscv-iommu-bits.h Alistair Francis
` (33 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Frank Chang, Jason Chien,
Alistair Francis, Daniel Henrique Barboza
From: Tomasz Jeznach <tjeznach@rivosinc.com>
Extend memory transaction attributes with process identifier to allow
per-request address translation logic to use requester_id / process_id
to identify memory mapping (e.g. enabling IOMMU w/ PASID translations).
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240903201633.93182-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
include/exec/memattrs.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h
index 14cdd8d582..e27c18f3dc 100644
--- a/include/exec/memattrs.h
+++ b/include/exec/memattrs.h
@@ -52,6 +52,11 @@ typedef struct MemTxAttrs {
unsigned int memory:1;
/* Requester ID (for MSI for example) */
unsigned int requester_id:16;
+
+ /*
+ * PID (PCI PASID) support: Limited to 8 bits process identifier.
+ */
+ unsigned int pid:8;
} MemTxAttrs;
/* Bus masters which don't specify any attributes will get this,
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 15/47] hw/riscv: add riscv-iommu-bits.h
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (13 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 14/47] exec/memtxattr: add process identifier to the transaction attributes Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation Alistair Francis
` (32 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Jason Chien, Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
This header will be used by the RISC-V IOMMU emulation to be added
in the next patch. Due to its size it's being sent in separate for
an easier review.
One thing to notice is that this header can be replaced by the future
Linux RISC-V IOMMU driver header, which would become a linux-header we
would import instead of keeping our own. The Linux implementation isn't
upstream yet so for now we'll have to manage riscv-iommu-bits.h.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/riscv-iommu-bits.h | 345 ++++++++++++++++++++++++++++++++++++
1 file changed, 345 insertions(+)
create mode 100644 hw/riscv/riscv-iommu-bits.h
diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
new file mode 100644
index 0000000000..c46d7d18ab
--- /dev/null
+++ b/hw/riscv/riscv-iommu-bits.h
@@ -0,0 +1,345 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright © 2022-2023 Rivos Inc.
+ * Copyright © 2023 FORTH-ICS/CARV
+ * Copyright © 2023 RISC-V IOMMU Task Group
+ *
+ * RISC-V IOMMU - Register Layout and Data Structures.
+ *
+ * Based on the IOMMU spec version 1.0, 3/2023
+ * https://github.com/riscv-non-isa/riscv-iommu
+ */
+
+#ifndef HW_RISCV_IOMMU_BITS_H
+#define HW_RISCV_IOMMU_BITS_H
+
+#define RISCV_IOMMU_SPEC_DOT_VER 0x010
+
+#ifndef GENMASK_ULL
+#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
+#endif
+
+/*
+ * struct riscv_iommu_fq_record - Fault/Event Queue Record
+ * See section 3.2 for more info.
+ */
+struct riscv_iommu_fq_record {
+ uint64_t hdr;
+ uint64_t _reserved;
+ uint64_t iotval;
+ uint64_t iotval2;
+};
+/* Header fields */
+#define RISCV_IOMMU_FQ_HDR_CAUSE GENMASK_ULL(11, 0)
+#define RISCV_IOMMU_FQ_HDR_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_FQ_HDR_PV BIT_ULL(32)
+#define RISCV_IOMMU_FQ_HDR_TTYPE GENMASK_ULL(39, 34)
+#define RISCV_IOMMU_FQ_HDR_DID GENMASK_ULL(63, 40)
+
+/*
+ * struct riscv_iommu_pq_record - PCIe Page Request record
+ * For more infos on the PCIe Page Request queue see chapter 3.3.
+ */
+struct riscv_iommu_pq_record {
+ uint64_t hdr;
+ uint64_t payload;
+};
+/* Header fields */
+#define RISCV_IOMMU_PREQ_HDR_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_PREQ_HDR_PV BIT_ULL(32)
+#define RISCV_IOMMU_PREQ_HDR_PRIV BIT_ULL(33)
+#define RISCV_IOMMU_PREQ_HDR_EXEC BIT_ULL(34)
+#define RISCV_IOMMU_PREQ_HDR_DID GENMASK_ULL(63, 40)
+/* Payload fields */
+#define RISCV_IOMMU_PREQ_PAYLOAD_M GENMASK_ULL(2, 0)
+
+/* Common field positions */
+#define RISCV_IOMMU_PPN_FIELD GENMASK_ULL(53, 10)
+#define RISCV_IOMMU_QUEUE_LOGSZ_FIELD GENMASK_ULL(4, 0)
+#define RISCV_IOMMU_QUEUE_INDEX_FIELD GENMASK_ULL(31, 0)
+#define RISCV_IOMMU_QUEUE_ENABLE BIT(0)
+#define RISCV_IOMMU_QUEUE_INTR_ENABLE BIT(1)
+#define RISCV_IOMMU_QUEUE_MEM_FAULT BIT(8)
+#define RISCV_IOMMU_QUEUE_OVERFLOW BIT(9)
+#define RISCV_IOMMU_QUEUE_ACTIVE BIT(16)
+#define RISCV_IOMMU_QUEUE_BUSY BIT(17)
+#define RISCV_IOMMU_ATP_PPN_FIELD GENMASK_ULL(43, 0)
+#define RISCV_IOMMU_ATP_MODE_FIELD GENMASK_ULL(63, 60)
+
+/* 5.3 IOMMU Capabilities (64bits) */
+#define RISCV_IOMMU_REG_CAP 0x0000
+#define RISCV_IOMMU_CAP_VERSION GENMASK_ULL(7, 0)
+#define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
+#define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
+#define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
+#define RISCV_IOMMU_CAP_IGS GENMASK_ULL(29, 28)
+#define RISCV_IOMMU_CAP_PAS GENMASK_ULL(37, 32)
+#define RISCV_IOMMU_CAP_PD8 BIT_ULL(38)
+#define RISCV_IOMMU_CAP_PD17 BIT_ULL(39)
+#define RISCV_IOMMU_CAP_PD20 BIT_ULL(40)
+
+/* 5.4 Features control register (32bits) */
+#define RISCV_IOMMU_REG_FCTL 0x0008
+#define RISCV_IOMMU_FCTL_WSI BIT(1)
+
+/* 5.5 Device-directory-table pointer (64bits) */
+#define RISCV_IOMMU_REG_DDTP 0x0010
+#define RISCV_IOMMU_DDTP_MODE GENMASK_ULL(3, 0)
+#define RISCV_IOMMU_DDTP_BUSY BIT_ULL(4)
+#define RISCV_IOMMU_DDTP_PPN RISCV_IOMMU_PPN_FIELD
+
+enum riscv_iommu_ddtp_modes {
+ RISCV_IOMMU_DDTP_MODE_OFF = 0,
+ RISCV_IOMMU_DDTP_MODE_BARE = 1,
+ RISCV_IOMMU_DDTP_MODE_1LVL = 2,
+ RISCV_IOMMU_DDTP_MODE_2LVL = 3,
+ RISCV_IOMMU_DDTP_MODE_3LVL = 4,
+ RISCV_IOMMU_DDTP_MODE_MAX = 4
+};
+
+/* 5.6 Command Queue Base (64bits) */
+#define RISCV_IOMMU_REG_CQB 0x0018
+#define RISCV_IOMMU_CQB_LOG2SZ RISCV_IOMMU_QUEUE_LOGSZ_FIELD
+#define RISCV_IOMMU_CQB_PPN RISCV_IOMMU_PPN_FIELD
+
+/* 5.7 Command Queue head (32bits) */
+#define RISCV_IOMMU_REG_CQH 0x0020
+
+/* 5.8 Command Queue tail (32bits) */
+#define RISCV_IOMMU_REG_CQT 0x0024
+
+/* 5.9 Fault Queue Base (64bits) */
+#define RISCV_IOMMU_REG_FQB 0x0028
+#define RISCV_IOMMU_FQB_LOG2SZ RISCV_IOMMU_QUEUE_LOGSZ_FIELD
+#define RISCV_IOMMU_FQB_PPN RISCV_IOMMU_PPN_FIELD
+
+/* 5.10 Fault Queue Head (32bits) */
+#define RISCV_IOMMU_REG_FQH 0x0030
+
+/* 5.11 Fault Queue tail (32bits) */
+#define RISCV_IOMMU_REG_FQT 0x0034
+
+/* 5.12 Page Request Queue base (64bits) */
+#define RISCV_IOMMU_REG_PQB 0x0038
+#define RISCV_IOMMU_PQB_LOG2SZ RISCV_IOMMU_QUEUE_LOGSZ_FIELD
+#define RISCV_IOMMU_PQB_PPN RISCV_IOMMU_PPN_FIELD
+
+/* 5.13 Page Request Queue head (32bits) */
+#define RISCV_IOMMU_REG_PQH 0x0040
+
+/* 5.14 Page Request Queue tail (32bits) */
+#define RISCV_IOMMU_REG_PQT 0x0044
+
+/* 5.15 Command Queue CSR (32bits) */
+#define RISCV_IOMMU_REG_CQCSR 0x0048
+#define RISCV_IOMMU_CQCSR_CQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_CQCSR_CIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_CQCSR_CQMF RISCV_IOMMU_QUEUE_MEM_FAULT
+#define RISCV_IOMMU_CQCSR_CMD_TO BIT(9)
+#define RISCV_IOMMU_CQCSR_CMD_ILL BIT(10)
+#define RISCV_IOMMU_CQCSR_FENCE_W_IP BIT(11)
+#define RISCV_IOMMU_CQCSR_CQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_CQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+/* 5.16 Fault Queue CSR (32bits) */
+#define RISCV_IOMMU_REG_FQCSR 0x004C
+#define RISCV_IOMMU_FQCSR_FQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_FQCSR_FIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_FQCSR_FQMF RISCV_IOMMU_QUEUE_MEM_FAULT
+#define RISCV_IOMMU_FQCSR_FQOF RISCV_IOMMU_QUEUE_OVERFLOW
+#define RISCV_IOMMU_FQCSR_FQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_FQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+/* 5.17 Page Request Queue CSR (32bits) */
+#define RISCV_IOMMU_REG_PQCSR 0x0050
+#define RISCV_IOMMU_PQCSR_PQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_PQCSR_PIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_PQCSR_PQMF RISCV_IOMMU_QUEUE_MEM_FAULT
+#define RISCV_IOMMU_PQCSR_PQOF RISCV_IOMMU_QUEUE_OVERFLOW
+#define RISCV_IOMMU_PQCSR_PQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_PQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+/* 5.18 Interrupt Pending Status (32bits) */
+#define RISCV_IOMMU_REG_IPSR 0x0054
+#define RISCV_IOMMU_IPSR_CIP BIT(0)
+#define RISCV_IOMMU_IPSR_FIP BIT(1)
+#define RISCV_IOMMU_IPSR_PIP BIT(3)
+
+enum {
+ RISCV_IOMMU_INTR_CQ,
+ RISCV_IOMMU_INTR_FQ,
+ RISCV_IOMMU_INTR_PM,
+ RISCV_IOMMU_INTR_PQ,
+ RISCV_IOMMU_INTR_COUNT
+};
+
+/* 5.27 Interrupt cause to vector (64bits) */
+#define RISCV_IOMMU_REG_ICVEC 0x02F8
+
+/* 5.28 MSI Configuration table (32 * 64bits) */
+#define RISCV_IOMMU_REG_MSI_CONFIG 0x0300
+
+#define RISCV_IOMMU_REG_SIZE 0x1000
+
+#define RISCV_IOMMU_DDTE_VALID BIT_ULL(0)
+#define RISCV_IOMMU_DDTE_PPN RISCV_IOMMU_PPN_FIELD
+
+/* Struct riscv_iommu_dc - Device Context - section 2.1 */
+struct riscv_iommu_dc {
+ uint64_t tc;
+ uint64_t iohgatp;
+ uint64_t ta;
+ uint64_t fsc;
+ uint64_t msiptp;
+ uint64_t msi_addr_mask;
+ uint64_t msi_addr_pattern;
+ uint64_t _reserved;
+};
+
+/* Translation control fields */
+#define RISCV_IOMMU_DC_TC_V BIT_ULL(0)
+#define RISCV_IOMMU_DC_TC_EN_PRI BIT_ULL(2)
+#define RISCV_IOMMU_DC_TC_T2GPA BIT_ULL(3)
+#define RISCV_IOMMU_DC_TC_DTF BIT_ULL(4)
+#define RISCV_IOMMU_DC_TC_PDTV BIT_ULL(5)
+#define RISCV_IOMMU_DC_TC_PRPR BIT_ULL(6)
+#define RISCV_IOMMU_DC_TC_DPE BIT_ULL(9)
+#define RISCV_IOMMU_DC_TC_SBE BIT_ULL(10)
+#define RISCV_IOMMU_DC_TC_SXL BIT_ULL(11)
+
+/* Second-stage (aka G-stage) context fields */
+#define RISCV_IOMMU_DC_IOHGATP_PPN RISCV_IOMMU_ATP_PPN_FIELD
+#define RISCV_IOMMU_DC_IOHGATP_GSCID GENMASK_ULL(59, 44)
+#define RISCV_IOMMU_DC_IOHGATP_MODE RISCV_IOMMU_ATP_MODE_FIELD
+
+enum riscv_iommu_dc_iohgatp_modes {
+ RISCV_IOMMU_DC_IOHGATP_MODE_BARE = 0,
+ RISCV_IOMMU_DC_IOHGATP_MODE_SV32X4 = 8,
+ RISCV_IOMMU_DC_IOHGATP_MODE_SV39X4 = 8,
+ RISCV_IOMMU_DC_IOHGATP_MODE_SV48X4 = 9,
+ RISCV_IOMMU_DC_IOHGATP_MODE_SV57X4 = 10
+};
+
+/* Translation attributes fields */
+#define RISCV_IOMMU_DC_TA_PSCID GENMASK_ULL(31, 12)
+
+/* First-stage context fields */
+#define RISCV_IOMMU_DC_FSC_PPN RISCV_IOMMU_ATP_PPN_FIELD
+#define RISCV_IOMMU_DC_FSC_MODE RISCV_IOMMU_ATP_MODE_FIELD
+
+/* Generic I/O MMU command structure - check section 3.1 */
+struct riscv_iommu_command {
+ uint64_t dword0;
+ uint64_t dword1;
+};
+
+#define RISCV_IOMMU_CMD_OPCODE GENMASK_ULL(6, 0)
+#define RISCV_IOMMU_CMD_FUNC GENMASK_ULL(9, 7)
+
+#define RISCV_IOMMU_CMD_IOTINVAL_OPCODE 1
+#define RISCV_IOMMU_CMD_IOTINVAL_FUNC_VMA 0
+#define RISCV_IOMMU_CMD_IOTINVAL_FUNC_GVMA 1
+#define RISCV_IOMMU_CMD_IOTINVAL_AV BIT_ULL(10)
+#define RISCV_IOMMU_CMD_IOTINVAL_PSCID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_CMD_IOTINVAL_PSCV BIT_ULL(32)
+#define RISCV_IOMMU_CMD_IOTINVAL_GV BIT_ULL(33)
+#define RISCV_IOMMU_CMD_IOTINVAL_GSCID GENMASK_ULL(59, 44)
+
+#define RISCV_IOMMU_CMD_IOFENCE_OPCODE 2
+#define RISCV_IOMMU_CMD_IOFENCE_FUNC_C 0
+#define RISCV_IOMMU_CMD_IOFENCE_AV BIT_ULL(10)
+#define RISCV_IOMMU_CMD_IOFENCE_DATA GENMASK_ULL(63, 32)
+
+#define RISCV_IOMMU_CMD_IODIR_OPCODE 3
+#define RISCV_IOMMU_CMD_IODIR_FUNC_INVAL_DDT 0
+#define RISCV_IOMMU_CMD_IODIR_FUNC_INVAL_PDT 1
+#define RISCV_IOMMU_CMD_IODIR_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_CMD_IODIR_DV BIT_ULL(33)
+#define RISCV_IOMMU_CMD_IODIR_DID GENMASK_ULL(63, 40)
+
+enum riscv_iommu_dc_fsc_atp_modes {
+ RISCV_IOMMU_DC_FSC_MODE_BARE = 0,
+ RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV32 = 8,
+ RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV39 = 8,
+ RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV48 = 9,
+ RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV57 = 10,
+ RISCV_IOMMU_DC_FSC_PDTP_MODE_PD8 = 1,
+ RISCV_IOMMU_DC_FSC_PDTP_MODE_PD17 = 2,
+ RISCV_IOMMU_DC_FSC_PDTP_MODE_PD20 = 3
+};
+
+enum riscv_iommu_fq_causes {
+ RISCV_IOMMU_FQ_CAUSE_INST_FAULT = 1,
+ RISCV_IOMMU_FQ_CAUSE_RD_ADDR_MISALIGNED = 4,
+ RISCV_IOMMU_FQ_CAUSE_RD_FAULT = 5,
+ RISCV_IOMMU_FQ_CAUSE_WR_ADDR_MISALIGNED = 6,
+ RISCV_IOMMU_FQ_CAUSE_WR_FAULT = 7,
+ RISCV_IOMMU_FQ_CAUSE_INST_FAULT_S = 12,
+ RISCV_IOMMU_FQ_CAUSE_RD_FAULT_S = 13,
+ RISCV_IOMMU_FQ_CAUSE_WR_FAULT_S = 15,
+ RISCV_IOMMU_FQ_CAUSE_INST_FAULT_VS = 20,
+ RISCV_IOMMU_FQ_CAUSE_RD_FAULT_VS = 21,
+ RISCV_IOMMU_FQ_CAUSE_WR_FAULT_VS = 23,
+ RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED = 256,
+ RISCV_IOMMU_FQ_CAUSE_DDT_LOAD_FAULT = 257,
+ RISCV_IOMMU_FQ_CAUSE_DDT_INVALID = 258,
+ RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED = 259,
+ RISCV_IOMMU_FQ_CAUSE_TTYPE_BLOCKED = 260,
+ RISCV_IOMMU_FQ_CAUSE_MSI_LOAD_FAULT = 261,
+ RISCV_IOMMU_FQ_CAUSE_MSI_INVALID = 262,
+ RISCV_IOMMU_FQ_CAUSE_MSI_MISCONFIGURED = 263,
+ RISCV_IOMMU_FQ_CAUSE_MRIF_FAULT = 264,
+ RISCV_IOMMU_FQ_CAUSE_PDT_LOAD_FAULT = 265,
+ RISCV_IOMMU_FQ_CAUSE_PDT_INVALID = 266,
+ RISCV_IOMMU_FQ_CAUSE_PDT_MISCONFIGURED = 267,
+ RISCV_IOMMU_FQ_CAUSE_DDT_CORRUPTED = 268,
+ RISCV_IOMMU_FQ_CAUSE_PDT_CORRUPTED = 269,
+ RISCV_IOMMU_FQ_CAUSE_MSI_PT_CORRUPTED = 270,
+ RISCV_IOMMU_FQ_CAUSE_MRIF_CORRUIPTED = 271,
+ RISCV_IOMMU_FQ_CAUSE_INTERNAL_DP_ERROR = 272,
+ RISCV_IOMMU_FQ_CAUSE_MSI_WR_FAULT = 273,
+ RISCV_IOMMU_FQ_CAUSE_PT_CORRUPTED = 274
+};
+
+/* MSI page table pointer */
+#define RISCV_IOMMU_DC_MSIPTP_PPN RISCV_IOMMU_ATP_PPN_FIELD
+#define RISCV_IOMMU_DC_MSIPTP_MODE RISCV_IOMMU_ATP_MODE_FIELD
+#define RISCV_IOMMU_DC_MSIPTP_MODE_OFF 0
+#define RISCV_IOMMU_DC_MSIPTP_MODE_FLAT 1
+
+/* Translation attributes fields */
+#define RISCV_IOMMU_PC_TA_V BIT_ULL(0)
+
+/* First stage context fields */
+#define RISCV_IOMMU_PC_FSC_PPN GENMASK_ULL(43, 0)
+
+enum riscv_iommu_fq_ttypes {
+ RISCV_IOMMU_FQ_TTYPE_NONE = 0,
+ RISCV_IOMMU_FQ_TTYPE_UADDR_INST_FETCH = 1,
+ RISCV_IOMMU_FQ_TTYPE_UADDR_RD = 2,
+ RISCV_IOMMU_FQ_TTYPE_UADDR_WR = 3,
+ RISCV_IOMMU_FQ_TTYPE_TADDR_INST_FETCH = 5,
+ RISCV_IOMMU_FQ_TTYPE_TADDR_RD = 6,
+ RISCV_IOMMU_FQ_TTYPE_TADDR_WR = 7,
+ RISCV_IOMMU_FW_TTYPE_PCIE_MSG_REQ = 8,
+};
+
+/* Fields on pte */
+#define RISCV_IOMMU_MSI_PTE_V BIT_ULL(0)
+#define RISCV_IOMMU_MSI_PTE_M GENMASK_ULL(2, 1)
+
+#define RISCV_IOMMU_MSI_PTE_M_MRIF 1
+#define RISCV_IOMMU_MSI_PTE_M_BASIC 3
+
+/* When M == 1 (MRIF mode) */
+#define RISCV_IOMMU_MSI_PTE_MRIF_ADDR GENMASK_ULL(53, 7)
+/* When M == 3 (basic mode) */
+#define RISCV_IOMMU_MSI_PTE_PPN RISCV_IOMMU_PPN_FIELD
+#define RISCV_IOMMU_MSI_PTE_C BIT_ULL(63)
+
+/* Fields on mrif_info */
+#define RISCV_IOMMU_MSI_MRIF_NID GENMASK_ULL(9, 0)
+#define RISCV_IOMMU_MSI_MRIF_NPPN RISCV_IOMMU_PPN_FIELD
+#define RISCV_IOMMU_MSI_MRIF_NID_MSB BIT_ULL(60)
+
+#endif /* _RISCV_IOMMU_BITS_H_ */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (14 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 15/47] hw/riscv: add riscv-iommu-bits.h Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-28 20:22 ` Peter Maydell
2024-10-01 22:24 ` Tomasz Jeznach
2024-09-24 22:17 ` [PULL v2 17/47] pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device Alistair Francis
` (31 subsequent siblings)
47 siblings, 2 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Sebastien Boeuf,
Daniel Henrique Barboza, Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
The RISC-V IOMMU specification is now ratified as-per the RISC-V
international process. The latest frozen specifcation can be found at:
https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf
Add the foundation of the device emulation for RISC-V IOMMU. It includes
support for s-stage (sv32, sv39, sv48, sv57 caps) and g-stage (sv32x4,
sv39x4, sv48x4, sv57x4 caps).
Other capabilities like ATS and DBG support will be added incrementally
in the next patches.
Co-developed-by: Sebastien Boeuf <seb@rivosinc.com>
Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
meson.build | 1 +
hw/riscv/riscv-iommu-bits.h | 18 +
hw/riscv/riscv-iommu.h | 145 +++
hw/riscv/trace.h | 1 +
include/hw/riscv/iommu.h | 36 +
hw/riscv/riscv-iommu.c | 2050 +++++++++++++++++++++++++++++++++++
hw/riscv/Kconfig | 4 +
hw/riscv/meson.build | 1 +
hw/riscv/trace-events | 14 +
9 files changed, 2270 insertions(+)
create mode 100644 hw/riscv/riscv-iommu.h
create mode 100644 hw/riscv/trace.h
create mode 100644 include/hw/riscv/iommu.h
create mode 100644 hw/riscv/riscv-iommu.c
create mode 100644 hw/riscv/trace-events
diff --git a/meson.build b/meson.build
index 10464466ff..71de8a5cd1 100644
--- a/meson.build
+++ b/meson.build
@@ -3439,6 +3439,7 @@ if have_system
'hw/pci-host',
'hw/ppc',
'hw/rtc',
+ 'hw/riscv',
'hw/s390x',
'hw/scsi',
'hw/sd',
diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
index c46d7d18ab..b1c477f5c3 100644
--- a/hw/riscv/riscv-iommu-bits.h
+++ b/hw/riscv/riscv-iommu-bits.h
@@ -69,6 +69,14 @@ struct riscv_iommu_pq_record {
/* 5.3 IOMMU Capabilities (64bits) */
#define RISCV_IOMMU_REG_CAP 0x0000
#define RISCV_IOMMU_CAP_VERSION GENMASK_ULL(7, 0)
+#define RISCV_IOMMU_CAP_SV32 BIT_ULL(8)
+#define RISCV_IOMMU_CAP_SV39 BIT_ULL(9)
+#define RISCV_IOMMU_CAP_SV48 BIT_ULL(10)
+#define RISCV_IOMMU_CAP_SV57 BIT_ULL(11)
+#define RISCV_IOMMU_CAP_SV32X4 BIT_ULL(16)
+#define RISCV_IOMMU_CAP_SV39X4 BIT_ULL(17)
+#define RISCV_IOMMU_CAP_SV48X4 BIT_ULL(18)
+#define RISCV_IOMMU_CAP_SV57X4 BIT_ULL(19)
#define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
#define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
#define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
@@ -80,7 +88,9 @@ struct riscv_iommu_pq_record {
/* 5.4 Features control register (32bits) */
#define RISCV_IOMMU_REG_FCTL 0x0008
+#define RISCV_IOMMU_FCTL_BE BIT(0)
#define RISCV_IOMMU_FCTL_WSI BIT(1)
+#define RISCV_IOMMU_FCTL_GXL BIT(2)
/* 5.5 Device-directory-table pointer (64bits) */
#define RISCV_IOMMU_REG_DDTP 0x0010
@@ -175,6 +185,10 @@ enum {
/* 5.27 Interrupt cause to vector (64bits) */
#define RISCV_IOMMU_REG_ICVEC 0x02F8
+#define RISCV_IOMMU_ICVEC_CIV GENMASK_ULL(3, 0)
+#define RISCV_IOMMU_ICVEC_FIV GENMASK_ULL(7, 4)
+#define RISCV_IOMMU_ICVEC_PMIV GENMASK_ULL(11, 8)
+#define RISCV_IOMMU_ICVEC_PIV GENMASK_ULL(15, 12)
/* 5.28 MSI Configuration table (32 * 64bits) */
#define RISCV_IOMMU_REG_MSI_CONFIG 0x0300
@@ -203,6 +217,8 @@ struct riscv_iommu_dc {
#define RISCV_IOMMU_DC_TC_DTF BIT_ULL(4)
#define RISCV_IOMMU_DC_TC_PDTV BIT_ULL(5)
#define RISCV_IOMMU_DC_TC_PRPR BIT_ULL(6)
+#define RISCV_IOMMU_DC_TC_GADE BIT_ULL(7)
+#define RISCV_IOMMU_DC_TC_SADE BIT_ULL(8)
#define RISCV_IOMMU_DC_TC_DPE BIT_ULL(9)
#define RISCV_IOMMU_DC_TC_SBE BIT_ULL(10)
#define RISCV_IOMMU_DC_TC_SXL BIT_ULL(11)
@@ -309,9 +325,11 @@ enum riscv_iommu_fq_causes {
/* Translation attributes fields */
#define RISCV_IOMMU_PC_TA_V BIT_ULL(0)
+#define RISCV_IOMMU_PC_TA_RESERVED GENMASK_ULL(63, 32)
/* First stage context fields */
#define RISCV_IOMMU_PC_FSC_PPN GENMASK_ULL(43, 0)
+#define RISCV_IOMMU_PC_FSC_RESERVED GENMASK_ULL(59, 44)
enum riscv_iommu_fq_ttypes {
RISCV_IOMMU_FQ_TTYPE_NONE = 0,
diff --git a/hw/riscv/riscv-iommu.h b/hw/riscv/riscv-iommu.h
new file mode 100644
index 0000000000..95b4ce8d50
--- /dev/null
+++ b/hw/riscv/riscv-iommu.h
@@ -0,0 +1,145 @@
+/*
+ * QEMU emulation of an RISC-V IOMMU
+ *
+ * Copyright (C) 2022-2023 Rivos Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_IOMMU_STATE_H
+#define HW_RISCV_IOMMU_STATE_H
+
+#include "qemu/osdep.h"
+#include "qom/object.h"
+
+#include "hw/riscv/iommu.h"
+
+struct RISCVIOMMUState {
+ /*< private >*/
+ DeviceState parent_obj;
+
+ /*< public >*/
+ uint32_t version; /* Reported interface version number */
+ uint32_t pid_bits; /* process identifier width */
+ uint32_t bus; /* PCI bus mapping for non-root endpoints */
+
+ uint64_t cap; /* IOMMU supported capabilities */
+ uint64_t fctl; /* IOMMU enabled features */
+ uint64_t icvec_avail_vectors; /* Available interrupt vectors in ICVEC */
+
+ bool enable_off; /* Enable out-of-reset OFF mode (DMA disabled) */
+ bool enable_msi; /* Enable MSI remapping */
+ bool enable_s_stage; /* Enable S/VS-Stage translation */
+ bool enable_g_stage; /* Enable G-Stage translation */
+
+ /* IOMMU Internal State */
+ uint64_t ddtp; /* Validated Device Directory Tree Root Pointer */
+
+ dma_addr_t cq_addr; /* Command queue base physical address */
+ dma_addr_t fq_addr; /* Fault/event queue base physical address */
+ dma_addr_t pq_addr; /* Page request queue base physical address */
+
+ uint32_t cq_mask; /* Command queue index bit mask */
+ uint32_t fq_mask; /* Fault/event queue index bit mask */
+ uint32_t pq_mask; /* Page request queue index bit mask */
+
+ /* interrupt notifier */
+ void (*notify)(RISCVIOMMUState *iommu, unsigned vector);
+
+ /* IOMMU State Machine */
+ QemuThread core_proc; /* Background processing thread */
+ QemuMutex core_lock; /* Global IOMMU lock, used for cache/regs updates */
+ QemuCond core_cond; /* Background processing wake up signal */
+ unsigned core_exec; /* Processing thread execution actions */
+
+ /* IOMMU target address space */
+ AddressSpace *target_as;
+ MemoryRegion *target_mr;
+
+ /* MSI / MRIF access trap */
+ AddressSpace trap_as;
+ MemoryRegion trap_mr;
+
+ GHashTable *ctx_cache; /* Device translation Context Cache */
+ QemuMutex ctx_lock; /* Device translation Cache update lock */
+
+ /* MMIO Hardware Interface */
+ MemoryRegion regs_mr;
+ QemuSpin regs_lock;
+ uint8_t *regs_rw; /* register state (user write) */
+ uint8_t *regs_wc; /* write-1-to-clear mask */
+ uint8_t *regs_ro; /* read-only mask */
+
+ QLIST_ENTRY(RISCVIOMMUState) iommus;
+ QLIST_HEAD(, RISCVIOMMUSpace) spaces;
+};
+
+void riscv_iommu_pci_setup_iommu(RISCVIOMMUState *iommu, PCIBus *bus,
+ Error **errp);
+
+/* private helpers */
+
+/* Register helper functions */
+static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
+ unsigned idx, uint32_t set, uint32_t clr)
+{
+ uint32_t val;
+ qemu_spin_lock(&s->regs_lock);
+ val = ldl_le_p(s->regs_rw + idx);
+ stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
+ qemu_spin_unlock(&s->regs_lock);
+ return val;
+}
+
+static inline void riscv_iommu_reg_set32(RISCVIOMMUState *s,
+ unsigned idx, uint32_t set)
+{
+ qemu_spin_lock(&s->regs_lock);
+ stl_le_p(s->regs_rw + idx, set);
+ qemu_spin_unlock(&s->regs_lock);
+}
+
+static inline uint32_t riscv_iommu_reg_get32(RISCVIOMMUState *s,
+ unsigned idx)
+{
+ return ldl_le_p(s->regs_rw + idx);
+}
+
+static inline uint64_t riscv_iommu_reg_mod64(RISCVIOMMUState *s,
+ unsigned idx, uint64_t set, uint64_t clr)
+{
+ uint64_t val;
+ qemu_spin_lock(&s->regs_lock);
+ val = ldq_le_p(s->regs_rw + idx);
+ stq_le_p(s->regs_rw + idx, (val & ~clr) | set);
+ qemu_spin_unlock(&s->regs_lock);
+ return val;
+}
+
+static inline void riscv_iommu_reg_set64(RISCVIOMMUState *s,
+ unsigned idx, uint64_t set)
+{
+ qemu_spin_lock(&s->regs_lock);
+ stq_le_p(s->regs_rw + idx, set);
+ qemu_spin_unlock(&s->regs_lock);
+}
+
+static inline uint64_t riscv_iommu_reg_get64(RISCVIOMMUState *s,
+ unsigned idx)
+{
+ return ldq_le_p(s->regs_rw + idx);
+}
+
+
+
+#endif
diff --git a/hw/riscv/trace.h b/hw/riscv/trace.h
new file mode 100644
index 0000000000..8c0e3ca1f3
--- /dev/null
+++ b/hw/riscv/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_riscv.h"
diff --git a/include/hw/riscv/iommu.h b/include/hw/riscv/iommu.h
new file mode 100644
index 0000000000..070ee69973
--- /dev/null
+++ b/include/hw/riscv/iommu.h
@@ -0,0 +1,36 @@
+/*
+ * QEMU emulation of an RISC-V IOMMU
+ *
+ * Copyright (C) 2022-2023 Rivos Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_IOMMU_H
+#define HW_RISCV_IOMMU_H
+
+#include "qemu/osdep.h"
+#include "qom/object.h"
+
+#define TYPE_RISCV_IOMMU "riscv-iommu"
+OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUState, RISCV_IOMMU)
+typedef struct RISCVIOMMUState RISCVIOMMUState;
+
+#define TYPE_RISCV_IOMMU_MEMORY_REGION "riscv-iommu-mr"
+typedef struct RISCVIOMMUSpace RISCVIOMMUSpace;
+
+#define TYPE_RISCV_IOMMU_PCI "riscv-iommu-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUStatePci, RISCV_IOMMU_PCI)
+typedef struct RISCVIOMMUStatePci RISCVIOMMUStatePci;
+
+#endif
diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
new file mode 100644
index 0000000000..061a5efe19
--- /dev/null
+++ b/hw/riscv/riscv-iommu.c
@@ -0,0 +1,2050 @@
+/*
+ * QEMU emulation of an RISC-V IOMMU
+ *
+ * Copyright (C) 2021-2023, Rivos Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_device.h"
+#include "hw/qdev-properties.h"
+#include "hw/riscv/riscv_hart.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/timer.h"
+
+#include "cpu_bits.h"
+#include "riscv-iommu.h"
+#include "riscv-iommu-bits.h"
+#include "trace.h"
+
+#define LIMIT_CACHE_CTX (1U << 7)
+#define LIMIT_CACHE_IOT (1U << 20)
+
+/* Physical page number coversions */
+#define PPN_PHYS(ppn) ((ppn) << TARGET_PAGE_BITS)
+#define PPN_DOWN(phy) ((phy) >> TARGET_PAGE_BITS)
+
+typedef struct RISCVIOMMUContext RISCVIOMMUContext;
+typedef struct RISCVIOMMUEntry RISCVIOMMUEntry;
+
+/* Device assigned I/O address space */
+struct RISCVIOMMUSpace {
+ IOMMUMemoryRegion iova_mr; /* IOVA memory region for attached device */
+ AddressSpace iova_as; /* IOVA address space for attached device */
+ RISCVIOMMUState *iommu; /* Managing IOMMU device state */
+ uint32_t devid; /* Requester identifier, AKA device_id */
+ bool notifier; /* IOMMU unmap notifier enabled */
+ QLIST_ENTRY(RISCVIOMMUSpace) list;
+};
+
+/* Device translation context state. */
+struct RISCVIOMMUContext {
+ uint64_t devid:24; /* Requester Id, AKA device_id */
+ uint64_t process_id:20; /* Process ID. PASID for PCIe */
+ uint64_t __rfu:20; /* reserved */
+ uint64_t tc; /* Translation Control */
+ uint64_t ta; /* Translation Attributes */
+ uint64_t satp; /* S-Stage address translation and protection */
+ uint64_t gatp; /* G-Stage address translation and protection */
+ uint64_t msi_addr_mask; /* MSI filtering - address mask */
+ uint64_t msi_addr_pattern; /* MSI filtering - address pattern */
+ uint64_t msiptp; /* MSI redirection page table pointer */
+};
+
+/* IOMMU index for transactions without process_id specified. */
+#define RISCV_IOMMU_NOPROCID 0
+
+static uint8_t riscv_iommu_get_icvec_vector(uint32_t icvec, uint32_t vec_type)
+{
+ switch (vec_type) {
+ case RISCV_IOMMU_INTR_CQ:
+ return icvec & RISCV_IOMMU_ICVEC_CIV;
+ case RISCV_IOMMU_INTR_FQ:
+ return icvec & RISCV_IOMMU_ICVEC_FIV >> 4;
+ case RISCV_IOMMU_INTR_PM:
+ return icvec & RISCV_IOMMU_ICVEC_PMIV >> 8;
+ case RISCV_IOMMU_INTR_PQ:
+ return icvec & RISCV_IOMMU_ICVEC_PIV >> 12;
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static void riscv_iommu_notify(RISCVIOMMUState *s, int vec_type)
+{
+ const uint32_t fctl = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FCTL);
+ uint32_t ipsr, icvec, vector;
+
+ if (fctl & RISCV_IOMMU_FCTL_WSI || !s->notify) {
+ return;
+ }
+
+ icvec = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_ICVEC);
+ ipsr = riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_IPSR, (1 << vec_type), 0);
+
+ if (!(ipsr & (1 << vec_type))) {
+ vector = riscv_iommu_get_icvec_vector(icvec, vec_type);
+ s->notify(s, vector);
+ trace_riscv_iommu_notify_int_vector(vec_type, vector);
+ }
+}
+
+static void riscv_iommu_fault(RISCVIOMMUState *s,
+ struct riscv_iommu_fq_record *ev)
+{
+ uint32_t ctrl = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FQCSR);
+ uint32_t head = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FQH) & s->fq_mask;
+ uint32_t tail = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FQT) & s->fq_mask;
+ uint32_t next = (tail + 1) & s->fq_mask;
+ uint32_t devid = get_field(ev->hdr, RISCV_IOMMU_FQ_HDR_DID);
+
+ trace_riscv_iommu_flt(s->parent_obj.id, PCI_BUS_NUM(devid), PCI_SLOT(devid),
+ PCI_FUNC(devid), ev->hdr, ev->iotval);
+
+ if (!(ctrl & RISCV_IOMMU_FQCSR_FQON) ||
+ !!(ctrl & (RISCV_IOMMU_FQCSR_FQOF | RISCV_IOMMU_FQCSR_FQMF))) {
+ return;
+ }
+
+ if (head == next) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_FQCSR,
+ RISCV_IOMMU_FQCSR_FQOF, 0);
+ } else {
+ dma_addr_t addr = s->fq_addr + tail * sizeof(*ev);
+ if (dma_memory_write(s->target_as, addr, ev, sizeof(*ev),
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_FQCSR,
+ RISCV_IOMMU_FQCSR_FQMF, 0);
+ } else {
+ riscv_iommu_reg_set32(s, RISCV_IOMMU_REG_FQT, next);
+ }
+ }
+
+ if (ctrl & RISCV_IOMMU_FQCSR_FIE) {
+ riscv_iommu_notify(s, RISCV_IOMMU_INTR_FQ);
+ }
+}
+
+static void riscv_iommu_pri(RISCVIOMMUState *s,
+ struct riscv_iommu_pq_record *pr)
+{
+ uint32_t ctrl = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_PQCSR);
+ uint32_t head = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_PQH) & s->pq_mask;
+ uint32_t tail = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_PQT) & s->pq_mask;
+ uint32_t next = (tail + 1) & s->pq_mask;
+ uint32_t devid = get_field(pr->hdr, RISCV_IOMMU_PREQ_HDR_DID);
+
+ trace_riscv_iommu_pri(s->parent_obj.id, PCI_BUS_NUM(devid), PCI_SLOT(devid),
+ PCI_FUNC(devid), pr->payload);
+
+ if (!(ctrl & RISCV_IOMMU_PQCSR_PQON) ||
+ !!(ctrl & (RISCV_IOMMU_PQCSR_PQOF | RISCV_IOMMU_PQCSR_PQMF))) {
+ return;
+ }
+
+ if (head == next) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_PQCSR,
+ RISCV_IOMMU_PQCSR_PQOF, 0);
+ } else {
+ dma_addr_t addr = s->pq_addr + tail * sizeof(*pr);
+ if (dma_memory_write(s->target_as, addr, pr, sizeof(*pr),
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_PQCSR,
+ RISCV_IOMMU_PQCSR_PQMF, 0);
+ } else {
+ riscv_iommu_reg_set32(s, RISCV_IOMMU_REG_PQT, next);
+ }
+ }
+
+ if (ctrl & RISCV_IOMMU_PQCSR_PIE) {
+ riscv_iommu_notify(s, RISCV_IOMMU_INTR_PQ);
+ }
+}
+
+/* Portable implementation of pext_u64, bit-mask extraction. */
+static uint64_t _pext_u64(uint64_t val, uint64_t ext)
+{
+ uint64_t ret = 0;
+ uint64_t rot = 1;
+
+ while (ext) {
+ if (ext & 1) {
+ if (val & 1) {
+ ret |= rot;
+ }
+ rot <<= 1;
+ }
+ val >>= 1;
+ ext >>= 1;
+ }
+
+ return ret;
+}
+
+/* Check if GPA matches MSI/MRIF pattern. */
+static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
+ dma_addr_t gpa)
+{
+ if (!s->enable_msi) {
+ return false;
+ }
+
+ if (get_field(ctx->msiptp, RISCV_IOMMU_DC_MSIPTP_MODE) !=
+ RISCV_IOMMU_DC_MSIPTP_MODE_FLAT) {
+ return false; /* Invalid MSI/MRIF mode */
+ }
+
+ if ((PPN_DOWN(gpa) ^ ctx->msi_addr_pattern) & ~ctx->msi_addr_mask) {
+ return false; /* GPA not in MSI range defined by AIA IMSIC rules. */
+ }
+
+ return true;
+}
+
+/*
+ * RISCV IOMMU Address Translation Lookup - Page Table Walk
+ *
+ * Note: Code is based on get_physical_address() from target/riscv/cpu_helper.c
+ * Both implementation can be merged into single helper function in future.
+ * Keeping them separate for now, as error reporting and flow specifics are
+ * sufficiently different for separate implementation.
+ *
+ * @s : IOMMU Device State
+ * @ctx : Translation context for device id and process address space id.
+ * @iotlb : translation data: physical address and access mode.
+ * @return : success or fault cause code.
+ */
+static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
+ IOMMUTLBEntry *iotlb)
+{
+ dma_addr_t addr, base;
+ uint64_t satp, gatp, pte;
+ bool en_s, en_g;
+ struct {
+ unsigned char step;
+ unsigned char levels;
+ unsigned char ptidxbits;
+ unsigned char ptesize;
+ } sc[2];
+ /* Translation stage phase */
+ enum {
+ S_STAGE = 0,
+ G_STAGE = 1,
+ } pass;
+
+ satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD);
+ gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
+
+ en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE;
+ en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE;
+
+ /*
+ * Early check for MSI address match when IOVA == GPA. This check
+ * is required to ensure MSI translation is applied in case
+ * first-stage translation is set to BARE mode. In this case IOVA
+ * provided is a valid GPA. Running translation through page walk
+ * second stage translation will incorrectly try to translate GPA
+ * to host physical page, likely hitting IOPF.
+ */
+ if ((iotlb->perm & IOMMU_WO) &&
+ riscv_iommu_msi_check(s, ctx, iotlb->iova)) {
+ iotlb->target_as = &s->trap_as;
+ iotlb->translated_addr = iotlb->iova;
+ iotlb->addr_mask = ~TARGET_PAGE_MASK;
+ return 0;
+ }
+
+ /* Exit early for pass-through mode. */
+ if (!(en_s || en_g)) {
+ iotlb->translated_addr = iotlb->iova;
+ iotlb->addr_mask = ~TARGET_PAGE_MASK;
+ /* Allow R/W in pass-through mode */
+ iotlb->perm = IOMMU_RW;
+ return 0;
+ }
+
+ /* S/G translation parameters. */
+ for (pass = 0; pass < 2; pass++) {
+ uint32_t sv_mode;
+
+ sc[pass].step = 0;
+ if (pass ? (s->fctl & RISCV_IOMMU_FCTL_GXL) :
+ (ctx->tc & RISCV_IOMMU_DC_TC_SXL)) {
+ /* 32bit mode for GXL/SXL == 1 */
+ switch (pass ? gatp : satp) {
+ case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
+ sc[pass].levels = 0;
+ sc[pass].ptidxbits = 0;
+ sc[pass].ptesize = 0;
+ break;
+ case RISCV_IOMMU_DC_IOHGATP_MODE_SV32X4:
+ sv_mode = pass ? RISCV_IOMMU_CAP_SV32X4 : RISCV_IOMMU_CAP_SV32;
+ if (!(s->cap & sv_mode)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ sc[pass].levels = 2;
+ sc[pass].ptidxbits = 10;
+ sc[pass].ptesize = 4;
+ break;
+ default:
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ } else {
+ /* 64bit mode for GXL/SXL == 0 */
+ switch (pass ? gatp : satp) {
+ case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
+ sc[pass].levels = 0;
+ sc[pass].ptidxbits = 0;
+ sc[pass].ptesize = 0;
+ break;
+ case RISCV_IOMMU_DC_IOHGATP_MODE_SV39X4:
+ sv_mode = pass ? RISCV_IOMMU_CAP_SV39X4 : RISCV_IOMMU_CAP_SV39;
+ if (!(s->cap & sv_mode)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ sc[pass].levels = 3;
+ sc[pass].ptidxbits = 9;
+ sc[pass].ptesize = 8;
+ break;
+ case RISCV_IOMMU_DC_IOHGATP_MODE_SV48X4:
+ sv_mode = pass ? RISCV_IOMMU_CAP_SV48X4 : RISCV_IOMMU_CAP_SV48;
+ if (!(s->cap & sv_mode)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ sc[pass].levels = 4;
+ sc[pass].ptidxbits = 9;
+ sc[pass].ptesize = 8;
+ break;
+ case RISCV_IOMMU_DC_IOHGATP_MODE_SV57X4:
+ sv_mode = pass ? RISCV_IOMMU_CAP_SV57X4 : RISCV_IOMMU_CAP_SV57;
+ if (!(s->cap & sv_mode)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ sc[pass].levels = 5;
+ sc[pass].ptidxbits = 9;
+ sc[pass].ptesize = 8;
+ break;
+ default:
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ }
+ };
+
+ /* S/G stages translation tables root pointers */
+ gatp = PPN_PHYS(get_field(ctx->gatp, RISCV_IOMMU_ATP_PPN_FIELD));
+ satp = PPN_PHYS(get_field(ctx->satp, RISCV_IOMMU_ATP_PPN_FIELD));
+ addr = (en_s && en_g) ? satp : iotlb->iova;
+ base = en_g ? gatp : satp;
+ pass = en_g ? G_STAGE : S_STAGE;
+
+ do {
+ const unsigned widened = (pass && !sc[pass].step) ? 2 : 0;
+ const unsigned va_bits = widened + sc[pass].ptidxbits;
+ const unsigned va_skip = TARGET_PAGE_BITS + sc[pass].ptidxbits *
+ (sc[pass].levels - 1 - sc[pass].step);
+ const unsigned idx = (addr >> va_skip) & ((1 << va_bits) - 1);
+ const dma_addr_t pte_addr = base + idx * sc[pass].ptesize;
+ const bool ade =
+ ctx->tc & (pass ? RISCV_IOMMU_DC_TC_GADE : RISCV_IOMMU_DC_TC_SADE);
+
+ /* Address range check before first level lookup */
+ if (!sc[pass].step) {
+ const uint64_t va_mask = (1ULL << (va_skip + va_bits)) - 1;
+ if ((addr & va_mask) != addr) {
+ return RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED;
+ }
+ }
+
+ /* Read page table entry */
+ if (dma_memory_read(s->target_as, pte_addr, &pte,
+ sc[pass].ptesize, MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ return (iotlb->perm & IOMMU_WO) ? RISCV_IOMMU_FQ_CAUSE_WR_FAULT
+ : RISCV_IOMMU_FQ_CAUSE_RD_FAULT;
+ }
+
+ if (sc[pass].ptesize == 4) {
+ pte = (uint64_t) le32_to_cpu(*((uint32_t *)&pte));
+ } else {
+ pte = le64_to_cpu(pte);
+ }
+
+ sc[pass].step++;
+ hwaddr ppn = pte >> PTE_PPN_SHIFT;
+
+ if (!(pte & PTE_V)) {
+ break; /* Invalid PTE */
+ } else if (!(pte & (PTE_R | PTE_W | PTE_X))) {
+ base = PPN_PHYS(ppn); /* Inner PTE, continue walking */
+ } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
+ break; /* Reserved leaf PTE flags: PTE_W */
+ } else if ((pte & (PTE_R | PTE_W | PTE_X)) == (PTE_W | PTE_X)) {
+ break; /* Reserved leaf PTE flags: PTE_W + PTE_X */
+ } else if (ppn & ((1ULL << (va_skip - TARGET_PAGE_BITS)) - 1)) {
+ break; /* Misaligned PPN */
+ } else if ((iotlb->perm & IOMMU_RO) && !(pte & PTE_R)) {
+ break; /* Read access check failed */
+ } else if ((iotlb->perm & IOMMU_WO) && !(pte & PTE_W)) {
+ break; /* Write access check failed */
+ } else if ((iotlb->perm & IOMMU_RO) && !ade && !(pte & PTE_A)) {
+ break; /* Access bit not set */
+ } else if ((iotlb->perm & IOMMU_WO) && !ade && !(pte & PTE_D)) {
+ break; /* Dirty bit not set */
+ } else {
+ /* Leaf PTE, translation completed. */
+ sc[pass].step = sc[pass].levels;
+ base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
+ /* Update address mask based on smallest translation granularity */
+ iotlb->addr_mask &= (1ULL << va_skip) - 1;
+ /* Continue with S-Stage translation? */
+ if (pass && sc[0].step != sc[0].levels) {
+ pass = S_STAGE;
+ addr = iotlb->iova;
+ continue;
+ }
+ /* Translation phase completed (GPA or SPA) */
+ iotlb->translated_addr = base;
+ iotlb->perm = (pte & PTE_W) ? ((pte & PTE_R) ? IOMMU_RW : IOMMU_WO)
+ : IOMMU_RO;
+
+ /* Check MSI GPA address match */
+ if (pass == S_STAGE && (iotlb->perm & IOMMU_WO) &&
+ riscv_iommu_msi_check(s, ctx, base)) {
+ /* Trap MSI writes and return GPA address. */
+ iotlb->target_as = &s->trap_as;
+ iotlb->addr_mask = ~TARGET_PAGE_MASK;
+ return 0;
+ }
+
+ /* Continue with G-Stage translation? */
+ if (!pass && en_g) {
+ pass = G_STAGE;
+ addr = base;
+ base = gatp;
+ sc[pass].step = 0;
+ continue;
+ }
+
+ return 0;
+ }
+
+ if (sc[pass].step == sc[pass].levels) {
+ break; /* Can't find leaf PTE */
+ }
+
+ /* Continue with G-Stage translation? */
+ if (!pass && en_g) {
+ pass = G_STAGE;
+ addr = base;
+ base = gatp;
+ sc[pass].step = 0;
+ }
+ } while (1);
+
+ return (iotlb->perm & IOMMU_WO) ?
+ (pass ? RISCV_IOMMU_FQ_CAUSE_WR_FAULT_VS :
+ RISCV_IOMMU_FQ_CAUSE_WR_FAULT_S) :
+ (pass ? RISCV_IOMMU_FQ_CAUSE_RD_FAULT_VS :
+ RISCV_IOMMU_FQ_CAUSE_RD_FAULT_S);
+}
+
+static void riscv_iommu_report_fault(RISCVIOMMUState *s,
+ RISCVIOMMUContext *ctx,
+ uint32_t fault_type, uint32_t cause,
+ bool pv,
+ uint64_t iotval, uint64_t iotval2)
+{
+ struct riscv_iommu_fq_record ev = { 0 };
+
+ if (ctx->tc & RISCV_IOMMU_DC_TC_DTF) {
+ switch (cause) {
+ case RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED:
+ case RISCV_IOMMU_FQ_CAUSE_DDT_LOAD_FAULT:
+ case RISCV_IOMMU_FQ_CAUSE_DDT_INVALID:
+ case RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED:
+ case RISCV_IOMMU_FQ_CAUSE_DDT_CORRUPTED:
+ case RISCV_IOMMU_FQ_CAUSE_INTERNAL_DP_ERROR:
+ case RISCV_IOMMU_FQ_CAUSE_MSI_WR_FAULT:
+ break;
+ default:
+ /* DTF prevents reporting a fault for this given cause */
+ return;
+ }
+ }
+
+ ev.hdr = set_field(ev.hdr, RISCV_IOMMU_FQ_HDR_CAUSE, cause);
+ ev.hdr = set_field(ev.hdr, RISCV_IOMMU_FQ_HDR_TTYPE, fault_type);
+ ev.hdr = set_field(ev.hdr, RISCV_IOMMU_FQ_HDR_DID, ctx->devid);
+ ev.hdr = set_field(ev.hdr, RISCV_IOMMU_FQ_HDR_PV, true);
+
+ if (pv) {
+ ev.hdr = set_field(ev.hdr, RISCV_IOMMU_FQ_HDR_PID, ctx->process_id);
+ }
+
+ ev.iotval = iotval;
+ ev.iotval2 = iotval2;
+
+ riscv_iommu_fault(s, &ev);
+}
+
+/* Redirect MSI write for given GPA. */
+static MemTxResult riscv_iommu_msi_write(RISCVIOMMUState *s,
+ RISCVIOMMUContext *ctx, uint64_t gpa, uint64_t data,
+ unsigned size, MemTxAttrs attrs)
+{
+ MemTxResult res;
+ dma_addr_t addr;
+ uint64_t intn;
+ uint32_t n190;
+ uint64_t pte[2];
+ int fault_type = RISCV_IOMMU_FQ_TTYPE_UADDR_WR;
+ int cause;
+
+ /* Interrupt File Number */
+ intn = _pext_u64(PPN_DOWN(gpa), ctx->msi_addr_mask);
+ if (intn >= 256) {
+ /* Interrupt file number out of range */
+ res = MEMTX_ACCESS_ERROR;
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_LOAD_FAULT;
+ goto err;
+ }
+
+ /* fetch MSI PTE */
+ addr = PPN_PHYS(get_field(ctx->msiptp, RISCV_IOMMU_DC_MSIPTP_PPN));
+ addr = addr | (intn * sizeof(pte));
+ res = dma_memory_read(s->target_as, addr, &pte, sizeof(pte),
+ MEMTXATTRS_UNSPECIFIED);
+ if (res != MEMTX_OK) {
+ if (res == MEMTX_DECODE_ERROR) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_PT_CORRUPTED;
+ } else {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_LOAD_FAULT;
+ }
+ goto err;
+ }
+
+ le64_to_cpus(&pte[0]);
+ le64_to_cpus(&pte[1]);
+
+ if (!(pte[0] & RISCV_IOMMU_MSI_PTE_V) || (pte[0] & RISCV_IOMMU_MSI_PTE_C)) {
+ /*
+ * The spec mentions that: "If msipte.C == 1, then further
+ * processing to interpret the PTE is implementation
+ * defined.". We'll abort with cause = 262 for this
+ * case too.
+ */
+ res = MEMTX_ACCESS_ERROR;
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_INVALID;
+ goto err;
+ }
+
+ switch (get_field(pte[0], RISCV_IOMMU_MSI_PTE_M)) {
+ case RISCV_IOMMU_MSI_PTE_M_BASIC:
+ /* MSI Pass-through mode */
+ addr = PPN_PHYS(get_field(pte[0], RISCV_IOMMU_MSI_PTE_PPN));
+
+ trace_riscv_iommu_msi(s->parent_obj.id, PCI_BUS_NUM(ctx->devid),
+ PCI_SLOT(ctx->devid), PCI_FUNC(ctx->devid),
+ gpa, addr);
+
+ res = dma_memory_write(s->target_as, addr, &data, size, attrs);
+ if (res != MEMTX_OK) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_WR_FAULT;
+ goto err;
+ }
+
+ return MEMTX_OK;
+ case RISCV_IOMMU_MSI_PTE_M_MRIF:
+ /* MRIF mode, continue. */
+ break;
+ default:
+ res = MEMTX_ACCESS_ERROR;
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_MISCONFIGURED;
+ goto err;
+ }
+
+ /*
+ * Report an error for interrupt identities exceeding the maximum allowed
+ * for an IMSIC interrupt file (2047) or destination address is not 32-bit
+ * aligned. See IOMMU Specification, Chapter 2.3. MSI page tables.
+ */
+ if ((data > 2047) || (gpa & 3)) {
+ res = MEMTX_ACCESS_ERROR;
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_MISCONFIGURED;
+ goto err;
+ }
+
+ /* MSI MRIF mode, non atomic pending bit update */
+
+ /* MRIF pending bit address */
+ addr = get_field(pte[0], RISCV_IOMMU_MSI_PTE_MRIF_ADDR) << 9;
+ addr = addr | ((data & 0x7c0) >> 3);
+
+ trace_riscv_iommu_msi(s->parent_obj.id, PCI_BUS_NUM(ctx->devid),
+ PCI_SLOT(ctx->devid), PCI_FUNC(ctx->devid),
+ gpa, addr);
+
+ /* MRIF pending bit mask */
+ data = 1ULL << (data & 0x03f);
+ res = dma_memory_read(s->target_as, addr, &intn, sizeof(intn), attrs);
+ if (res != MEMTX_OK) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_LOAD_FAULT;
+ goto err;
+ }
+
+ intn = intn | data;
+ res = dma_memory_write(s->target_as, addr, &intn, sizeof(intn), attrs);
+ if (res != MEMTX_OK) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_WR_FAULT;
+ goto err;
+ }
+
+ /* Get MRIF enable bits */
+ addr = addr + sizeof(intn);
+ res = dma_memory_read(s->target_as, addr, &intn, sizeof(intn), attrs);
+ if (res != MEMTX_OK) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_LOAD_FAULT;
+ goto err;
+ }
+
+ if (!(intn & data)) {
+ /* notification disabled, MRIF update completed. */
+ return MEMTX_OK;
+ }
+
+ /* Send notification message */
+ addr = PPN_PHYS(get_field(pte[1], RISCV_IOMMU_MSI_MRIF_NPPN));
+ n190 = get_field(pte[1], RISCV_IOMMU_MSI_MRIF_NID) |
+ (get_field(pte[1], RISCV_IOMMU_MSI_MRIF_NID_MSB) << 10);
+
+ res = dma_memory_write(s->target_as, addr, &n190, sizeof(n190), attrs);
+ if (res != MEMTX_OK) {
+ cause = RISCV_IOMMU_FQ_CAUSE_MSI_WR_FAULT;
+ goto err;
+ }
+
+ trace_riscv_iommu_mrif_notification(s->parent_obj.id, n190, addr);
+
+ return MEMTX_OK;
+
+err:
+ riscv_iommu_report_fault(s, ctx, fault_type, cause,
+ !!ctx->process_id, 0, 0);
+ return res;
+}
+
+/*
+ * Check device context configuration as described by the
+ * riscv-iommu spec section "Device-context configuration
+ * checks".
+ */
+static bool riscv_iommu_validate_device_ctx(RISCVIOMMUState *s,
+ RISCVIOMMUContext *ctx)
+{
+ uint32_t fsc_mode, msi_mode;
+
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_EN_PRI) &&
+ ctx->tc & RISCV_IOMMU_DC_TC_PRPR) {
+ return false;
+ }
+
+ if (!(s->cap & RISCV_IOMMU_CAP_T2GPA) &&
+ ctx->tc & RISCV_IOMMU_DC_TC_T2GPA) {
+ return false;
+ }
+
+ if (s->cap & RISCV_IOMMU_CAP_MSI_FLAT) {
+ msi_mode = get_field(ctx->msiptp, RISCV_IOMMU_DC_MSIPTP_MODE);
+
+ if (msi_mode != RISCV_IOMMU_DC_MSIPTP_MODE_OFF &&
+ msi_mode != RISCV_IOMMU_DC_MSIPTP_MODE_FLAT) {
+ return false;
+ }
+ }
+
+ fsc_mode = get_field(ctx->satp, RISCV_IOMMU_DC_FSC_MODE);
+
+ if (ctx->tc & RISCV_IOMMU_DC_TC_PDTV) {
+ switch (fsc_mode) {
+ case RISCV_IOMMU_DC_FSC_PDTP_MODE_PD8:
+ if (!(s->cap & RISCV_IOMMU_CAP_PD8)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_PDTP_MODE_PD17:
+ if (!(s->cap & RISCV_IOMMU_CAP_PD17)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_PDTP_MODE_PD20:
+ if (!(s->cap & RISCV_IOMMU_CAP_PD20)) {
+ return false;
+ }
+ break;
+ }
+ } else {
+ /* DC.tc.PDTV is 0 */
+ if (ctx->tc & RISCV_IOMMU_DC_TC_DPE) {
+ return false;
+ }
+
+ if (ctx->tc & RISCV_IOMMU_DC_TC_SXL) {
+ if (fsc_mode == RISCV_IOMMU_CAP_SV32 &&
+ !(s->cap & RISCV_IOMMU_CAP_SV32)) {
+ return false;
+ }
+ } else {
+ switch (fsc_mode) {
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV39:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV39)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV48:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV48)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV57:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV57)) {
+ return false;
+ }
+ break;
+ }
+ }
+ }
+
+ /*
+ * CAP_END is always zero (only one endianess). FCTL_BE is
+ * always zero (little-endian accesses). Thus TC_SBE must
+ * always be LE, i.e. zero.
+ */
+ if (ctx->tc & RISCV_IOMMU_DC_TC_SBE) {
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Validate process context (PC) according to section
+ * "Process-context configuration checks".
+ */
+static bool riscv_iommu_validate_process_ctx(RISCVIOMMUState *s,
+ RISCVIOMMUContext *ctx)
+{
+ uint32_t mode;
+
+ if (get_field(ctx->ta, RISCV_IOMMU_PC_TA_RESERVED)) {
+ return false;
+ }
+
+ if (get_field(ctx->satp, RISCV_IOMMU_PC_FSC_RESERVED)) {
+ return false;
+ }
+
+ mode = get_field(ctx->satp, RISCV_IOMMU_DC_FSC_MODE);
+ switch (mode) {
+ case RISCV_IOMMU_DC_FSC_MODE_BARE:
+ /* sv39 and sv32 modes have the same value (8) */
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV39:
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV48:
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV57:
+ break;
+ default:
+ return false;
+ }
+
+ if (ctx->tc & RISCV_IOMMU_DC_TC_SXL) {
+ if (mode == RISCV_IOMMU_CAP_SV32 &&
+ !(s->cap & RISCV_IOMMU_CAP_SV32)) {
+ return false;
+ }
+ } else {
+ switch (mode) {
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV39:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV39)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV48:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV48)) {
+ return false;
+ }
+ break;
+ case RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV57:
+ if (!(s->cap & RISCV_IOMMU_CAP_SV57)) {
+ return false;
+ }
+ break;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * RISC-V IOMMU Device Context Loopkup - Device Directory Tree Walk
+ *
+ * @s : IOMMU Device State
+ * @ctx : Device Translation Context with devid and process_id set.
+ * @return : success or fault code.
+ */
+static int riscv_iommu_ctx_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx)
+{
+ const uint64_t ddtp = s->ddtp;
+ unsigned mode = get_field(ddtp, RISCV_IOMMU_DDTP_MODE);
+ dma_addr_t addr = PPN_PHYS(get_field(ddtp, RISCV_IOMMU_DDTP_PPN));
+ struct riscv_iommu_dc dc;
+ /* Device Context format: 0: extended (64 bytes) | 1: base (32 bytes) */
+ const int dc_fmt = !s->enable_msi;
+ const size_t dc_len = sizeof(dc) >> dc_fmt;
+ unsigned depth;
+ uint64_t de;
+
+ switch (mode) {
+ case RISCV_IOMMU_DDTP_MODE_OFF:
+ return RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED;
+
+ case RISCV_IOMMU_DDTP_MODE_BARE:
+ /* mock up pass-through translation context */
+ ctx->gatp = set_field(0, RISCV_IOMMU_ATP_MODE_FIELD,
+ RISCV_IOMMU_DC_IOHGATP_MODE_BARE);
+ ctx->satp = set_field(0, RISCV_IOMMU_ATP_MODE_FIELD,
+ RISCV_IOMMU_DC_FSC_MODE_BARE);
+ ctx->tc = RISCV_IOMMU_DC_TC_V;
+ ctx->ta = 0;
+ ctx->msiptp = 0;
+ return 0;
+
+ case RISCV_IOMMU_DDTP_MODE_1LVL:
+ depth = 0;
+ break;
+
+ case RISCV_IOMMU_DDTP_MODE_2LVL:
+ depth = 1;
+ break;
+
+ case RISCV_IOMMU_DDTP_MODE_3LVL:
+ depth = 2;
+ break;
+
+ default:
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+
+ /*
+ * Check supported device id width (in bits).
+ * See IOMMU Specification, Chapter 6. Software guidelines.
+ * - if extended device-context format is used:
+ * 1LVL: 6, 2LVL: 15, 3LVL: 24
+ * - if base device-context format is used:
+ * 1LVL: 7, 2LVL: 16, 3LVL: 24
+ */
+ if (ctx->devid >= (1 << (depth * 9 + 6 + (dc_fmt && depth != 2)))) {
+ return RISCV_IOMMU_FQ_CAUSE_TTYPE_BLOCKED;
+ }
+
+ /* Device directory tree walk */
+ for (; depth-- > 0; ) {
+ /*
+ * Select device id index bits based on device directory tree level
+ * and device context format.
+ * See IOMMU Specification, Chapter 2. Data Structures.
+ * - if extended device-context format is used:
+ * device index: [23:15][14:6][5:0]
+ * - if base device-context format is used:
+ * device index: [23:16][15:7][6:0]
+ */
+ const int split = depth * 9 + 6 + dc_fmt;
+ addr |= ((ctx->devid >> split) << 3) & ~TARGET_PAGE_MASK;
+ if (dma_memory_read(s->target_as, addr, &de, sizeof(de),
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_LOAD_FAULT;
+ }
+ le64_to_cpus(&de);
+ if (!(de & RISCV_IOMMU_DDTE_VALID)) {
+ /* invalid directory entry */
+ return RISCV_IOMMU_FQ_CAUSE_DDT_INVALID;
+ }
+ if (de & ~(RISCV_IOMMU_DDTE_PPN | RISCV_IOMMU_DDTE_VALID)) {
+ /* reserved bits set */
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+ addr = PPN_PHYS(get_field(de, RISCV_IOMMU_DDTE_PPN));
+ }
+
+ /* index into device context entry page */
+ addr |= (ctx->devid * dc_len) & ~TARGET_PAGE_MASK;
+
+ memset(&dc, 0, sizeof(dc));
+ if (dma_memory_read(s->target_as, addr, &dc, dc_len,
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_LOAD_FAULT;
+ }
+
+ /* Set translation context. */
+ ctx->tc = le64_to_cpu(dc.tc);
+ ctx->gatp = le64_to_cpu(dc.iohgatp);
+ ctx->satp = le64_to_cpu(dc.fsc);
+ ctx->ta = le64_to_cpu(dc.ta);
+ ctx->msiptp = le64_to_cpu(dc.msiptp);
+ ctx->msi_addr_mask = le64_to_cpu(dc.msi_addr_mask);
+ ctx->msi_addr_pattern = le64_to_cpu(dc.msi_addr_pattern);
+
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_V)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_INVALID;
+ }
+
+ if (!riscv_iommu_validate_device_ctx(s, ctx)) {
+ return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
+ }
+
+ /* FSC field checks */
+ mode = get_field(ctx->satp, RISCV_IOMMU_DC_FSC_MODE);
+ addr = PPN_PHYS(get_field(ctx->satp, RISCV_IOMMU_DC_FSC_PPN));
+
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_PDTV)) {
+ if (ctx->process_id != RISCV_IOMMU_NOPROCID) {
+ /* PID is disabled */
+ return RISCV_IOMMU_FQ_CAUSE_TTYPE_BLOCKED;
+ }
+ if (mode > RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV57) {
+ /* Invalid translation mode */
+ return RISCV_IOMMU_FQ_CAUSE_DDT_INVALID;
+ }
+ return 0;
+ }
+
+ if (ctx->process_id == RISCV_IOMMU_NOPROCID) {
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_DPE)) {
+ /* No default process_id enabled, set BARE mode */
+ ctx->satp = 0ULL;
+ return 0;
+ } else {
+ /* Use default process_id #0 */
+ ctx->process_id = 0;
+ }
+ }
+
+ if (mode == RISCV_IOMMU_DC_FSC_MODE_BARE) {
+ /* No S-Stage translation, done. */
+ return 0;
+ }
+
+ /* FSC.TC.PDTV enabled */
+ if (mode > RISCV_IOMMU_DC_FSC_PDTP_MODE_PD20) {
+ /* Invalid PDTP.MODE */
+ return RISCV_IOMMU_FQ_CAUSE_PDT_MISCONFIGURED;
+ }
+
+ for (depth = mode - RISCV_IOMMU_DC_FSC_PDTP_MODE_PD8; depth-- > 0; ) {
+ /*
+ * Select process id index bits based on process directory tree
+ * level. See IOMMU Specification, 2.2. Process-Directory-Table.
+ */
+ const int split = depth * 9 + 8;
+ addr |= ((ctx->process_id >> split) << 3) & ~TARGET_PAGE_MASK;
+ if (dma_memory_read(s->target_as, addr, &de, sizeof(de),
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ return RISCV_IOMMU_FQ_CAUSE_PDT_LOAD_FAULT;
+ }
+ le64_to_cpus(&de);
+ if (!(de & RISCV_IOMMU_PC_TA_V)) {
+ return RISCV_IOMMU_FQ_CAUSE_PDT_INVALID;
+ }
+ addr = PPN_PHYS(get_field(de, RISCV_IOMMU_PC_FSC_PPN));
+ }
+
+ /* Leaf entry in PDT */
+ addr |= (ctx->process_id << 4) & ~TARGET_PAGE_MASK;
+ if (dma_memory_read(s->target_as, addr, &dc.ta, sizeof(uint64_t) * 2,
+ MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+ return RISCV_IOMMU_FQ_CAUSE_PDT_LOAD_FAULT;
+ }
+
+ /* Use FSC and TA from process directory entry. */
+ ctx->ta = le64_to_cpu(dc.ta);
+ ctx->satp = le64_to_cpu(dc.fsc);
+
+ if (!(ctx->ta & RISCV_IOMMU_PC_TA_V)) {
+ return RISCV_IOMMU_FQ_CAUSE_PDT_INVALID;
+ }
+
+ if (!riscv_iommu_validate_process_ctx(s, ctx)) {
+ return RISCV_IOMMU_FQ_CAUSE_PDT_MISCONFIGURED;
+ }
+
+ return 0;
+}
+
+/* Translation Context cache support */
+static gboolean __ctx_equal(gconstpointer v1, gconstpointer v2)
+{
+ RISCVIOMMUContext *c1 = (RISCVIOMMUContext *) v1;
+ RISCVIOMMUContext *c2 = (RISCVIOMMUContext *) v2;
+ return c1->devid == c2->devid &&
+ c1->process_id == c2->process_id;
+}
+
+static guint __ctx_hash(gconstpointer v)
+{
+ RISCVIOMMUContext *ctx = (RISCVIOMMUContext *) v;
+ /*
+ * Generate simple hash of (process_id, devid)
+ * assuming 24-bit wide devid.
+ */
+ return (guint)(ctx->devid) + ((guint)(ctx->process_id) << 24);
+}
+
+static void __ctx_inval_devid_procid(gpointer key, gpointer value,
+ gpointer data)
+{
+ RISCVIOMMUContext *ctx = (RISCVIOMMUContext *) value;
+ RISCVIOMMUContext *arg = (RISCVIOMMUContext *) data;
+ if (ctx->tc & RISCV_IOMMU_DC_TC_V &&
+ ctx->devid == arg->devid &&
+ ctx->process_id == arg->process_id) {
+ ctx->tc &= ~RISCV_IOMMU_DC_TC_V;
+ }
+}
+
+static void __ctx_inval_devid(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUContext *ctx = (RISCVIOMMUContext *) value;
+ RISCVIOMMUContext *arg = (RISCVIOMMUContext *) data;
+ if (ctx->tc & RISCV_IOMMU_DC_TC_V &&
+ ctx->devid == arg->devid) {
+ ctx->tc &= ~RISCV_IOMMU_DC_TC_V;
+ }
+}
+
+static void __ctx_inval_all(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUContext *ctx = (RISCVIOMMUContext *) value;
+ if (ctx->tc & RISCV_IOMMU_DC_TC_V) {
+ ctx->tc &= ~RISCV_IOMMU_DC_TC_V;
+ }
+}
+
+static void riscv_iommu_ctx_inval(RISCVIOMMUState *s, GHFunc func,
+ uint32_t devid, uint32_t process_id)
+{
+ GHashTable *ctx_cache;
+ RISCVIOMMUContext key = {
+ .devid = devid,
+ .process_id = process_id,
+ };
+ ctx_cache = g_hash_table_ref(s->ctx_cache);
+ qemu_mutex_lock(&s->ctx_lock);
+ g_hash_table_foreach(ctx_cache, func, &key);
+ qemu_mutex_unlock(&s->ctx_lock);
+ g_hash_table_unref(ctx_cache);
+}
+
+/* Find or allocate translation context for a given {device_id, process_id} */
+static RISCVIOMMUContext *riscv_iommu_ctx(RISCVIOMMUState *s,
+ unsigned devid, unsigned process_id,
+ void **ref)
+{
+ GHashTable *ctx_cache;
+ RISCVIOMMUContext *ctx;
+ RISCVIOMMUContext key = {
+ .devid = devid,
+ .process_id = process_id,
+ };
+
+ ctx_cache = g_hash_table_ref(s->ctx_cache);
+ qemu_mutex_lock(&s->ctx_lock);
+ ctx = g_hash_table_lookup(ctx_cache, &key);
+ qemu_mutex_unlock(&s->ctx_lock);
+
+ if (ctx && (ctx->tc & RISCV_IOMMU_DC_TC_V)) {
+ *ref = ctx_cache;
+ return ctx;
+ }
+
+ ctx = g_new0(RISCVIOMMUContext, 1);
+ ctx->devid = devid;
+ ctx->process_id = process_id;
+
+ int fault = riscv_iommu_ctx_fetch(s, ctx);
+ if (!fault) {
+ qemu_mutex_lock(&s->ctx_lock);
+ if (g_hash_table_size(ctx_cache) >= LIMIT_CACHE_CTX) {
+ g_hash_table_unref(ctx_cache);
+ ctx_cache = g_hash_table_new_full(__ctx_hash, __ctx_equal,
+ g_free, NULL);
+ g_hash_table_ref(ctx_cache);
+ g_hash_table_unref(qatomic_xchg(&s->ctx_cache, ctx_cache));
+ }
+ g_hash_table_add(ctx_cache, ctx);
+ qemu_mutex_unlock(&s->ctx_lock);
+ *ref = ctx_cache;
+ return ctx;
+ }
+
+ g_hash_table_unref(ctx_cache);
+ *ref = NULL;
+
+ riscv_iommu_report_fault(s, ctx, RISCV_IOMMU_FQ_TTYPE_UADDR_RD,
+ fault, !!process_id, 0, 0);
+
+ g_free(ctx);
+ return NULL;
+}
+
+static void riscv_iommu_ctx_put(RISCVIOMMUState *s, void *ref)
+{
+ if (ref) {
+ g_hash_table_unref((GHashTable *)ref);
+ }
+}
+
+/* Find or allocate address space for a given device */
+static AddressSpace *riscv_iommu_space(RISCVIOMMUState *s, uint32_t devid)
+{
+ RISCVIOMMUSpace *as;
+
+ /* FIXME: PCIe bus remapping for attached endpoints. */
+ devid |= s->bus << 8;
+
+ qemu_mutex_lock(&s->core_lock);
+ QLIST_FOREACH(as, &s->spaces, list) {
+ if (as->devid == devid) {
+ break;
+ }
+ }
+ qemu_mutex_unlock(&s->core_lock);
+
+ if (as == NULL) {
+ char name[64];
+ as = g_new0(RISCVIOMMUSpace, 1);
+
+ as->iommu = s;
+ as->devid = devid;
+
+ snprintf(name, sizeof(name), "riscv-iommu-%04x:%02x.%d-iova",
+ PCI_BUS_NUM(as->devid), PCI_SLOT(as->devid), PCI_FUNC(as->devid));
+
+ /* IOVA address space, untranslated addresses */
+ memory_region_init_iommu(&as->iova_mr, sizeof(as->iova_mr),
+ TYPE_RISCV_IOMMU_MEMORY_REGION,
+ OBJECT(as), "riscv_iommu", UINT64_MAX);
+ address_space_init(&as->iova_as, MEMORY_REGION(&as->iova_mr), name);
+
+ qemu_mutex_lock(&s->core_lock);
+ QLIST_INSERT_HEAD(&s->spaces, as, list);
+ qemu_mutex_unlock(&s->core_lock);
+
+ trace_riscv_iommu_new(s->parent_obj.id, PCI_BUS_NUM(as->devid),
+ PCI_SLOT(as->devid), PCI_FUNC(as->devid));
+ }
+ return &as->iova_as;
+}
+
+static int riscv_iommu_translate(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
+ IOMMUTLBEntry *iotlb)
+{
+ bool enable_pid;
+ bool enable_pri;
+ int fault;
+
+ /*
+ * TC[32] is reserved for custom extensions, used here to temporarily
+ * enable automatic page-request generation for ATS queries.
+ */
+ enable_pri = (iotlb->perm == IOMMU_NONE) && (ctx->tc & BIT_ULL(32));
+ enable_pid = (ctx->tc & RISCV_IOMMU_DC_TC_PDTV);
+
+ /* Translate using device directory / page table information. */
+ fault = riscv_iommu_spa_fetch(s, ctx, iotlb);
+
+ if (enable_pri && fault) {
+ struct riscv_iommu_pq_record pr = {0};
+ if (enable_pid) {
+ pr.hdr = set_field(RISCV_IOMMU_PREQ_HDR_PV,
+ RISCV_IOMMU_PREQ_HDR_PID, ctx->process_id);
+ }
+ pr.hdr = set_field(pr.hdr, RISCV_IOMMU_PREQ_HDR_DID, ctx->devid);
+ pr.payload = (iotlb->iova & TARGET_PAGE_MASK) |
+ RISCV_IOMMU_PREQ_PAYLOAD_M;
+ riscv_iommu_pri(s, &pr);
+ return fault;
+ }
+
+ if (fault) {
+ unsigned ttype;
+
+ if (iotlb->perm & IOMMU_RW) {
+ ttype = RISCV_IOMMU_FQ_TTYPE_UADDR_WR;
+ } else {
+ ttype = RISCV_IOMMU_FQ_TTYPE_UADDR_RD;
+ }
+
+ riscv_iommu_report_fault(s, ctx, ttype, fault, enable_pid,
+ iotlb->iova, iotlb->translated_addr);
+ return fault;
+ }
+
+ return 0;
+}
+
+/* IOMMU Command Interface */
+static MemTxResult riscv_iommu_iofence(RISCVIOMMUState *s, bool notify,
+ uint64_t addr, uint32_t data)
+{
+ /*
+ * ATS processing in this implementation of the IOMMU is synchronous,
+ * no need to wait for completions here.
+ */
+ if (!notify) {
+ return MEMTX_OK;
+ }
+
+ return dma_memory_write(s->target_as, addr, &data, sizeof(data),
+ MEMTXATTRS_UNSPECIFIED);
+}
+
+static void riscv_iommu_process_ddtp(RISCVIOMMUState *s)
+{
+ uint64_t old_ddtp = s->ddtp;
+ uint64_t new_ddtp = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_DDTP);
+ unsigned new_mode = get_field(new_ddtp, RISCV_IOMMU_DDTP_MODE);
+ unsigned old_mode = get_field(old_ddtp, RISCV_IOMMU_DDTP_MODE);
+ bool ok = false;
+
+ /*
+ * Check for allowed DDTP.MODE transitions:
+ * {OFF, BARE} -> {OFF, BARE, 1LVL, 2LVL, 3LVL}
+ * {1LVL, 2LVL, 3LVL} -> {OFF, BARE}
+ */
+ if (new_mode == old_mode ||
+ new_mode == RISCV_IOMMU_DDTP_MODE_OFF ||
+ new_mode == RISCV_IOMMU_DDTP_MODE_BARE) {
+ ok = true;
+ } else if (new_mode == RISCV_IOMMU_DDTP_MODE_1LVL ||
+ new_mode == RISCV_IOMMU_DDTP_MODE_2LVL ||
+ new_mode == RISCV_IOMMU_DDTP_MODE_3LVL) {
+ ok = old_mode == RISCV_IOMMU_DDTP_MODE_OFF ||
+ old_mode == RISCV_IOMMU_DDTP_MODE_BARE;
+ }
+
+ if (ok) {
+ /* clear reserved and busy bits, report back sanitized version */
+ new_ddtp = set_field(new_ddtp & RISCV_IOMMU_DDTP_PPN,
+ RISCV_IOMMU_DDTP_MODE, new_mode);
+ } else {
+ new_ddtp = old_ddtp;
+ }
+ s->ddtp = new_ddtp;
+
+ riscv_iommu_reg_set64(s, RISCV_IOMMU_REG_DDTP, new_ddtp);
+}
+
+/* Command function and opcode field. */
+#define RISCV_IOMMU_CMD(func, op) (((func) << 7) | (op))
+
+static void riscv_iommu_process_cq_tail(RISCVIOMMUState *s)
+{
+ struct riscv_iommu_command cmd;
+ MemTxResult res;
+ dma_addr_t addr;
+ uint32_t tail, head, ctrl;
+ uint64_t cmd_opcode;
+ GHFunc func;
+
+ ctrl = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_CQCSR);
+ tail = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_CQT) & s->cq_mask;
+ head = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_CQH) & s->cq_mask;
+
+ /* Check for pending error or queue processing disabled */
+ if (!(ctrl & RISCV_IOMMU_CQCSR_CQON) ||
+ !!(ctrl & (RISCV_IOMMU_CQCSR_CMD_ILL | RISCV_IOMMU_CQCSR_CQMF))) {
+ return;
+ }
+
+ while (tail != head) {
+ addr = s->cq_addr + head * sizeof(cmd);
+ res = dma_memory_read(s->target_as, addr, &cmd, sizeof(cmd),
+ MEMTXATTRS_UNSPECIFIED);
+
+ if (res != MEMTX_OK) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_CQCSR,
+ RISCV_IOMMU_CQCSR_CQMF, 0);
+ goto fault;
+ }
+
+ trace_riscv_iommu_cmd(s->parent_obj.id, cmd.dword0, cmd.dword1);
+
+ cmd_opcode = get_field(cmd.dword0,
+ RISCV_IOMMU_CMD_OPCODE | RISCV_IOMMU_CMD_FUNC);
+
+ switch (cmd_opcode) {
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IOFENCE_FUNC_C,
+ RISCV_IOMMU_CMD_IOFENCE_OPCODE):
+ res = riscv_iommu_iofence(s,
+ cmd.dword0 & RISCV_IOMMU_CMD_IOFENCE_AV, cmd.dword1,
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IOFENCE_DATA));
+
+ if (res != MEMTX_OK) {
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_CQCSR,
+ RISCV_IOMMU_CQCSR_CQMF, 0);
+ goto fault;
+ }
+ break;
+
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IOTINVAL_FUNC_GVMA,
+ RISCV_IOMMU_CMD_IOTINVAL_OPCODE):
+ if (cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_PSCV) {
+ /* illegal command arguments IOTINVAL.GVMA & PSCV == 1 */
+ goto cmd_ill;
+ }
+ /* translation cache not implemented yet */
+ break;
+
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IOTINVAL_FUNC_VMA,
+ RISCV_IOMMU_CMD_IOTINVAL_OPCODE):
+ /* translation cache not implemented yet */
+ break;
+
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IODIR_FUNC_INVAL_DDT,
+ RISCV_IOMMU_CMD_IODIR_OPCODE):
+ if (!(cmd.dword0 & RISCV_IOMMU_CMD_IODIR_DV)) {
+ /* invalidate all device context cache mappings */
+ func = __ctx_inval_all;
+ } else {
+ /* invalidate all device context matching DID */
+ func = __ctx_inval_devid;
+ }
+ riscv_iommu_ctx_inval(s, func,
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IODIR_DID), 0);
+ break;
+
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IODIR_FUNC_INVAL_PDT,
+ RISCV_IOMMU_CMD_IODIR_OPCODE):
+ if (!(cmd.dword0 & RISCV_IOMMU_CMD_IODIR_DV)) {
+ /* illegal command arguments IODIR_PDT & DV == 0 */
+ goto cmd_ill;
+ } else {
+ func = __ctx_inval_devid_procid;
+ }
+ riscv_iommu_ctx_inval(s, func,
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IODIR_DID),
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IODIR_PID));
+ break;
+
+ default:
+ cmd_ill:
+ /* Invalid instruction, do not advance instruction index. */
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_CQCSR,
+ RISCV_IOMMU_CQCSR_CMD_ILL, 0);
+ goto fault;
+ }
+
+ /* Advance and update head pointer after command completes. */
+ head = (head + 1) & s->cq_mask;
+ riscv_iommu_reg_set32(s, RISCV_IOMMU_REG_CQH, head);
+ }
+ return;
+
+fault:
+ if (ctrl & RISCV_IOMMU_CQCSR_CIE) {
+ riscv_iommu_notify(s, RISCV_IOMMU_INTR_CQ);
+ }
+}
+
+static void riscv_iommu_process_cq_control(RISCVIOMMUState *s)
+{
+ uint64_t base;
+ uint32_t ctrl_set = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_CQCSR);
+ uint32_t ctrl_clr;
+ bool enable = !!(ctrl_set & RISCV_IOMMU_CQCSR_CQEN);
+ bool active = !!(ctrl_set & RISCV_IOMMU_CQCSR_CQON);
+
+ if (enable && !active) {
+ base = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_CQB);
+ s->cq_mask = (2ULL << get_field(base, RISCV_IOMMU_CQB_LOG2SZ)) - 1;
+ s->cq_addr = PPN_PHYS(get_field(base, RISCV_IOMMU_CQB_PPN));
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_CQT], ~s->cq_mask);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_CQH], 0);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_CQT], 0);
+ ctrl_set = RISCV_IOMMU_CQCSR_CQON;
+ ctrl_clr = RISCV_IOMMU_CQCSR_BUSY | RISCV_IOMMU_CQCSR_CQMF |
+ RISCV_IOMMU_CQCSR_CMD_ILL | RISCV_IOMMU_CQCSR_CMD_TO |
+ RISCV_IOMMU_CQCSR_FENCE_W_IP;
+ } else if (!enable && active) {
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_CQT], ~0);
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_CQCSR_BUSY | RISCV_IOMMU_CQCSR_CQON;
+ } else {
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_CQCSR_BUSY;
+ }
+
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_CQCSR, ctrl_set, ctrl_clr);
+}
+
+static void riscv_iommu_process_fq_control(RISCVIOMMUState *s)
+{
+ uint64_t base;
+ uint32_t ctrl_set = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FQCSR);
+ uint32_t ctrl_clr;
+ bool enable = !!(ctrl_set & RISCV_IOMMU_FQCSR_FQEN);
+ bool active = !!(ctrl_set & RISCV_IOMMU_FQCSR_FQON);
+
+ if (enable && !active) {
+ base = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_FQB);
+ s->fq_mask = (2ULL << get_field(base, RISCV_IOMMU_FQB_LOG2SZ)) - 1;
+ s->fq_addr = PPN_PHYS(get_field(base, RISCV_IOMMU_FQB_PPN));
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_FQH], ~s->fq_mask);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_FQH], 0);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_FQT], 0);
+ ctrl_set = RISCV_IOMMU_FQCSR_FQON;
+ ctrl_clr = RISCV_IOMMU_FQCSR_BUSY | RISCV_IOMMU_FQCSR_FQMF |
+ RISCV_IOMMU_FQCSR_FQOF;
+ } else if (!enable && active) {
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_FQH], ~0);
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_FQCSR_BUSY | RISCV_IOMMU_FQCSR_FQON;
+ } else {
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_FQCSR_BUSY;
+ }
+
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_FQCSR, ctrl_set, ctrl_clr);
+}
+
+static void riscv_iommu_process_pq_control(RISCVIOMMUState *s)
+{
+ uint64_t base;
+ uint32_t ctrl_set = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_PQCSR);
+ uint32_t ctrl_clr;
+ bool enable = !!(ctrl_set & RISCV_IOMMU_PQCSR_PQEN);
+ bool active = !!(ctrl_set & RISCV_IOMMU_PQCSR_PQON);
+
+ if (enable && !active) {
+ base = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_PQB);
+ s->pq_mask = (2ULL << get_field(base, RISCV_IOMMU_PQB_LOG2SZ)) - 1;
+ s->pq_addr = PPN_PHYS(get_field(base, RISCV_IOMMU_PQB_PPN));
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_PQH], ~s->pq_mask);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_PQH], 0);
+ stl_le_p(&s->regs_rw[RISCV_IOMMU_REG_PQT], 0);
+ ctrl_set = RISCV_IOMMU_PQCSR_PQON;
+ ctrl_clr = RISCV_IOMMU_PQCSR_BUSY | RISCV_IOMMU_PQCSR_PQMF |
+ RISCV_IOMMU_PQCSR_PQOF;
+ } else if (!enable && active) {
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_PQH], ~0);
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_PQCSR_BUSY | RISCV_IOMMU_PQCSR_PQON;
+ } else {
+ ctrl_set = 0;
+ ctrl_clr = RISCV_IOMMU_PQCSR_BUSY;
+ }
+
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_PQCSR, ctrl_set, ctrl_clr);
+}
+
+typedef void riscv_iommu_process_fn(RISCVIOMMUState *s);
+
+static void riscv_iommu_update_icvec(RISCVIOMMUState *s, uint64_t data)
+{
+ uint64_t icvec = 0;
+
+ icvec |= MIN(data & RISCV_IOMMU_ICVEC_CIV,
+ s->icvec_avail_vectors & RISCV_IOMMU_ICVEC_CIV);
+
+ icvec |= MIN(data & RISCV_IOMMU_ICVEC_FIV,
+ s->icvec_avail_vectors & RISCV_IOMMU_ICVEC_FIV);
+
+ icvec |= MIN(data & RISCV_IOMMU_ICVEC_PMIV,
+ s->icvec_avail_vectors & RISCV_IOMMU_ICVEC_PMIV);
+
+ icvec |= MIN(data & RISCV_IOMMU_ICVEC_PIV,
+ s->icvec_avail_vectors & RISCV_IOMMU_ICVEC_PIV);
+
+ trace_riscv_iommu_icvec_write(data, icvec);
+
+ riscv_iommu_reg_set64(s, RISCV_IOMMU_REG_ICVEC, icvec);
+}
+
+static void riscv_iommu_update_ipsr(RISCVIOMMUState *s, uint64_t data)
+{
+ uint32_t cqcsr, fqcsr, pqcsr;
+ uint32_t ipsr_set = 0;
+ uint32_t ipsr_clr = 0;
+
+ if (data & RISCV_IOMMU_IPSR_CIP) {
+ cqcsr = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_CQCSR);
+
+ if (cqcsr & RISCV_IOMMU_CQCSR_CIE &&
+ (cqcsr & RISCV_IOMMU_CQCSR_FENCE_W_IP ||
+ cqcsr & RISCV_IOMMU_CQCSR_CMD_ILL ||
+ cqcsr & RISCV_IOMMU_CQCSR_CMD_TO ||
+ cqcsr & RISCV_IOMMU_CQCSR_CQMF)) {
+ ipsr_set |= RISCV_IOMMU_IPSR_CIP;
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_CIP;
+ }
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_CIP;
+ }
+
+ if (data & RISCV_IOMMU_IPSR_FIP) {
+ fqcsr = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_FQCSR);
+
+ if (fqcsr & RISCV_IOMMU_FQCSR_FIE &&
+ (fqcsr & RISCV_IOMMU_FQCSR_FQOF ||
+ fqcsr & RISCV_IOMMU_FQCSR_FQMF)) {
+ ipsr_set |= RISCV_IOMMU_IPSR_FIP;
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_FIP;
+ }
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_FIP;
+ }
+
+ if (data & RISCV_IOMMU_IPSR_PIP) {
+ pqcsr = riscv_iommu_reg_get32(s, RISCV_IOMMU_REG_PQCSR);
+
+ if (pqcsr & RISCV_IOMMU_PQCSR_PIE &&
+ (pqcsr & RISCV_IOMMU_PQCSR_PQOF ||
+ pqcsr & RISCV_IOMMU_PQCSR_PQMF)) {
+ ipsr_set |= RISCV_IOMMU_IPSR_PIP;
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_PIP;
+ }
+ } else {
+ ipsr_clr |= RISCV_IOMMU_IPSR_PIP;
+ }
+
+ riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_IPSR, ipsr_set, ipsr_clr);
+}
+
+static MemTxResult riscv_iommu_mmio_write(void *opaque, hwaddr addr,
+ uint64_t data, unsigned size, MemTxAttrs attrs)
+{
+ riscv_iommu_process_fn *process_fn = NULL;
+ RISCVIOMMUState *s = opaque;
+ uint32_t regb = addr & ~3;
+ uint32_t busy = 0;
+ uint64_t val = 0;
+
+ if ((addr & (size - 1)) != 0) {
+ /* Unsupported MMIO alignment or access size */
+ return MEMTX_ERROR;
+ }
+
+ if (addr + size > RISCV_IOMMU_REG_MSI_CONFIG) {
+ /* Unsupported MMIO access location. */
+ return MEMTX_ACCESS_ERROR;
+ }
+
+ /* Track actionable MMIO write. */
+ switch (regb) {
+ case RISCV_IOMMU_REG_DDTP:
+ case RISCV_IOMMU_REG_DDTP + 4:
+ process_fn = riscv_iommu_process_ddtp;
+ regb = RISCV_IOMMU_REG_DDTP;
+ busy = RISCV_IOMMU_DDTP_BUSY;
+ break;
+
+ case RISCV_IOMMU_REG_CQT:
+ process_fn = riscv_iommu_process_cq_tail;
+ break;
+
+ case RISCV_IOMMU_REG_CQCSR:
+ process_fn = riscv_iommu_process_cq_control;
+ busy = RISCV_IOMMU_CQCSR_BUSY;
+ break;
+
+ case RISCV_IOMMU_REG_FQCSR:
+ process_fn = riscv_iommu_process_fq_control;
+ busy = RISCV_IOMMU_FQCSR_BUSY;
+ break;
+
+ case RISCV_IOMMU_REG_PQCSR:
+ process_fn = riscv_iommu_process_pq_control;
+ busy = RISCV_IOMMU_PQCSR_BUSY;
+ break;
+
+ case RISCV_IOMMU_REG_ICVEC:
+ case RISCV_IOMMU_REG_IPSR:
+ /*
+ * ICVEC and IPSR have special read/write procedures. We'll
+ * call their respective helpers and exit.
+ */
+ qemu_spin_lock(&s->regs_lock);
+
+ if (size == 4) {
+ uint32_t ro = ldl_le_p(&s->regs_ro[addr]);
+ uint32_t wc = ldl_le_p(&s->regs_wc[addr]);
+ uint32_t rw = ldl_le_p(&s->regs_rw[addr]);
+ stl_le_p(&val, ((rw & ro) | (data & ~ro)) & ~(data & wc));
+ } else if (size == 8) {
+ uint64_t ro = ldq_le_p(&s->regs_ro[addr]);
+ uint64_t wc = ldq_le_p(&s->regs_wc[addr]);
+ uint64_t rw = ldq_le_p(&s->regs_rw[addr]);
+ stq_le_p(&val, ((rw & ro) | (data & ~ro)) & ~(data & wc));
+ }
+
+ qemu_spin_unlock(&s->regs_lock);
+
+ if (regb == RISCV_IOMMU_REG_ICVEC) {
+ riscv_iommu_update_icvec(s, val);
+ } else {
+ riscv_iommu_update_ipsr(s, val);
+ }
+
+ return MEMTX_OK;
+
+ default:
+ break;
+ }
+
+ /*
+ * Registers update might be not synchronized with core logic.
+ * If system software updates register when relevant BUSY bit
+ * is set IOMMU behavior of additional writes to the register
+ * is UNSPECIFIED.
+ */
+ qemu_spin_lock(&s->regs_lock);
+ if (size == 1) {
+ uint8_t ro = s->regs_ro[addr];
+ uint8_t wc = s->regs_wc[addr];
+ uint8_t rw = s->regs_rw[addr];
+ s->regs_rw[addr] = ((rw & ro) | (data & ~ro)) & ~(data & wc);
+ } else if (size == 2) {
+ uint16_t ro = lduw_le_p(&s->regs_ro[addr]);
+ uint16_t wc = lduw_le_p(&s->regs_wc[addr]);
+ uint16_t rw = lduw_le_p(&s->regs_rw[addr]);
+ stw_le_p(&s->regs_rw[addr], ((rw & ro) | (data & ~ro)) & ~(data & wc));
+ } else if (size == 4) {
+ uint32_t ro = ldl_le_p(&s->regs_ro[addr]);
+ uint32_t wc = ldl_le_p(&s->regs_wc[addr]);
+ uint32_t rw = ldl_le_p(&s->regs_rw[addr]);
+ stl_le_p(&s->regs_rw[addr], ((rw & ro) | (data & ~ro)) & ~(data & wc));
+ } else if (size == 8) {
+ uint64_t ro = ldq_le_p(&s->regs_ro[addr]);
+ uint64_t wc = ldq_le_p(&s->regs_wc[addr]);
+ uint64_t rw = ldq_le_p(&s->regs_rw[addr]);
+ stq_le_p(&s->regs_rw[addr], ((rw & ro) | (data & ~ro)) & ~(data & wc));
+ }
+
+ /* Busy flag update, MSB 4-byte register. */
+ if (busy) {
+ uint32_t rw = ldl_le_p(&s->regs_rw[regb]);
+ stl_le_p(&s->regs_rw[regb], rw | busy);
+ }
+ qemu_spin_unlock(&s->regs_lock);
+
+ if (process_fn) {
+ qemu_mutex_lock(&s->core_lock);
+ process_fn(s);
+ qemu_mutex_unlock(&s->core_lock);
+ }
+
+ return MEMTX_OK;
+}
+
+static MemTxResult riscv_iommu_mmio_read(void *opaque, hwaddr addr,
+ uint64_t *data, unsigned size, MemTxAttrs attrs)
+{
+ RISCVIOMMUState *s = opaque;
+ uint64_t val = -1;
+ uint8_t *ptr;
+
+ if ((addr & (size - 1)) != 0) {
+ /* Unsupported MMIO alignment. */
+ return MEMTX_ERROR;
+ }
+
+ if (addr + size > RISCV_IOMMU_REG_MSI_CONFIG) {
+ return MEMTX_ACCESS_ERROR;
+ }
+
+ ptr = &s->regs_rw[addr];
+
+ if (size == 1) {
+ val = (uint64_t)*ptr;
+ } else if (size == 2) {
+ val = lduw_le_p(ptr);
+ } else if (size == 4) {
+ val = ldl_le_p(ptr);
+ } else if (size == 8) {
+ val = ldq_le_p(ptr);
+ } else {
+ return MEMTX_ERROR;
+ }
+
+ *data = val;
+
+ return MEMTX_OK;
+}
+
+static const MemoryRegionOps riscv_iommu_mmio_ops = {
+ .read_with_attrs = riscv_iommu_mmio_read,
+ .write_with_attrs = riscv_iommu_mmio_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+ .impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ .unaligned = false,
+ },
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ }
+};
+
+/*
+ * Translations matching MSI pattern check are redirected to "riscv-iommu-trap"
+ * memory region as untranslated address, for additional MSI/MRIF interception
+ * by IOMMU interrupt remapping implementation.
+ * Note: Device emulation code generating an MSI is expected to provide a valid
+ * memory transaction attributes with requested_id set.
+ */
+static MemTxResult riscv_iommu_trap_write(void *opaque, hwaddr addr,
+ uint64_t data, unsigned size, MemTxAttrs attrs)
+{
+ RISCVIOMMUState* s = (RISCVIOMMUState *)opaque;
+ RISCVIOMMUContext *ctx;
+ MemTxResult res;
+ void *ref;
+ uint32_t devid = attrs.requester_id;
+
+ if (attrs.unspecified) {
+ return MEMTX_ACCESS_ERROR;
+ }
+
+ /* FIXME: PCIe bus remapping for attached endpoints. */
+ devid |= s->bus << 8;
+
+ ctx = riscv_iommu_ctx(s, devid, 0, &ref);
+ if (ctx == NULL) {
+ res = MEMTX_ACCESS_ERROR;
+ } else {
+ res = riscv_iommu_msi_write(s, ctx, addr, data, size, attrs);
+ }
+ riscv_iommu_ctx_put(s, ref);
+ return res;
+}
+
+static MemTxResult riscv_iommu_trap_read(void *opaque, hwaddr addr,
+ uint64_t *data, unsigned size, MemTxAttrs attrs)
+{
+ return MEMTX_ACCESS_ERROR;
+}
+
+static const MemoryRegionOps riscv_iommu_trap_ops = {
+ .read_with_attrs = riscv_iommu_trap_read,
+ .write_with_attrs = riscv_iommu_trap_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ .unaligned = true,
+ },
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ }
+};
+
+static void riscv_iommu_realize(DeviceState *dev, Error **errp)
+{
+ RISCVIOMMUState *s = RISCV_IOMMU(dev);
+
+ s->cap = s->version & RISCV_IOMMU_CAP_VERSION;
+ if (s->enable_msi) {
+ s->cap |= RISCV_IOMMU_CAP_MSI_FLAT | RISCV_IOMMU_CAP_MSI_MRIF;
+ }
+ if (s->enable_s_stage) {
+ s->cap |= RISCV_IOMMU_CAP_SV32 | RISCV_IOMMU_CAP_SV39 |
+ RISCV_IOMMU_CAP_SV48 | RISCV_IOMMU_CAP_SV57;
+ }
+ if (s->enable_g_stage) {
+ s->cap |= RISCV_IOMMU_CAP_SV32X4 | RISCV_IOMMU_CAP_SV39X4 |
+ RISCV_IOMMU_CAP_SV48X4 | RISCV_IOMMU_CAP_SV57X4;
+ }
+ /* Report QEMU target physical address space limits */
+ s->cap = set_field(s->cap, RISCV_IOMMU_CAP_PAS,
+ TARGET_PHYS_ADDR_SPACE_BITS);
+
+ /* TODO: method to report supported PID bits */
+ s->pid_bits = 8; /* restricted to size of MemTxAttrs.pid */
+ s->cap |= RISCV_IOMMU_CAP_PD8;
+
+ /* Out-of-reset translation mode: OFF (DMA disabled) BARE (passthrough) */
+ s->ddtp = set_field(0, RISCV_IOMMU_DDTP_MODE, s->enable_off ?
+ RISCV_IOMMU_DDTP_MODE_OFF : RISCV_IOMMU_DDTP_MODE_BARE);
+
+ /* register storage */
+ s->regs_rw = g_new0(uint8_t, RISCV_IOMMU_REG_SIZE);
+ s->regs_ro = g_new0(uint8_t, RISCV_IOMMU_REG_SIZE);
+ s->regs_wc = g_new0(uint8_t, RISCV_IOMMU_REG_SIZE);
+
+ /* Mark all registers read-only */
+ memset(s->regs_ro, 0xff, RISCV_IOMMU_REG_SIZE);
+
+ /*
+ * Register complete MMIO space, including MSI/PBA registers.
+ * Note, PCIDevice implementation will add overlapping MR for MSI/PBA,
+ * managed directly by the PCIDevice implementation.
+ */
+ memory_region_init_io(&s->regs_mr, OBJECT(dev), &riscv_iommu_mmio_ops, s,
+ "riscv-iommu-regs", RISCV_IOMMU_REG_SIZE);
+
+ /* Set power-on register state */
+ stq_le_p(&s->regs_rw[RISCV_IOMMU_REG_CAP], s->cap);
+ stq_le_p(&s->regs_rw[RISCV_IOMMU_REG_FCTL], 0);
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_FCTL],
+ ~(RISCV_IOMMU_FCTL_BE | RISCV_IOMMU_FCTL_WSI));
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_DDTP],
+ ~(RISCV_IOMMU_DDTP_PPN | RISCV_IOMMU_DDTP_MODE));
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_CQB],
+ ~(RISCV_IOMMU_CQB_LOG2SZ | RISCV_IOMMU_CQB_PPN));
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_FQB],
+ ~(RISCV_IOMMU_FQB_LOG2SZ | RISCV_IOMMU_FQB_PPN));
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_PQB],
+ ~(RISCV_IOMMU_PQB_LOG2SZ | RISCV_IOMMU_PQB_PPN));
+ stl_le_p(&s->regs_wc[RISCV_IOMMU_REG_CQCSR], RISCV_IOMMU_CQCSR_CQMF |
+ RISCV_IOMMU_CQCSR_CMD_TO | RISCV_IOMMU_CQCSR_CMD_ILL);
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_CQCSR], RISCV_IOMMU_CQCSR_CQON |
+ RISCV_IOMMU_CQCSR_BUSY);
+ stl_le_p(&s->regs_wc[RISCV_IOMMU_REG_FQCSR], RISCV_IOMMU_FQCSR_FQMF |
+ RISCV_IOMMU_FQCSR_FQOF);
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_FQCSR], RISCV_IOMMU_FQCSR_FQON |
+ RISCV_IOMMU_FQCSR_BUSY);
+ stl_le_p(&s->regs_wc[RISCV_IOMMU_REG_PQCSR], RISCV_IOMMU_PQCSR_PQMF |
+ RISCV_IOMMU_PQCSR_PQOF);
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_PQCSR], RISCV_IOMMU_PQCSR_PQON |
+ RISCV_IOMMU_PQCSR_BUSY);
+ stl_le_p(&s->regs_wc[RISCV_IOMMU_REG_IPSR], ~0);
+ stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_ICVEC], 0);
+ stq_le_p(&s->regs_rw[RISCV_IOMMU_REG_DDTP], s->ddtp);
+
+ /* Memory region for downstream access, if specified. */
+ if (s->target_mr) {
+ s->target_as = g_new0(AddressSpace, 1);
+ address_space_init(s->target_as, s->target_mr,
+ "riscv-iommu-downstream");
+ } else {
+ /* Fallback to global system memory. */
+ s->target_as = &address_space_memory;
+ }
+
+ /* Memory region for untranslated MRIF/MSI writes */
+ memory_region_init_io(&s->trap_mr, OBJECT(dev), &riscv_iommu_trap_ops, s,
+ "riscv-iommu-trap", ~0ULL);
+ address_space_init(&s->trap_as, &s->trap_mr, "riscv-iommu-trap-as");
+
+ /* Device translation context cache */
+ s->ctx_cache = g_hash_table_new_full(__ctx_hash, __ctx_equal,
+ g_free, NULL);
+ qemu_mutex_init(&s->ctx_lock);
+
+ s->iommus.le_next = NULL;
+ s->iommus.le_prev = NULL;
+ QLIST_INIT(&s->spaces);
+ qemu_mutex_init(&s->core_lock);
+ qemu_spin_init(&s->regs_lock);
+}
+
+static void riscv_iommu_unrealize(DeviceState *dev)
+{
+ RISCVIOMMUState *s = RISCV_IOMMU(dev);
+
+ qemu_mutex_destroy(&s->core_lock);
+ g_hash_table_unref(s->ctx_cache);
+}
+
+static Property riscv_iommu_properties[] = {
+ DEFINE_PROP_UINT32("version", RISCVIOMMUState, version,
+ RISCV_IOMMU_SPEC_DOT_VER),
+ DEFINE_PROP_UINT32("bus", RISCVIOMMUState, bus, 0x0),
+ DEFINE_PROP_BOOL("intremap", RISCVIOMMUState, enable_msi, TRUE),
+ DEFINE_PROP_BOOL("off", RISCVIOMMUState, enable_off, TRUE),
+ DEFINE_PROP_BOOL("s-stage", RISCVIOMMUState, enable_s_stage, TRUE),
+ DEFINE_PROP_BOOL("g-stage", RISCVIOMMUState, enable_g_stage, TRUE),
+ DEFINE_PROP_LINK("downstream-mr", RISCVIOMMUState, target_mr,
+ TYPE_MEMORY_REGION, MemoryRegion *),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void riscv_iommu_class_init(ObjectClass *klass, void* data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ /* internal device for riscv-iommu-{pci/sys}, not user-creatable */
+ dc->user_creatable = false;
+ dc->realize = riscv_iommu_realize;
+ dc->unrealize = riscv_iommu_unrealize;
+ device_class_set_props(dc, riscv_iommu_properties);
+}
+
+static const TypeInfo riscv_iommu_info = {
+ .name = TYPE_RISCV_IOMMU,
+ .parent = TYPE_DEVICE,
+ .instance_size = sizeof(RISCVIOMMUState),
+ .class_init = riscv_iommu_class_init,
+};
+
+static const char *IOMMU_FLAG_STR[] = {
+ "NA",
+ "RO",
+ "WR",
+ "RW",
+};
+
+/* RISC-V IOMMU Memory Region - Address Translation Space */
+static IOMMUTLBEntry riscv_iommu_memory_region_translate(
+ IOMMUMemoryRegion *iommu_mr, hwaddr addr,
+ IOMMUAccessFlags flag, int iommu_idx)
+{
+ RISCVIOMMUSpace *as = container_of(iommu_mr, RISCVIOMMUSpace, iova_mr);
+ RISCVIOMMUContext *ctx;
+ void *ref;
+ IOMMUTLBEntry iotlb = {
+ .iova = addr,
+ .target_as = as->iommu->target_as,
+ .addr_mask = ~0ULL,
+ .perm = flag,
+ };
+
+ ctx = riscv_iommu_ctx(as->iommu, as->devid, iommu_idx, &ref);
+ if (ctx == NULL) {
+ /* Translation disabled or invalid. */
+ iotlb.addr_mask = 0;
+ iotlb.perm = IOMMU_NONE;
+ } else if (riscv_iommu_translate(as->iommu, ctx, &iotlb)) {
+ /* Translation disabled or fault reported. */
+ iotlb.addr_mask = 0;
+ iotlb.perm = IOMMU_NONE;
+ }
+
+ /* Trace all dma translations with original access flags. */
+ trace_riscv_iommu_dma(as->iommu->parent_obj.id, PCI_BUS_NUM(as->devid),
+ PCI_SLOT(as->devid), PCI_FUNC(as->devid), iommu_idx,
+ IOMMU_FLAG_STR[flag & IOMMU_RW], iotlb.iova,
+ iotlb.translated_addr);
+
+ riscv_iommu_ctx_put(as->iommu, ref);
+
+ return iotlb;
+}
+
+static int riscv_iommu_memory_region_notify(
+ IOMMUMemoryRegion *iommu_mr, IOMMUNotifierFlag old,
+ IOMMUNotifierFlag new, Error **errp)
+{
+ RISCVIOMMUSpace *as = container_of(iommu_mr, RISCVIOMMUSpace, iova_mr);
+
+ if (old == IOMMU_NOTIFIER_NONE) {
+ as->notifier = true;
+ trace_riscv_iommu_notifier_add(iommu_mr->parent_obj.name);
+ } else if (new == IOMMU_NOTIFIER_NONE) {
+ as->notifier = false;
+ trace_riscv_iommu_notifier_del(iommu_mr->parent_obj.name);
+ }
+
+ return 0;
+}
+
+static inline bool pci_is_iommu(PCIDevice *pdev)
+{
+ return pci_get_word(pdev->config + PCI_CLASS_DEVICE) == 0x0806;
+}
+
+static AddressSpace *riscv_iommu_find_as(PCIBus *bus, void *opaque, int devfn)
+{
+ RISCVIOMMUState *s = (RISCVIOMMUState *) opaque;
+ PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
+ AddressSpace *as = NULL;
+
+ if (pdev && pci_is_iommu(pdev)) {
+ return s->target_as;
+ }
+
+ /* Find first registered IOMMU device */
+ while (s->iommus.le_prev) {
+ s = *(s->iommus.le_prev);
+ }
+
+ /* Find first matching IOMMU */
+ while (s != NULL && as == NULL) {
+ as = riscv_iommu_space(s, PCI_BUILD_BDF(pci_bus_num(bus), devfn));
+ s = s->iommus.le_next;
+ }
+
+ return as ? as : &address_space_memory;
+}
+
+static const PCIIOMMUOps riscv_iommu_ops = {
+ .get_address_space = riscv_iommu_find_as,
+};
+
+void riscv_iommu_pci_setup_iommu(RISCVIOMMUState *iommu, PCIBus *bus,
+ Error **errp)
+{
+ if (bus->iommu_ops &&
+ bus->iommu_ops->get_address_space == riscv_iommu_find_as) {
+ /* Allow multiple IOMMUs on the same PCIe bus, link known devices */
+ RISCVIOMMUState *last = (RISCVIOMMUState *)bus->iommu_opaque;
+ QLIST_INSERT_AFTER(last, iommu, iommus);
+ } else if (!bus->iommu_ops && !bus->iommu_opaque) {
+ pci_setup_iommu(bus, &riscv_iommu_ops, iommu);
+ } else {
+ error_setg(errp, "can't register secondary IOMMU for PCI bus #%d",
+ pci_bus_num(bus));
+ }
+}
+
+static int riscv_iommu_memory_region_index(IOMMUMemoryRegion *iommu_mr,
+ MemTxAttrs attrs)
+{
+ return attrs.unspecified ? RISCV_IOMMU_NOPROCID : (int)attrs.pid;
+}
+
+static int riscv_iommu_memory_region_index_len(IOMMUMemoryRegion *iommu_mr)
+{
+ RISCVIOMMUSpace *as = container_of(iommu_mr, RISCVIOMMUSpace, iova_mr);
+ return 1 << as->iommu->pid_bits;
+}
+
+static void riscv_iommu_memory_region_init(ObjectClass *klass, void *data)
+{
+ IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+ imrc->translate = riscv_iommu_memory_region_translate;
+ imrc->notify_flag_changed = riscv_iommu_memory_region_notify;
+ imrc->attrs_to_index = riscv_iommu_memory_region_index;
+ imrc->num_indexes = riscv_iommu_memory_region_index_len;
+}
+
+static const TypeInfo riscv_iommu_memory_region_info = {
+ .parent = TYPE_IOMMU_MEMORY_REGION,
+ .name = TYPE_RISCV_IOMMU_MEMORY_REGION,
+ .class_init = riscv_iommu_memory_region_init,
+};
+
+static void riscv_iommu_register_mr_types(void)
+{
+ type_register_static(&riscv_iommu_memory_region_info);
+ type_register_static(&riscv_iommu_info);
+}
+
+type_init(riscv_iommu_register_mr_types);
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6f..f69d6e3c8e 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -1,3 +1,6 @@
+config RISCV_IOMMU
+ bool
+
config RISCV_NUMA
bool
@@ -47,6 +50,7 @@ config RISCV_VIRT
select SERIAL
select RISCV_ACLINT
select RISCV_APLIC
+ select RISCV_IOMMU
select RISCV_IMSIC
select SIFIVE_PLIC
select SIFIVE_TEST
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index f872674093..cbc99c6e8e 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -10,5 +10,6 @@ riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
riscv_ss.add(when: 'CONFIG_ACPI', if_true: files('virt-acpi-build.c'))
+riscv_ss.add(when: 'CONFIG_RISCV_IOMMU', if_true: files('riscv-iommu.c'))
hw_arch += {'riscv': riscv_ss}
diff --git a/hw/riscv/trace-events b/hw/riscv/trace-events
new file mode 100644
index 0000000000..3d5c33102d
--- /dev/null
+++ b/hw/riscv/trace-events
@@ -0,0 +1,14 @@
+# See documentation at docs/devel/tracing.rst
+
+# riscv-iommu.c
+riscv_iommu_new(const char *id, unsigned b, unsigned d, unsigned f) "%s: device attached %04x:%02x.%d"
+riscv_iommu_flt(const char *id, unsigned b, unsigned d, unsigned f, uint64_t reason, uint64_t iova) "%s: fault %04x:%02x.%u reason: 0x%"PRIx64" iova: 0x%"PRIx64
+riscv_iommu_pri(const char *id, unsigned b, unsigned d, unsigned f, uint64_t iova) "%s: page request %04x:%02x.%u iova: 0x%"PRIx64
+riscv_iommu_dma(const char *id, unsigned b, unsigned d, unsigned f, unsigned pasid, const char *dir, uint64_t iova, uint64_t phys) "%s: translate %04x:%02x.%u #%u %s 0x%"PRIx64" -> 0x%"PRIx64
+riscv_iommu_msi(const char *id, unsigned b, unsigned d, unsigned f, uint64_t iova, uint64_t phys) "%s: translate %04x:%02x.%u MSI 0x%"PRIx64" -> 0x%"PRIx64
+riscv_iommu_mrif_notification(const char *id, uint32_t nid, uint64_t phys) "%s: sent MRIF notification 0x%x to 0x%"PRIx64
+riscv_iommu_cmd(const char *id, uint64_t l, uint64_t u) "%s: command 0x%"PRIx64" 0x%"PRIx64
+riscv_iommu_notifier_add(const char *id) "%s: dev-iotlb notifier added"
+riscv_iommu_notifier_del(const char *id) "%s: dev-iotlb notifier removed"
+riscv_iommu_notify_int_vector(uint32_t cause, uint32_t vector) "Interrupt cause 0x%x sent via vector 0x%x"
+riscv_iommu_icvec_write(uint32_t orig, uint32_t actual) "ICVEC write: incoming 0x%x actual 0x%x"
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-24 22:17 ` [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation Alistair Francis
@ 2024-09-28 20:22 ` Peter Maydell
2024-09-28 21:01 ` Daniel Henrique Barboza
2024-10-01 22:24 ` Tomasz Jeznach
1 sibling, 1 reply; 68+ messages in thread
From: Peter Maydell @ 2024-09-28 20:22 UTC (permalink / raw)
To: Alistair Francis
Cc: qemu-devel, Tomasz Jeznach, Sebastien Boeuf,
Daniel Henrique Barboza, Alistair Francis
On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
>
> From: Tomasz Jeznach <tjeznach@rivosinc.com>
>
> The RISC-V IOMMU specification is now ratified as-per the RISC-V
> international process. The latest frozen specifcation can be found at:
>
> https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf
>
> Add the foundation of the device emulation for RISC-V IOMMU. It includes
> support for s-stage (sv32, sv39, sv48, sv57 caps) and g-stage (sv32x4,
> sv39x4, sv48x4, sv57x4 caps).
>
> Other capabilities like ATS and DBG support will be added incrementally
> in the next patches.
>
> Co-developed-by: Sebastien Boeuf <seb@rivosinc.com>
> Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
> Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> Acked-by: Alistair Francis <alistair.francis@wdc.com>
> Message-ID: <20240903201633.93182-4-dbarboza@ventanamicro.com>
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> meson.build | 1 +
> hw/riscv/riscv-iommu-bits.h | 18 +
> hw/riscv/riscv-iommu.h | 145 +++
> hw/riscv/trace.h | 1 +
> include/hw/riscv/iommu.h | 36 +
> hw/riscv/riscv-iommu.c | 2050 +++++++++++++++++++++++++++++++++++
> hw/riscv/Kconfig | 4 +
> hw/riscv/meson.build | 1 +
> hw/riscv/trace-events | 14 +
> 9 files changed, 2270 insertions(+)
Aside: patches this massive are really difficult to work with.
I just wanted to comment on a couple of things in here and
it's super painful. This is why we recommend breaking things
down a bit more...
> create mode 100644 hw/riscv/riscv-iommu.h
> create mode 100644 hw/riscv/trace.h
> create mode 100644 include/hw/riscv/iommu.h
> create mode 100644 hw/riscv/riscv-iommu.c
> create mode 100644 hw/riscv/trace-events
>
> diff --git a/meson.build b/meson.build
> index 10464466ff..71de8a5cd1 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -3439,6 +3439,7 @@ if have_system
> 'hw/pci-host',
> 'hw/ppc',
> 'hw/rtc',
> + 'hw/riscv',
> 'hw/s390x',
> 'hw/scsi',
> 'hw/sd',
> diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
> index c46d7d18ab..b1c477f5c3 100644
> --- a/hw/riscv/riscv-iommu-bits.h
> +++ b/hw/riscv/riscv-iommu-bits.h
> @@ -69,6 +69,14 @@ struct riscv_iommu_pq_record {
> /* 5.3 IOMMU Capabilities (64bits) */
> #define RISCV_IOMMU_REG_CAP 0x0000
> #define RISCV_IOMMU_CAP_VERSION GENMASK_ULL(7, 0)
> +#define RISCV_IOMMU_CAP_SV32 BIT_ULL(8)
> +#define RISCV_IOMMU_CAP_SV39 BIT_ULL(9)
> +#define RISCV_IOMMU_CAP_SV48 BIT_ULL(10)
> +#define RISCV_IOMMU_CAP_SV57 BIT_ULL(11)
> +#define RISCV_IOMMU_CAP_SV32X4 BIT_ULL(16)
> +#define RISCV_IOMMU_CAP_SV39X4 BIT_ULL(17)
> +#define RISCV_IOMMU_CAP_SV48X4 BIT_ULL(18)
> +#define RISCV_IOMMU_CAP_SV57X4 BIT_ULL(19)
> #define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
> #define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
> #define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
> @@ -80,7 +88,9 @@ struct riscv_iommu_pq_record {
>
> /* 5.4 Features control register (32bits) */
> #define RISCV_IOMMU_REG_FCTL 0x0008
> +#define RISCV_IOMMU_FCTL_BE BIT(0)
> #define RISCV_IOMMU_FCTL_WSI BIT(1)
> +#define RISCV_IOMMU_FCTL_GXL BIT(2)
>
> /* 5.5 Device-directory-table pointer (64bits) */
> #define RISCV_IOMMU_REG_DDTP 0x0010
> @@ -175,6 +185,10 @@ enum {
>
> /* 5.27 Interrupt cause to vector (64bits) */
> #define RISCV_IOMMU_REG_ICVEC 0x02F8
> +#define RISCV_IOMMU_ICVEC_CIV GENMASK_ULL(3, 0)
> +#define RISCV_IOMMU_ICVEC_FIV GENMASK_ULL(7, 4)
> +#define RISCV_IOMMU_ICVEC_PMIV GENMASK_ULL(11, 8)
> +#define RISCV_IOMMU_ICVEC_PIV GENMASK_ULL(15, 12)
>
> /* 5.28 MSI Configuration table (32 * 64bits) */
> #define RISCV_IOMMU_REG_MSI_CONFIG 0x0300
> @@ -203,6 +217,8 @@ struct riscv_iommu_dc {
> #define RISCV_IOMMU_DC_TC_DTF BIT_ULL(4)
> #define RISCV_IOMMU_DC_TC_PDTV BIT_ULL(5)
> #define RISCV_IOMMU_DC_TC_PRPR BIT_ULL(6)
> +#define RISCV_IOMMU_DC_TC_GADE BIT_ULL(7)
> +#define RISCV_IOMMU_DC_TC_SADE BIT_ULL(8)
> #define RISCV_IOMMU_DC_TC_DPE BIT_ULL(9)
> #define RISCV_IOMMU_DC_TC_SBE BIT_ULL(10)
> #define RISCV_IOMMU_DC_TC_SXL BIT_ULL(11)
> @@ -309,9 +325,11 @@ enum riscv_iommu_fq_causes {
>
> /* Translation attributes fields */
> #define RISCV_IOMMU_PC_TA_V BIT_ULL(0)
> +#define RISCV_IOMMU_PC_TA_RESERVED GENMASK_ULL(63, 32)
>
> /* First stage context fields */
> #define RISCV_IOMMU_PC_FSC_PPN GENMASK_ULL(43, 0)
> +#define RISCV_IOMMU_PC_FSC_RESERVED GENMASK_ULL(59, 44)
>
> enum riscv_iommu_fq_ttypes {
> RISCV_IOMMU_FQ_TTYPE_NONE = 0,
> diff --git a/hw/riscv/riscv-iommu.h b/hw/riscv/riscv-iommu.h
> new file mode 100644
> index 0000000000..95b4ce8d50
> --- /dev/null
> +++ b/hw/riscv/riscv-iommu.h
> @@ -0,0 +1,145 @@
> +/*
> + * QEMU emulation of an RISC-V IOMMU
> + *
> + * Copyright (C) 2022-2023 Rivos Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_RISCV_IOMMU_STATE_H
> +#define HW_RISCV_IOMMU_STATE_H
> +
> +#include "qemu/osdep.h"
> +#include "qom/object.h"
> +
> +#include "hw/riscv/iommu.h"
> +
> +struct RISCVIOMMUState {
> + /*< private >*/
> + DeviceState parent_obj;
> +
> + /*< public >*/
> + uint32_t version; /* Reported interface version number */
> + uint32_t pid_bits; /* process identifier width */
> + uint32_t bus; /* PCI bus mapping for non-root endpoints */
> +
> + uint64_t cap; /* IOMMU supported capabilities */
> + uint64_t fctl; /* IOMMU enabled features */
> + uint64_t icvec_avail_vectors; /* Available interrupt vectors in ICVEC */
> +
> + bool enable_off; /* Enable out-of-reset OFF mode (DMA disabled) */
> + bool enable_msi; /* Enable MSI remapping */
> + bool enable_s_stage; /* Enable S/VS-Stage translation */
> + bool enable_g_stage; /* Enable G-Stage translation */
> +
> + /* IOMMU Internal State */
> + uint64_t ddtp; /* Validated Device Directory Tree Root Pointer */
> +
> + dma_addr_t cq_addr; /* Command queue base physical address */
> + dma_addr_t fq_addr; /* Fault/event queue base physical address */
> + dma_addr_t pq_addr; /* Page request queue base physical address */
> +
> + uint32_t cq_mask; /* Command queue index bit mask */
> + uint32_t fq_mask; /* Fault/event queue index bit mask */
> + uint32_t pq_mask; /* Page request queue index bit mask */
> +
> + /* interrupt notifier */
> + void (*notify)(RISCVIOMMUState *iommu, unsigned vector);
> +
> + /* IOMMU State Machine */
> + QemuThread core_proc; /* Background processing thread */
> + QemuMutex core_lock; /* Global IOMMU lock, used for cache/regs updates */
> + QemuCond core_cond; /* Background processing wake up signal */
> + unsigned core_exec; /* Processing thread execution actions */
> +
> + /* IOMMU target address space */
> + AddressSpace *target_as;
> + MemoryRegion *target_mr;
> +
> + /* MSI / MRIF access trap */
> + AddressSpace trap_as;
> + MemoryRegion trap_mr;
> +
> + GHashTable *ctx_cache; /* Device translation Context Cache */
> + QemuMutex ctx_lock; /* Device translation Cache update lock */
> +
> + /* MMIO Hardware Interface */
> + MemoryRegion regs_mr;
> + QemuSpin regs_lock;
> + uint8_t *regs_rw; /* register state (user write) */
> + uint8_t *regs_wc; /* write-1-to-clear mask */
> + uint8_t *regs_ro; /* read-only mask */
> +
> + QLIST_ENTRY(RISCVIOMMUState) iommus;
> + QLIST_HEAD(, RISCVIOMMUSpace) spaces;
> +};
> +
> +void riscv_iommu_pci_setup_iommu(RISCVIOMMUState *iommu, PCIBus *bus,
> + Error **errp);
> +
> +/* private helpers */
> +
> +/* Register helper functions */
> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
> + unsigned idx, uint32_t set, uint32_t clr)
> +{
> + uint32_t val;
> + qemu_spin_lock(&s->regs_lock);
> + val = ldl_le_p(s->regs_rw + idx);
> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
> + qemu_spin_unlock(&s->regs_lock);
> + return val;
> +}
This looks very weird. Nobody else's IOMMU implementation
grabs a spinlock while it is accessing guest register data.
Why is riscv special? Why a spinlock? (We use spinlocks
only very very sparingly in general.)
> --- /dev/null
> +++ b/include/hw/riscv/iommu.h
> @@ -0,0 +1,36 @@
> +/*
> + * QEMU emulation of an RISC-V IOMMU
> + *
> + * Copyright (C) 2022-2023 Rivos Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_RISCV_IOMMU_H
> +#define HW_RISCV_IOMMU_H
> +
> +#include "qemu/osdep.h"
Header files should never include osdep.h. .c files always do.
> +#include "qom/object.h"
> +
> +#define TYPE_RISCV_IOMMU "riscv-iommu"
> +OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUState, RISCV_IOMMU)
> +typedef struct RISCVIOMMUState RISCVIOMMUState;
> +
> +#define TYPE_RISCV_IOMMU_MEMORY_REGION "riscv-iommu-mr"
> +typedef struct RISCVIOMMUSpace RISCVIOMMUSpace;
> +
> +#define TYPE_RISCV_IOMMU_PCI "riscv-iommu-pci"
> +OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUStatePci, RISCV_IOMMU_PCI)
> +typedef struct RISCVIOMMUStatePci RISCVIOMMUStatePci;
> +
> +#endif
> --- /dev/null
> +++ b/hw/riscv/riscv-iommu.c
> @@ -0,0 +1,2050 @@
> +/*
> + * QEMU emulation of an RISC-V IOMMU
> + *
> + * Copyright (C) 2021-2023, Rivos Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License.
This text makes no sense. Is it trying to say "version 2 only",
or "either version 2 or any later version"? This needs to
be fixed before we can take this code -- it needs to use the
standard text for whatever license you intend (which ideally
should be gpl-2-or-later).
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +/*
> + * RISCV IOMMU Address Translation Lookup - Page Table Walk
> + *
> + * Note: Code is based on get_physical_address() from target/riscv/cpu_helper.c
> + * Both implementation can be merged into single helper function in future.
> + * Keeping them separate for now, as error reporting and flow specifics are
> + * sufficiently different for separate implementation.
> + *
> + * @s : IOMMU Device State
> + * @ctx : Translation context for device id and process address space id.
> + * @iotlb : translation data: physical address and access mode.
> + * @return : success or fault cause code.
> + */
> +static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> + IOMMUTLBEntry *iotlb)
> +{
> + dma_addr_t addr, base;
> + uint64_t satp, gatp, pte;
> + bool en_s, en_g;
> + struct {
> + unsigned char step;
> + unsigned char levels;
> + unsigned char ptidxbits;
> + unsigned char ptesize;
> + } sc[2];
> + /* Translation stage phase */
> + enum {
> + S_STAGE = 0,
> + G_STAGE = 1,
> + } pass;
> +
> + satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD);
> + gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
> +
> + en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE;
> + en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE;
> +
> + /*
> + * Early check for MSI address match when IOVA == GPA. This check
> + * is required to ensure MSI translation is applied in case
> + * first-stage translation is set to BARE mode. In this case IOVA
> + * provided is a valid GPA. Running translation through page walk
> + * second stage translation will incorrectly try to translate GPA
> + * to host physical page, likely hitting IOPF.
> + */
> + if ((iotlb->perm & IOMMU_WO) &&
> + riscv_iommu_msi_check(s, ctx, iotlb->iova)) {
> + iotlb->target_as = &s->trap_as;
> + iotlb->translated_addr = iotlb->iova;
> + iotlb->addr_mask = ~TARGET_PAGE_MASK;
> + return 0;
> + }
> +
> + /* Exit early for pass-through mode. */
> + if (!(en_s || en_g)) {
> + iotlb->translated_addr = iotlb->iova;
> + iotlb->addr_mask = ~TARGET_PAGE_MASK;
> + /* Allow R/W in pass-through mode */
> + iotlb->perm = IOMMU_RW;
> + return 0;
> + }
> +
> + /* S/G translation parameters. */
> + for (pass = 0; pass < 2; pass++) {
> + uint32_t sv_mode;
> +
> + sc[pass].step = 0;
> + if (pass ? (s->fctl & RISCV_IOMMU_FCTL_GXL) :
> + (ctx->tc & RISCV_IOMMU_DC_TC_SXL)) {
> + /* 32bit mode for GXL/SXL == 1 */
> + switch (pass ? gatp : satp) {
> + case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
> + sc[pass].levels = 0;
> + sc[pass].ptidxbits = 0;
> + sc[pass].ptesize = 0;
> + break;
> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV32X4:
> + sv_mode = pass ? RISCV_IOMMU_CAP_SV32X4 : RISCV_IOMMU_CAP_SV32;
> + if (!(s->cap & sv_mode)) {
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + sc[pass].levels = 2;
> + sc[pass].ptidxbits = 10;
> + sc[pass].ptesize = 4;
> + break;
> + default:
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + } else {
> + /* 64bit mode for GXL/SXL == 0 */
> + switch (pass ? gatp : satp) {
> + case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
> + sc[pass].levels = 0;
> + sc[pass].ptidxbits = 0;
> + sc[pass].ptesize = 0;
> + break;
> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV39X4:
> + sv_mode = pass ? RISCV_IOMMU_CAP_SV39X4 : RISCV_IOMMU_CAP_SV39;
> + if (!(s->cap & sv_mode)) {
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + sc[pass].levels = 3;
> + sc[pass].ptidxbits = 9;
> + sc[pass].ptesize = 8;
> + break;
> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV48X4:
> + sv_mode = pass ? RISCV_IOMMU_CAP_SV48X4 : RISCV_IOMMU_CAP_SV48;
> + if (!(s->cap & sv_mode)) {
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + sc[pass].levels = 4;
> + sc[pass].ptidxbits = 9;
> + sc[pass].ptesize = 8;
> + break;
> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV57X4:
> + sv_mode = pass ? RISCV_IOMMU_CAP_SV57X4 : RISCV_IOMMU_CAP_SV57;
> + if (!(s->cap & sv_mode)) {
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + sc[pass].levels = 5;
> + sc[pass].ptidxbits = 9;
> + sc[pass].ptesize = 8;
> + break;
> + default:
> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
> + }
> + }
> + };
> +
> + /* S/G stages translation tables root pointers */
> + gatp = PPN_PHYS(get_field(ctx->gatp, RISCV_IOMMU_ATP_PPN_FIELD));
> + satp = PPN_PHYS(get_field(ctx->satp, RISCV_IOMMU_ATP_PPN_FIELD));
> + addr = (en_s && en_g) ? satp : iotlb->iova;
> + base = en_g ? gatp : satp;
> + pass = en_g ? G_STAGE : S_STAGE;
> +
> + do {
> + const unsigned widened = (pass && !sc[pass].step) ? 2 : 0;
> + const unsigned va_bits = widened + sc[pass].ptidxbits;
> + const unsigned va_skip = TARGET_PAGE_BITS + sc[pass].ptidxbits *
> + (sc[pass].levels - 1 - sc[pass].step);
> + const unsigned idx = (addr >> va_skip) & ((1 << va_bits) - 1);
> + const dma_addr_t pte_addr = base + idx * sc[pass].ptesize;
> + const bool ade =
> + ctx->tc & (pass ? RISCV_IOMMU_DC_TC_GADE : RISCV_IOMMU_DC_TC_SADE);
> +
> + /* Address range check before first level lookup */
> + if (!sc[pass].step) {
> + const uint64_t va_mask = (1ULL << (va_skip + va_bits)) - 1;
> + if ((addr & va_mask) != addr) {
> + return RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED;
> + }
> + }
> +
> + /* Read page table entry */
> + if (dma_memory_read(s->target_as, pte_addr, &pte,
> + sc[pass].ptesize, MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
> + return (iotlb->perm & IOMMU_WO) ? RISCV_IOMMU_FQ_CAUSE_WR_FAULT
> + : RISCV_IOMMU_FQ_CAUSE_RD_FAULT;
> + }
> +
> + if (sc[pass].ptesize == 4) {
> + pte = (uint64_t) le32_to_cpu(*((uint32_t *)&pte));
> + } else {
> + pte = le64_to_cpu(pte);
> + }
I think this would be clearer to read if you did
if (sc[pass].ptesize == 4) {
uint32_t pte32 = 0;
r = ldl_le_dma(s->target_as, pte_addr, &pte32, MEMTX_ATTRS_UNSPECIFIED);
pte = pte32;
} else {
r = ldq_le_dma(s->target_as, pte_addr, &pte, MEMTX_ATTRS_UNSPECIFIED);
}
if (r != MEMTX_OK) {
...
}
rather than loading 4 or 8 bytes into a host uint64_t as a pile-of-bytes
and doing a complicated expression to get the right answer afterwards.
> +/* Translation Context cache support */
> +static gboolean __ctx_equal(gconstpointer v1, gconstpointer v2)
Don't use double-underscore prefixes, please -- those are
reserved for the system (i.e. not user code like QEMU).
> +{
> + RISCVIOMMUContext *c1 = (RISCVIOMMUContext *) v1;
> + RISCVIOMMUContext *c2 = (RISCVIOMMUContext *) v2;
> + return c1->devid == c2->devid &&
> + c1->process_id == c2->process_id;
> +}
> +
> +static MemTxResult riscv_iommu_mmio_read(void *opaque, hwaddr addr,
> + uint64_t *data, unsigned size, MemTxAttrs attrs)
> +{
> + RISCVIOMMUState *s = opaque;
> + uint64_t val = -1;
> + uint8_t *ptr;
> +
> + if ((addr & (size - 1)) != 0) {
> + /* Unsupported MMIO alignment. */
> + return MEMTX_ERROR;
> + }
> +
> + if (addr + size > RISCV_IOMMU_REG_MSI_CONFIG) {
> + return MEMTX_ACCESS_ERROR;
> + }
> +
> + ptr = &s->regs_rw[addr];
> +
> + if (size == 1) {
> + val = (uint64_t)*ptr;
This looks very fishy. If we're reading 1 byte then cast
the pointer to uint64_t* and read 8 possibly-misaligned bytes ??
If you want to read one byte, then the way that parallels
the other cases in this if() ladder is
val = ldub_p(ptr);
> + } else if (size == 2) {
> + val = lduw_le_p(ptr);
> + } else if (size == 4) {
> + val = ldl_le_p(ptr);
> + } else if (size == 8) {
> + val = ldq_le_p(ptr);
> + } else {
> + return MEMTX_ERROR;
> + }
...but you can replace the whole if ladder with
val = ldn_le_p(ptr, size);
(There's a corresponding stn_le_p() if you need it.)
> +
> + *data = val;
> +
> + return MEMTX_OK;
> +}
> +
> +static const MemoryRegionOps riscv_iommu_mmio_ops = {
> + .read_with_attrs = riscv_iommu_mmio_read,
> + .write_with_attrs = riscv_iommu_mmio_write,
> + .endianness = DEVICE_NATIVE_ENDIAN,
> + .impl = {
> + .min_access_size = 4,
> + .max_access_size = 8,
But you've set your impl here to say that it can
only handle 4 and 8 byte accesses. So why is the read
function trying to handle 1 and 2 byte accesses?
> + .unaligned = false,
> + },
> + .valid = {
> + .min_access_size = 4,
> + .max_access_size = 8,
> + }
> +};
> +
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-28 20:22 ` Peter Maydell
@ 2024-09-28 21:01 ` Daniel Henrique Barboza
2024-09-29 15:46 ` Peter Maydell
0 siblings, 1 reply; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-09-28 21:01 UTC (permalink / raw)
To: Peter Maydell, Alistair Francis
Cc: qemu-devel, Tomasz Jeznach, Sebastien Boeuf, Alistair Francis
On 9/28/24 5:22 PM, Peter Maydell wrote:
> On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
>>
>> From: Tomasz Jeznach <tjeznach@rivosinc.com>
>>
>> The RISC-V IOMMU specification is now ratified as-per the RISC-V
>> international process. The latest frozen specifcation can be found at:
>>
>> https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf
>>
>> Add the foundation of the device emulation for RISC-V IOMMU. It includes
>> support for s-stage (sv32, sv39, sv48, sv57 caps) and g-stage (sv32x4,
>> sv39x4, sv48x4, sv57x4 caps).
>>
>> Other capabilities like ATS and DBG support will be added incrementally
>> in the next patches.
>>
>> Co-developed-by: Sebastien Boeuf <seb@rivosinc.com>
>> Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
>> Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
>> Acked-by: Alistair Francis <alistair.francis@wdc.com>
>> Message-ID: <20240903201633.93182-4-dbarboza@ventanamicro.com>
>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>> ---
>> meson.build | 1 +
>> hw/riscv/riscv-iommu-bits.h | 18 +
>> hw/riscv/riscv-iommu.h | 145 +++
>> hw/riscv/trace.h | 1 +
>> include/hw/riscv/iommu.h | 36 +
>> hw/riscv/riscv-iommu.c | 2050 +++++++++++++++++++++++++++++++++++
>> hw/riscv/Kconfig | 4 +
>> hw/riscv/meson.build | 1 +
>> hw/riscv/trace-events | 14 +
>> 9 files changed, 2270 insertions(+)
>
> Aside: patches this massive are really difficult to work with.
> I just wanted to comment on a couple of things in here and
> it's super painful. This is why we recommend breaking things
> down a bit more...
>
>> create mode 100644 hw/riscv/riscv-iommu.h
>> create mode 100644 hw/riscv/trace.h
>> create mode 100644 include/hw/riscv/iommu.h
>> create mode 100644 hw/riscv/riscv-iommu.c
>> create mode 100644 hw/riscv/trace-events
>>
>> diff --git a/meson.build b/meson.build
>> index 10464466ff..71de8a5cd1 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -3439,6 +3439,7 @@ if have_system
>> 'hw/pci-host',
>> 'hw/ppc',
>> 'hw/rtc',
>> + 'hw/riscv',
>> 'hw/s390x',
>> 'hw/scsi',
>> 'hw/sd',
>> diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
>> index c46d7d18ab..b1c477f5c3 100644
>> --- a/hw/riscv/riscv-iommu-bits.h
>> +++ b/hw/riscv/riscv-iommu-bits.h
>> @@ -69,6 +69,14 @@ struct riscv_iommu_pq_record {
>> /* 5.3 IOMMU Capabilities (64bits) */
>> #define RISCV_IOMMU_REG_CAP 0x0000
>> #define RISCV_IOMMU_CAP_VERSION GENMASK_ULL(7, 0)
>> +#define RISCV_IOMMU_CAP_SV32 BIT_ULL(8)
>> +#define RISCV_IOMMU_CAP_SV39 BIT_ULL(9)
>> +#define RISCV_IOMMU_CAP_SV48 BIT_ULL(10)
>> +#define RISCV_IOMMU_CAP_SV57 BIT_ULL(11)
>> +#define RISCV_IOMMU_CAP_SV32X4 BIT_ULL(16)
>> +#define RISCV_IOMMU_CAP_SV39X4 BIT_ULL(17)
>> +#define RISCV_IOMMU_CAP_SV48X4 BIT_ULL(18)
>> +#define RISCV_IOMMU_CAP_SV57X4 BIT_ULL(19)
>> #define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
>> #define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
>> #define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
>> @@ -80,7 +88,9 @@ struct riscv_iommu_pq_record {
>>
>> /* 5.4 Features control register (32bits) */
>> #define RISCV_IOMMU_REG_FCTL 0x0008
>> +#define RISCV_IOMMU_FCTL_BE BIT(0)
>> #define RISCV_IOMMU_FCTL_WSI BIT(1)
>> +#define RISCV_IOMMU_FCTL_GXL BIT(2)
>>
>> /* 5.5 Device-directory-table pointer (64bits) */
>> #define RISCV_IOMMU_REG_DDTP 0x0010
>> @@ -175,6 +185,10 @@ enum {
>>
>> /* 5.27 Interrupt cause to vector (64bits) */
>> #define RISCV_IOMMU_REG_ICVEC 0x02F8
>> +#define RISCV_IOMMU_ICVEC_CIV GENMASK_ULL(3, 0)
>> +#define RISCV_IOMMU_ICVEC_FIV GENMASK_ULL(7, 4)
>> +#define RISCV_IOMMU_ICVEC_PMIV GENMASK_ULL(11, 8)
>> +#define RISCV_IOMMU_ICVEC_PIV GENMASK_ULL(15, 12)
>>
>> /* 5.28 MSI Configuration table (32 * 64bits) */
>> #define RISCV_IOMMU_REG_MSI_CONFIG 0x0300
>> @@ -203,6 +217,8 @@ struct riscv_iommu_dc {
>> #define RISCV_IOMMU_DC_TC_DTF BIT_ULL(4)
>> #define RISCV_IOMMU_DC_TC_PDTV BIT_ULL(5)
>> #define RISCV_IOMMU_DC_TC_PRPR BIT_ULL(6)
>> +#define RISCV_IOMMU_DC_TC_GADE BIT_ULL(7)
>> +#define RISCV_IOMMU_DC_TC_SADE BIT_ULL(8)
>> #define RISCV_IOMMU_DC_TC_DPE BIT_ULL(9)
>> #define RISCV_IOMMU_DC_TC_SBE BIT_ULL(10)
>> #define RISCV_IOMMU_DC_TC_SXL BIT_ULL(11)
>> @@ -309,9 +325,11 @@ enum riscv_iommu_fq_causes {
>>
>> /* Translation attributes fields */
>> #define RISCV_IOMMU_PC_TA_V BIT_ULL(0)
>> +#define RISCV_IOMMU_PC_TA_RESERVED GENMASK_ULL(63, 32)
>>
>> /* First stage context fields */
>> #define RISCV_IOMMU_PC_FSC_PPN GENMASK_ULL(43, 0)
>> +#define RISCV_IOMMU_PC_FSC_RESERVED GENMASK_ULL(59, 44)
>>
>> enum riscv_iommu_fq_ttypes {
>> RISCV_IOMMU_FQ_TTYPE_NONE = 0,
>> diff --git a/hw/riscv/riscv-iommu.h b/hw/riscv/riscv-iommu.h
>> new file mode 100644
>> index 0000000000..95b4ce8d50
>> --- /dev/null
>> +++ b/hw/riscv/riscv-iommu.h
>> @@ -0,0 +1,145 @@
>> +/*
>> + * QEMU emulation of an RISC-V IOMMU
>> + *
>> + * Copyright (C) 2022-2023 Rivos Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_RISCV_IOMMU_STATE_H
>> +#define HW_RISCV_IOMMU_STATE_H
>> +
>> +#include "qemu/osdep.h"
>> +#include "qom/object.h"
>> +
>> +#include "hw/riscv/iommu.h"
>> +
>> +struct RISCVIOMMUState {
>> + /*< private >*/
>> + DeviceState parent_obj;
>> +
>> + /*< public >*/
>> + uint32_t version; /* Reported interface version number */
>> + uint32_t pid_bits; /* process identifier width */
>> + uint32_t bus; /* PCI bus mapping for non-root endpoints */
>> +
>> + uint64_t cap; /* IOMMU supported capabilities */
>> + uint64_t fctl; /* IOMMU enabled features */
>> + uint64_t icvec_avail_vectors; /* Available interrupt vectors in ICVEC */
>> +
>> + bool enable_off; /* Enable out-of-reset OFF mode (DMA disabled) */
>> + bool enable_msi; /* Enable MSI remapping */
>> + bool enable_s_stage; /* Enable S/VS-Stage translation */
>> + bool enable_g_stage; /* Enable G-Stage translation */
>> +
>> + /* IOMMU Internal State */
>> + uint64_t ddtp; /* Validated Device Directory Tree Root Pointer */
>> +
>> + dma_addr_t cq_addr; /* Command queue base physical address */
>> + dma_addr_t fq_addr; /* Fault/event queue base physical address */
>> + dma_addr_t pq_addr; /* Page request queue base physical address */
>> +
>> + uint32_t cq_mask; /* Command queue index bit mask */
>> + uint32_t fq_mask; /* Fault/event queue index bit mask */
>> + uint32_t pq_mask; /* Page request queue index bit mask */
>> +
>> + /* interrupt notifier */
>> + void (*notify)(RISCVIOMMUState *iommu, unsigned vector);
>> +
>> + /* IOMMU State Machine */
>> + QemuThread core_proc; /* Background processing thread */
>> + QemuMutex core_lock; /* Global IOMMU lock, used for cache/regs updates */
>> + QemuCond core_cond; /* Background processing wake up signal */
>> + unsigned core_exec; /* Processing thread execution actions */
>> +
>> + /* IOMMU target address space */
>> + AddressSpace *target_as;
>> + MemoryRegion *target_mr;
>> +
>> + /* MSI / MRIF access trap */
>> + AddressSpace trap_as;
>> + MemoryRegion trap_mr;
>> +
>> + GHashTable *ctx_cache; /* Device translation Context Cache */
>> + QemuMutex ctx_lock; /* Device translation Cache update lock */
>> +
>> + /* MMIO Hardware Interface */
>> + MemoryRegion regs_mr;
>> + QemuSpin regs_lock;
>> + uint8_t *regs_rw; /* register state (user write) */
>> + uint8_t *regs_wc; /* write-1-to-clear mask */
>> + uint8_t *regs_ro; /* read-only mask */
>> +
>> + QLIST_ENTRY(RISCVIOMMUState) iommus;
>> + QLIST_HEAD(, RISCVIOMMUSpace) spaces;
>> +};
>> +
>> +void riscv_iommu_pci_setup_iommu(RISCVIOMMUState *iommu, PCIBus *bus,
>> + Error **errp);
>> +
>> +/* private helpers */
>> +
>> +/* Register helper functions */
>> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
>> + unsigned idx, uint32_t set, uint32_t clr)
>> +{
>> + uint32_t val;
>> + qemu_spin_lock(&s->regs_lock);
>> + val = ldl_le_p(s->regs_rw + idx);
>> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
>> + qemu_spin_unlock(&s->regs_lock);
>> + return val;
>> +}
>
> This looks very weird. Nobody else's IOMMU implementation
> grabs a spinlock while it is accessing guest register data.
> Why is riscv special? Why a spinlock? (We use spinlocks
> only very very sparingly in general.)
The first versions of the IOMMU used qemu threads. I believe this is where
the locks come from (both from registers and from the cache).
I'm not sure if we're ever going to hit a race condition without the locks
in the current code (i.e. using mmio ops only). I think I'll make an attempt
to remove the locks and see if something breaks.
>
>
>> --- /dev/null
>> +++ b/include/hw/riscv/iommu.h
>> @@ -0,0 +1,36 @@
>> +/*
>> + * QEMU emulation of an RISC-V IOMMU
>> + *
>> + * Copyright (C) 2022-2023 Rivos Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_RISCV_IOMMU_H
>> +#define HW_RISCV_IOMMU_H
>> +
>> +#include "qemu/osdep.h"
>
> Header files should never include osdep.h. .c files always do.
>
>> +#include "qom/object.h"
>> +
>> +#define TYPE_RISCV_IOMMU "riscv-iommu"
>> +OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUState, RISCV_IOMMU)
>> +typedef struct RISCVIOMMUState RISCVIOMMUState;
>> +
>> +#define TYPE_RISCV_IOMMU_MEMORY_REGION "riscv-iommu-mr"
>> +typedef struct RISCVIOMMUSpace RISCVIOMMUSpace;
>> +
>> +#define TYPE_RISCV_IOMMU_PCI "riscv-iommu-pci"
>> +OBJECT_DECLARE_SIMPLE_TYPE(RISCVIOMMUStatePci, RISCV_IOMMU_PCI)
>> +typedef struct RISCVIOMMUStatePci RISCVIOMMUStatePci;
>> +
>> +#endif
>
>> --- /dev/null
>> +++ b/hw/riscv/riscv-iommu.c
>> @@ -0,0 +1,2050 @@
>> +/*
>> + * QEMU emulation of an RISC-V IOMMU
>> + *
>> + * Copyright (C) 2021-2023, Rivos Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License.
>
> This text makes no sense. Is it trying to say "version 2 only",
> or "either version 2 or any later version"? This needs to
> be fixed before we can take this code -- it needs to use the
> standard text for whatever license you intend (which ideally
> should be gpl-2-or-later).
I'll use the license text from target/riscv/cpu.c:
--------
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2 or later, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
--------
>
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>
>
>
>
>> +/*
>> + * RISCV IOMMU Address Translation Lookup - Page Table Walk
>> + *
>> + * Note: Code is based on get_physical_address() from target/riscv/cpu_helper.c
>> + * Both implementation can be merged into single helper function in future.
>> + * Keeping them separate for now, as error reporting and flow specifics are
>> + * sufficiently different for separate implementation.
>> + *
>> + * @s : IOMMU Device State
>> + * @ctx : Translation context for device id and process address space id.
>> + * @iotlb : translation data: physical address and access mode.
>> + * @return : success or fault cause code.
>> + */
>> +static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>> + IOMMUTLBEntry *iotlb)
>> +{
>> + dma_addr_t addr, base;
>> + uint64_t satp, gatp, pte;
>> + bool en_s, en_g;
>> + struct {
>> + unsigned char step;
>> + unsigned char levels;
>> + unsigned char ptidxbits;
>> + unsigned char ptesize;
>> + } sc[2];
>> + /* Translation stage phase */
>> + enum {
>> + S_STAGE = 0,
>> + G_STAGE = 1,
>> + } pass;
>> +
>> + satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD);
>> + gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
>> +
>> + en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE;
>> + en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE;
>> +
>> + /*
>> + * Early check for MSI address match when IOVA == GPA. This check
>> + * is required to ensure MSI translation is applied in case
>> + * first-stage translation is set to BARE mode. In this case IOVA
>> + * provided is a valid GPA. Running translation through page walk
>> + * second stage translation will incorrectly try to translate GPA
>> + * to host physical page, likely hitting IOPF.
>> + */
>> + if ((iotlb->perm & IOMMU_WO) &&
>> + riscv_iommu_msi_check(s, ctx, iotlb->iova)) {
>> + iotlb->target_as = &s->trap_as;
>> + iotlb->translated_addr = iotlb->iova;
>> + iotlb->addr_mask = ~TARGET_PAGE_MASK;
>> + return 0;
>> + }
>> +
>> + /* Exit early for pass-through mode. */
>> + if (!(en_s || en_g)) {
>> + iotlb->translated_addr = iotlb->iova;
>> + iotlb->addr_mask = ~TARGET_PAGE_MASK;
>> + /* Allow R/W in pass-through mode */
>> + iotlb->perm = IOMMU_RW;
>> + return 0;
>> + }
>> +
>> + /* S/G translation parameters. */
>> + for (pass = 0; pass < 2; pass++) {
>> + uint32_t sv_mode;
>> +
>> + sc[pass].step = 0;
>> + if (pass ? (s->fctl & RISCV_IOMMU_FCTL_GXL) :
>> + (ctx->tc & RISCV_IOMMU_DC_TC_SXL)) {
>> + /* 32bit mode for GXL/SXL == 1 */
>> + switch (pass ? gatp : satp) {
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
>> + sc[pass].levels = 0;
>> + sc[pass].ptidxbits = 0;
>> + sc[pass].ptesize = 0;
>> + break;
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV32X4:
>> + sv_mode = pass ? RISCV_IOMMU_CAP_SV32X4 : RISCV_IOMMU_CAP_SV32;
>> + if (!(s->cap & sv_mode)) {
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + sc[pass].levels = 2;
>> + sc[pass].ptidxbits = 10;
>> + sc[pass].ptesize = 4;
>> + break;
>> + default:
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + } else {
>> + /* 64bit mode for GXL/SXL == 0 */
>> + switch (pass ? gatp : satp) {
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_BARE:
>> + sc[pass].levels = 0;
>> + sc[pass].ptidxbits = 0;
>> + sc[pass].ptesize = 0;
>> + break;
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV39X4:
>> + sv_mode = pass ? RISCV_IOMMU_CAP_SV39X4 : RISCV_IOMMU_CAP_SV39;
>> + if (!(s->cap & sv_mode)) {
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + sc[pass].levels = 3;
>> + sc[pass].ptidxbits = 9;
>> + sc[pass].ptesize = 8;
>> + break;
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV48X4:
>> + sv_mode = pass ? RISCV_IOMMU_CAP_SV48X4 : RISCV_IOMMU_CAP_SV48;
>> + if (!(s->cap & sv_mode)) {
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + sc[pass].levels = 4;
>> + sc[pass].ptidxbits = 9;
>> + sc[pass].ptesize = 8;
>> + break;
>> + case RISCV_IOMMU_DC_IOHGATP_MODE_SV57X4:
>> + sv_mode = pass ? RISCV_IOMMU_CAP_SV57X4 : RISCV_IOMMU_CAP_SV57;
>> + if (!(s->cap & sv_mode)) {
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + sc[pass].levels = 5;
>> + sc[pass].ptidxbits = 9;
>> + sc[pass].ptesize = 8;
>> + break;
>> + default:
>> + return RISCV_IOMMU_FQ_CAUSE_DDT_MISCONFIGURED;
>> + }
>> + }
>> + };
>> +
>> + /* S/G stages translation tables root pointers */
>> + gatp = PPN_PHYS(get_field(ctx->gatp, RISCV_IOMMU_ATP_PPN_FIELD));
>> + satp = PPN_PHYS(get_field(ctx->satp, RISCV_IOMMU_ATP_PPN_FIELD));
>> + addr = (en_s && en_g) ? satp : iotlb->iova;
>> + base = en_g ? gatp : satp;
>> + pass = en_g ? G_STAGE : S_STAGE;
>> +
>> + do {
>> + const unsigned widened = (pass && !sc[pass].step) ? 2 : 0;
>> + const unsigned va_bits = widened + sc[pass].ptidxbits;
>> + const unsigned va_skip = TARGET_PAGE_BITS + sc[pass].ptidxbits *
>> + (sc[pass].levels - 1 - sc[pass].step);
>> + const unsigned idx = (addr >> va_skip) & ((1 << va_bits) - 1);
>> + const dma_addr_t pte_addr = base + idx * sc[pass].ptesize;
>> + const bool ade =
>> + ctx->tc & (pass ? RISCV_IOMMU_DC_TC_GADE : RISCV_IOMMU_DC_TC_SADE);
>> +
>> + /* Address range check before first level lookup */
>> + if (!sc[pass].step) {
>> + const uint64_t va_mask = (1ULL << (va_skip + va_bits)) - 1;
>> + if ((addr & va_mask) != addr) {
>> + return RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED;
>> + }
>> + }
>> +
>> + /* Read page table entry */
>> + if (dma_memory_read(s->target_as, pte_addr, &pte,
>> + sc[pass].ptesize, MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
>> + return (iotlb->perm & IOMMU_WO) ? RISCV_IOMMU_FQ_CAUSE_WR_FAULT
>> + : RISCV_IOMMU_FQ_CAUSE_RD_FAULT;
>> + }
>> +
>> + if (sc[pass].ptesize == 4) {
>> + pte = (uint64_t) le32_to_cpu(*((uint32_t *)&pte));
>> + } else {
>> + pte = le64_to_cpu(pte);
>> + }
>
> I think this would be clearer to read if you did
> if (sc[pass].ptesize == 4) {
> uint32_t pte32 = 0;
> r = ldl_le_dma(s->target_as, pte_addr, &pte32, MEMTX_ATTRS_UNSPECIFIED);
> pte = pte32;
> } else {
> r = ldq_le_dma(s->target_as, pte_addr, &pte, MEMTX_ATTRS_UNSPECIFIED);
> }
> if (r != MEMTX_OK) {
> ...
> }
>
> rather than loading 4 or 8 bytes into a host uint64_t as a pile-of-bytes
> and doing a complicated expression to get the right answer afterwards.
>
>
>> +/* Translation Context cache support */
>> +static gboolean __ctx_equal(gconstpointer v1, gconstpointer v2)
>
> Don't use double-underscore prefixes, please -- those are
> reserved for the system (i.e. not user code like QEMU).
>
>> +{
>> + RISCVIOMMUContext *c1 = (RISCVIOMMUContext *) v1;
>> + RISCVIOMMUContext *c2 = (RISCVIOMMUContext *) v2;
>> + return c1->devid == c2->devid &&
>> + c1->process_id == c2->process_id;
>> +}
>> +
>> +static MemTxResult riscv_iommu_mmio_read(void *opaque, hwaddr addr,
>> + uint64_t *data, unsigned size, MemTxAttrs attrs)
>> +{
>> + RISCVIOMMUState *s = opaque;
>> + uint64_t val = -1;
>> + uint8_t *ptr;
>> +
>> + if ((addr & (size - 1)) != 0) {
>> + /* Unsupported MMIO alignment. */
>> + return MEMTX_ERROR;
>> + }
>> +
>> + if (addr + size > RISCV_IOMMU_REG_MSI_CONFIG) {
>> + return MEMTX_ACCESS_ERROR;
>> + }
>> +
>> + ptr = &s->regs_rw[addr];
>> +
>> + if (size == 1) {
>> + val = (uint64_t)*ptr;
>
> This looks very fishy. If we're reading 1 byte then cast
> the pointer to uint64_t* and read 8 possibly-misaligned bytes ??
>
> If you want to read one byte, then the way that parallels
> the other cases in this if() ladder is
> val = ldub_p(ptr);
>
>> + } else if (size == 2) {
>> + val = lduw_le_p(ptr);
>> + } else if (size == 4) {
>> + val = ldl_le_p(ptr);
>> + } else if (size == 8) {
>> + val = ldq_le_p(ptr);
>> + } else {
>> + return MEMTX_ERROR;
>> + }
>
> ...but you can replace the whole if ladder with
> val = ldn_le_p(ptr, size);
That's better.
>
> (There's a corresponding stn_le_p() if you need it.)
>
>> +
>> + *data = val;
>> +
>> + return MEMTX_OK;
>> +}
>> +
>> +static const MemoryRegionOps riscv_iommu_mmio_ops = {
>> + .read_with_attrs = riscv_iommu_mmio_read,
>> + .write_with_attrs = riscv_iommu_mmio_write,
>> + .endianness = DEVICE_NATIVE_ENDIAN,
>> + .impl = {
>> + .min_access_size = 4,
>> + .max_access_size = 8,
>
> But you've set your impl here to say that it can
> only handle 4 and 8 byte accesses. So why is the read
> function trying to handle 1 and 2 byte accesses?
The min and max access size was added during review and I removed the
1/2 byte accesses from mmio_write(), but it seems I forgot about
mmio_read().
I'll fix it and re-send. Thanks,
Daniel
>
>> + .unaligned = false,
>> + },
>> + .valid = {
>> + .min_access_size = 4,
>> + .max_access_size = 8,
>> + }
>> +};
>> +
>
> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-28 21:01 ` Daniel Henrique Barboza
@ 2024-09-29 15:46 ` Peter Maydell
2024-10-01 22:14 ` Tomasz Jeznach
0 siblings, 1 reply; 68+ messages in thread
From: Peter Maydell @ 2024-09-29 15:46 UTC (permalink / raw)
To: Daniel Henrique Barboza
Cc: Alistair Francis, qemu-devel, Tomasz Jeznach, Sebastien Boeuf,
Alistair Francis
On Sat, 28 Sept 2024 at 22:01, Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 9/28/24 5:22 PM, Peter Maydell wrote:
> > On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
> >> +/* Register helper functions */
> >> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
> >> + unsigned idx, uint32_t set, uint32_t clr)
> >> +{
> >> + uint32_t val;
> >> + qemu_spin_lock(&s->regs_lock);
> >> + val = ldl_le_p(s->regs_rw + idx);
> >> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
> >> + qemu_spin_unlock(&s->regs_lock);
> >> + return val;
> >> +}
> >
> > This looks very weird. Nobody else's IOMMU implementation
> > grabs a spinlock while it is accessing guest register data.
> > Why is riscv special? Why a spinlock? (We use spinlocks
> > only very very sparingly in general.)
>
>
> The first versions of the IOMMU used qemu threads. I believe this is where
> the locks come from (both from registers and from the cache).
>
> I'm not sure if we're ever going to hit a race condition without the locks
> in the current code (i.e. using mmio ops only). I think I'll make an attempt
> to remove the locks and see if something breaks.
Generally access to the backing for guest registers in a
device model is protected by the fact that the iothread lock
(BQL) is held while the guest is accessing a device. I think for
iommu devices there may be some extra complication because they
get called as part of the translate-this-address codepath
where I think the BQL is not held, and so they typically
have some qemu_mutex to protect themselves[*]. But they
don't generally need/want to do the handling at this very
fine-grained per-read/write level.
[*] This is just what I've gathered from a quick read through
our other iommu implementations; don't take it as gospel.
If you do need to do something complicated with locking it would
be good to have a comment somewhere explaining why and what
the locking principles are.
thanks
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-29 15:46 ` Peter Maydell
@ 2024-10-01 22:14 ` Tomasz Jeznach
2024-10-01 23:00 ` Daniel Henrique Barboza
0 siblings, 1 reply; 68+ messages in thread
From: Tomasz Jeznach @ 2024-10-01 22:14 UTC (permalink / raw)
To: Peter Maydell
Cc: Daniel Henrique Barboza, Alistair Francis, qemu-devel,
Sebastien Boeuf, Alistair Francis
On Sun, Sep 29, 2024 at 8:46 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Sat, 28 Sept 2024 at 22:01, Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
> >
> >
> >
> > On 9/28/24 5:22 PM, Peter Maydell wrote:
> > > On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
>
> > >> +/* Register helper functions */
> > >> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
> > >> + unsigned idx, uint32_t set, uint32_t clr)
> > >> +{
> > >> + uint32_t val;
> > >> + qemu_spin_lock(&s->regs_lock);
> > >> + val = ldl_le_p(s->regs_rw + idx);
> > >> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
> > >> + qemu_spin_unlock(&s->regs_lock);
> > >> + return val;
> > >> +}
> > >
> > > This looks very weird. Nobody else's IOMMU implementation
> > > grabs a spinlock while it is accessing guest register data.
> > > Why is riscv special? Why a spinlock? (We use spinlocks
> > > only very very sparingly in general.)
> >
> >
> > The first versions of the IOMMU used qemu threads. I believe this is where
> > the locks come from (both from registers and from the cache).
> >
> > I'm not sure if we're ever going to hit a race condition without the locks
> > in the current code (i.e. using mmio ops only). I think I'll make an attempt
> > to remove the locks and see if something breaks.
>
> Generally access to the backing for guest registers in a
> device model is protected by the fact that the iothread lock
> (BQL) is held while the guest is accessing a device. I think for
> iommu devices there may be some extra complication because they
> get called as part of the translate-this-address codepath
> where I think the BQL is not held, and so they typically
> have some qemu_mutex to protect themselves[*]. But they
> don't generally need/want to do the handling at this very
> fine-grained per-read/write level.
>
> [*] This is just what I've gathered from a quick read through
> our other iommu implementations; don't take it as gospel.
>
> If you do need to do something complicated with locking it would
> be good to have a comment somewhere explaining why and what
> the locking principles are.
>
Locking for the iot and ctx cache is still required as the translation
requests going through IOMMUMemoryRegionClass.translate callback
(riscv_iommu_memory_region_translate) are unlikely protected by BQL.
With ATS translation requests and eventually PRI notifications coming
to the iommu implementation, cache structures will be modified without
any external locking guarantees.
To add some background to register access spin locks.
Original implementation runs all IOMMU command processing logic as a
separate thread, avoiding holding BQL while executing cache
invalidations and other commands. This was implemented to mimic real
hardware, where IOMMU operations are not blocking CPU execution.
During the review process it was suggested and modified to execute all
commands directly as part of the MMIO callback. Please note, that this
change has consequences of potentially holding CPU/BQL for unspecified
time as the command processing flow might be calling registered IOMMU
memory notifiers for ATS invalidations and page group responses.
Anyone implementing memory notifier callback will now execute with BQL
held. Also, updates to register state might happen during translation
fault reporting and or/pri. At least queue CSR/head/tail registers
updates will need some sort of protection.
Probably adding a comment in the code and maybe restoring the original
threading model would make this implementation more readable and less
problematic in the future.
Thanks for the review,
- Tomasz
> thanks
> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-10-01 22:14 ` Tomasz Jeznach
@ 2024-10-01 23:00 ` Daniel Henrique Barboza
2024-10-01 23:19 ` Tomasz Jeznach
0 siblings, 1 reply; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-10-01 23:00 UTC (permalink / raw)
To: Tomasz Jeznach, Peter Maydell
Cc: Alistair Francis, qemu-devel, Sebastien Boeuf, Alistair Francis,
frank.chang@sifive.com
On 10/1/24 7:14 PM, Tomasz Jeznach wrote:
> On Sun, Sep 29, 2024 at 8:46 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Sat, 28 Sept 2024 at 22:01, Daniel Henrique Barboza
>> <dbarboza@ventanamicro.com> wrote:
>>>
>>>
>>>
>>> On 9/28/24 5:22 PM, Peter Maydell wrote:
>>>> On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
>>
>>>>> +/* Register helper functions */
>>>>> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
>>>>> + unsigned idx, uint32_t set, uint32_t clr)
>>>>> +{
>>>>> + uint32_t val;
>>>>> + qemu_spin_lock(&s->regs_lock);
>>>>> + val = ldl_le_p(s->regs_rw + idx);
>>>>> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
>>>>> + qemu_spin_unlock(&s->regs_lock);
>>>>> + return val;
>>>>> +}
>>>>
>>>> This looks very weird. Nobody else's IOMMU implementation
>>>> grabs a spinlock while it is accessing guest register data.
>>>> Why is riscv special? Why a spinlock? (We use spinlocks
>>>> only very very sparingly in general.)
>>>
>>>
>>> The first versions of the IOMMU used qemu threads. I believe this is where
>>> the locks come from (both from registers and from the cache).
>>>
>>> I'm not sure if we're ever going to hit a race condition without the locks
>>> in the current code (i.e. using mmio ops only). I think I'll make an attempt
>>> to remove the locks and see if something breaks.
>>
>> Generally access to the backing for guest registers in a
>> device model is protected by the fact that the iothread lock
>> (BQL) is held while the guest is accessing a device. I think for
>> iommu devices there may be some extra complication because they
>> get called as part of the translate-this-address codepath
>> where I think the BQL is not held, and so they typically
>> have some qemu_mutex to protect themselves[*]. But they
>> don't generally need/want to do the handling at this very
>> fine-grained per-read/write level.
>>
>> [*] This is just what I've gathered from a quick read through
>> our other iommu implementations; don't take it as gospel.
>>
>> If you do need to do something complicated with locking it would
>> be good to have a comment somewhere explaining why and what
>> the locking principles are.
>>
>
> Locking for the iot and ctx cache is still required as the translation
> requests going through IOMMUMemoryRegionClass.translate callback
> (riscv_iommu_memory_region_translate) are unlikely protected by BQL.
> With ATS translation requests and eventually PRI notifications coming
> to the iommu implementation, cache structures will be modified without
> any external locking guarantees.
>
> To add some background to register access spin locks.
> Original implementation runs all IOMMU command processing logic as a
> separate thread, avoiding holding BQL while executing cache
> invalidations and other commands. This was implemented to mimic real
> hardware, where IOMMU operations are not blocking CPU execution.
> During the review process it was suggested and modified to execute all
> commands directly as part of the MMIO callback. Please note, that this
> change has consequences of potentially holding CPU/BQL for unspecified
> time as the command processing flow might be calling registered IOMMU
> memory notifiers for ATS invalidations and page group responses.
> Anyone implementing memory notifier callback will now execute with BQL
> held. Also, updates to register state might happen during translation
> fault reporting and or/pri. At least queue CSR/head/tail registers
> updates will need some sort of protection.
>
> Probably adding a comment in the code and maybe restoring the original
> threading model would make this implementation more readable and less
> problematic in the future.
I agree that the locks makes more sense in a threading model like it was
the case in v2. At this point it doesn't make much sense to have both
BQL and manual locks.
For context, we moved away from it back in v2, after this comment from
Frank Chang:
https://lore.kernel.org/qemu-riscv/CANzO1D35eYan8axod37tAg88r=qg4Jt0CVTvO+0AiwRLbbV64A@mail.gmail.com/
--------------
In our experience, using QEMU thread increases the latency of command
queue processing,
which leads to the potential IOMMU fence timeout in the Linux driver
when using IOMMU with KVM,
e.g. booting the guest Linux.
Is it possible to remove the thread from the IOMMU just like ARM, AMD,
and Intel IOMMU models?
--------------
Frank then elaborated further about IOFENCE and how they were hitting asserts
that were solved after moving the IOMMU away from a thread model.
For now I removed all locks, like Peter suggested, and in my testing with
riscv-iommu-pci and riscv-iommu-sys I didn't see anything bad happening.
BQL is granting enough race protection it seems.
I believe we should stick with this approach, instead of rolling back
to a threading model that Frank said back in v2 that could be problematic.
Nothing is stopping us from moving back to a threading model later if we
start hitting race conditions or even if we have performance improvements
in doing so. At that point we'll be more versed in how the IOMMU is going
to be used by everyone else too.
Thanks,
Daniel
>
> Thanks for the review,
> - Tomasz
>
>
>> thanks
>> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-10-01 23:00 ` Daniel Henrique Barboza
@ 2024-10-01 23:19 ` Tomasz Jeznach
0 siblings, 0 replies; 68+ messages in thread
From: Tomasz Jeznach @ 2024-10-01 23:19 UTC (permalink / raw)
To: Daniel Henrique Barboza
Cc: Peter Maydell, Alistair Francis, qemu-devel, Sebastien Boeuf,
Alistair Francis, frank.chang@sifive.com
On Tue, Oct 1, 2024 at 4:00 PM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 10/1/24 7:14 PM, Tomasz Jeznach wrote:
> > On Sun, Sep 29, 2024 at 8:46 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> >>
> >> On Sat, 28 Sept 2024 at 22:01, Daniel Henrique Barboza
> >> <dbarboza@ventanamicro.com> wrote:
> >>>
> >>>
> >>>
> >>> On 9/28/24 5:22 PM, Peter Maydell wrote:
> >>>> On Tue, 24 Sept 2024 at 23:19, Alistair Francis <alistair23@gmail.com> wrote:
> >>
> >>>>> +/* Register helper functions */
> >>>>> +static inline uint32_t riscv_iommu_reg_mod32(RISCVIOMMUState *s,
> >>>>> + unsigned idx, uint32_t set, uint32_t clr)
> >>>>> +{
> >>>>> + uint32_t val;
> >>>>> + qemu_spin_lock(&s->regs_lock);
> >>>>> + val = ldl_le_p(s->regs_rw + idx);
> >>>>> + stl_le_p(s->regs_rw + idx, (val & ~clr) | set);
> >>>>> + qemu_spin_unlock(&s->regs_lock);
> >>>>> + return val;
> >>>>> +}
> >>>>
> >>>> This looks very weird. Nobody else's IOMMU implementation
> >>>> grabs a spinlock while it is accessing guest register data.
> >>>> Why is riscv special? Why a spinlock? (We use spinlocks
> >>>> only very very sparingly in general.)
> >>>
> >>>
> >>> The first versions of the IOMMU used qemu threads. I believe this is where
> >>> the locks come from (both from registers and from the cache).
> >>>
> >>> I'm not sure if we're ever going to hit a race condition without the locks
> >>> in the current code (i.e. using mmio ops only). I think I'll make an attempt
> >>> to remove the locks and see if something breaks.
> >>
> >> Generally access to the backing for guest registers in a
> >> device model is protected by the fact that the iothread lock
> >> (BQL) is held while the guest is accessing a device. I think for
> >> iommu devices there may be some extra complication because they
> >> get called as part of the translate-this-address codepath
> >> where I think the BQL is not held, and so they typically
> >> have some qemu_mutex to protect themselves[*]. But they
> >> don't generally need/want to do the handling at this very
> >> fine-grained per-read/write level.
> >>
> >> [*] This is just what I've gathered from a quick read through
> >> our other iommu implementations; don't take it as gospel.
> >>
> >> If you do need to do something complicated with locking it would
> >> be good to have a comment somewhere explaining why and what
> >> the locking principles are.
> >>
> >
> > Locking for the iot and ctx cache is still required as the translation
> > requests going through IOMMUMemoryRegionClass.translate callback
> > (riscv_iommu_memory_region_translate) are unlikely protected by BQL.
> > With ATS translation requests and eventually PRI notifications coming
> > to the iommu implementation, cache structures will be modified without
> > any external locking guarantees.
> >
> > To add some background to register access spin locks.
> > Original implementation runs all IOMMU command processing logic as a
> > separate thread, avoiding holding BQL while executing cache
> > invalidations and other commands. This was implemented to mimic real
> > hardware, where IOMMU operations are not blocking CPU execution.
> > During the review process it was suggested and modified to execute all
> > commands directly as part of the MMIO callback. Please note, that this
> > change has consequences of potentially holding CPU/BQL for unspecified
> > time as the command processing flow might be calling registered IOMMU
> > memory notifiers for ATS invalidations and page group responses.
> > Anyone implementing memory notifier callback will now execute with BQL
> > held. Also, updates to register state might happen during translation
> > fault reporting and or/pri. At least queue CSR/head/tail registers
> > updates will need some sort of protection.
> >
> > Probably adding a comment in the code and maybe restoring the original
> > threading model would make this implementation more readable and less
> > problematic in the future.
>
> I agree that the locks makes more sense in a threading model like it was
> the case in v2. At this point it doesn't make much sense to have both
> BQL and manual locks.
>
Agree, it doesn't make sense to have both. My only concern now is
related to riscv_iommu_fault() and riscv_iommu_pri() updating related
queue CSR and interrupt status register (REG_IPSR). Other than that it
should be ok to rely on BQL.
> For context, we moved away from it back in v2, after this comment from
> Frank Chang:
>
> https://lore.kernel.org/qemu-riscv/CANzO1D35eYan8axod37tAg88r=qg4Jt0CVTvO+0AiwRLbbV64A@mail.gmail.com/
>
> --------------
> In our experience, using QEMU thread increases the latency of command
> queue processing,
> which leads to the potential IOMMU fence timeout in the Linux driver
> when using IOMMU with KVM,
> e.g. booting the guest Linux.
>
> Is it possible to remove the thread from the IOMMU just like ARM, AMD,
> and Intel IOMMU models?
> --------------
>
> Frank then elaborated further about IOFENCE and how they were hitting asserts
> that were solved after moving the IOMMU away from a thread model.
>
> For now I removed all locks, like Peter suggested, and in my testing with
> riscv-iommu-pci and riscv-iommu-sys I didn't see anything bad happening.
> BQL is granting enough race protection it seems.
>
> I believe we should stick with this approach, instead of rolling back
> to a threading model that Frank said back in v2 that could be problematic.
> Nothing is stopping us from moving back to a threading model later if we
> start hitting race conditions or even if we have performance improvements
> in doing so. At that point we'll be more versed in how the IOMMU is going
> to be used by everyone else too.
>
>
Thanks for bringing up this comment. Yes, definitely makes sense, it
will add latency (what I've found beneficial for driver side testing).
Both solutions have downsides. I'll have a closer look if the
consequences for ATS/PRI processing are severe enough to move back to
the threading model. Let's leave as is for now.
Thanks!
- Tomasz
> Thanks,
>
> Daniel
>
>
> >
> > Thanks for the review,
> > - Tomasz
> >
> >
> >> thanks
> >> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-09-24 22:17 ` [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation Alistair Francis
2024-09-28 20:22 ` Peter Maydell
@ 2024-10-01 22:24 ` Tomasz Jeznach
2024-10-01 23:15 ` Daniel Henrique Barboza
1 sibling, 1 reply; 68+ messages in thread
From: Tomasz Jeznach @ 2024-10-01 22:24 UTC (permalink / raw)
To: Alistair Francis
Cc: qemu-devel, Sebastien Boeuf, Daniel Henrique Barboza,
Alistair Francis
On Tue, Sep 24, 2024 at 3:18 PM Alistair Francis <alistair23@gmail.com> wrote:
> +
> +/* IOMMU index for transactions without process_id specified. */
> +#define RISCV_IOMMU_NOPROCID 0
> +
> +static uint8_t riscv_iommu_get_icvec_vector(uint32_t icvec, uint32_t vec_type)
> +{
> + switch (vec_type) {
> + case RISCV_IOMMU_INTR_CQ:
> + return icvec & RISCV_IOMMU_ICVEC_CIV;
> + case RISCV_IOMMU_INTR_FQ:
> + return icvec & RISCV_IOMMU_ICVEC_FIV >> 4;
Please add missing brackets to fix operator ordering bug. Here and for
PM, PQ cases.
It should be: return (icvec & RISCV_IOMMU_ICVEC_FIV) >> 4
> + case RISCV_IOMMU_INTR_PM:
> + return icvec & RISCV_IOMMU_ICVEC_PMIV >> 8;
> + case RISCV_IOMMU_INTR_PQ:
> + return icvec & RISCV_IOMMU_ICVEC_PIV >> 12;
> + default:
> + g_assert_not_reached();
> + }
> +}
> +
thanks,
- Tomasz
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation
2024-10-01 22:24 ` Tomasz Jeznach
@ 2024-10-01 23:15 ` Daniel Henrique Barboza
0 siblings, 0 replies; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-10-01 23:15 UTC (permalink / raw)
To: Tomasz Jeznach, Alistair Francis
Cc: qemu-devel, Sebastien Boeuf, Alistair Francis
On 10/1/24 7:24 PM, Tomasz Jeznach wrote:
> On Tue, Sep 24, 2024 at 3:18 PM Alistair Francis <alistair23@gmail.com> wrote:
>
>> +
>> +/* IOMMU index for transactions without process_id specified. */
>> +#define RISCV_IOMMU_NOPROCID 0
>> +
>> +static uint8_t riscv_iommu_get_icvec_vector(uint32_t icvec, uint32_t vec_type)
>> +{
>> + switch (vec_type) {
>> + case RISCV_IOMMU_INTR_CQ:
>> + return icvec & RISCV_IOMMU_ICVEC_CIV;
>> + case RISCV_IOMMU_INTR_FQ:
>> + return icvec & RISCV_IOMMU_ICVEC_FIV >> 4;
>
> Please add missing brackets to fix operator ordering bug. Here and for
> PM, PQ cases.
> It should be: return (icvec & RISCV_IOMMU_ICVEC_FIV) >> 4
Nice catch. Changed in v8. Thanks,
Daniel
>
>> + case RISCV_IOMMU_INTR_PM:
>> + return icvec & RISCV_IOMMU_ICVEC_PMIV >> 8;
>> + case RISCV_IOMMU_INTR_PQ:
>> + return icvec & RISCV_IOMMU_ICVEC_PIV >> 12;
>> + default:
>> + g_assert_not_reached();
>> + }
>> +}
>> +
>
> thanks,
> - Tomasz
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PULL v2 17/47] pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (15 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 16/47] hw/riscv: add RISC-V IOMMU base emulation Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 18/47] hw/riscv: add riscv-iommu-pci reference device Alistair Francis
` (30 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Daniel Henrique Barboza, Gerd Hoffmann, Frank Chang,
Alistair Francis
From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
The RISC-V IOMMU PCI device we're going to add next is a reference
implementation of the riscv-iommu spec [1], which predicts that the
IOMMU can be implemented as a PCIe device.
However, RISC-V International (RVI), the entity that ratified the
riscv-iommu spec, didn't bother assigning a PCI ID for this IOMMU PCIe
implementation that the spec predicts. This puts us in an uncommon
situation because we want to add the reference IOMMU PCIe implementation
but we don't have a PCI ID for it.
Given that RVI doesn't provide a PCI ID for it we reached out to Red Hat
and Gerd Hoffman, and they were kind enough to give us a PCI ID for the
RISC-V IOMMU PCI reference device.
Thanks Red Hat and Gerd for this RISC-V IOMMU PCIe device ID.
[1] https://github.com/riscv-non-isa/riscv-iommu/releases/tag/v1.0.0
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240903201633.93182-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
docs/specs/pci-ids.rst | 2 ++
include/hw/pci/pci.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/docs/specs/pci-ids.rst b/docs/specs/pci-ids.rst
index 328ab31fe8..261b0f359f 100644
--- a/docs/specs/pci-ids.rst
+++ b/docs/specs/pci-ids.rst
@@ -98,6 +98,8 @@ PCI devices (other than virtio):
PCI ACPI ERST device (``-device acpi-erst``)
1b36:0013
PCI UFS device (``-device ufs``)
+1b36:0014
+ PCI RISC-V IOMMU device
All these devices are documented in :doc:`index`.
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index eb26cac810..35d4fe0bbf 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -116,6 +116,7 @@ extern bool pci_available;
#define PCI_DEVICE_ID_REDHAT_PVPANIC 0x0011
#define PCI_DEVICE_ID_REDHAT_ACPI_ERST 0x0012
#define PCI_DEVICE_ID_REDHAT_UFS 0x0013
+#define PCI_DEVICE_ID_REDHAT_RISCV_IOMMU 0x0014
#define PCI_DEVICE_ID_REDHAT_QXL 0x0100
#define FMT_PCIBUS PRIx64
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 18/47] hw/riscv: add riscv-iommu-pci reference device
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (16 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 17/47] pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 19/47] hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug Alistair Francis
` (29 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
The RISC-V IOMMU can be modelled as a PCIe device following the
guidelines of the RISC-V IOMMU spec, chapter 7.1, "Integrating an IOMMU
as a PCIe device".
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/riscv-iommu-pci.c | 202 +++++++++++++++++++++++++++++++++++++
hw/riscv/meson.build | 2 +-
2 files changed, 203 insertions(+), 1 deletion(-)
create mode 100644 hw/riscv/riscv-iommu-pci.c
diff --git a/hw/riscv/riscv-iommu-pci.c b/hw/riscv/riscv-iommu-pci.c
new file mode 100644
index 0000000000..6be6bc7383
--- /dev/null
+++ b/hw/riscv/riscv-iommu-pci.c
@@ -0,0 +1,202 @@
+/*
+ * QEMU emulation of an RISC-V IOMMU
+ *
+ * Copyright (C) 2022-2023 Rivos Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/qdev-properties.h"
+#include "hw/riscv/riscv_hart.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/host-utils.h"
+#include "qom/object.h"
+
+#include "cpu_bits.h"
+#include "riscv-iommu.h"
+#include "riscv-iommu-bits.h"
+
+/* RISC-V IOMMU PCI Device Emulation */
+#define RISCV_PCI_CLASS_SYSTEM_IOMMU 0x0806
+
+/*
+ * 4 MSIx vectors for ICVEC, one for MRIF. The spec mentions in
+ * the "Placement and data flow" section that:
+ *
+ * "The interfaces related to recording an incoming MSI in a memory-resident
+ * interrupt file (MRIF) are implementation-specific. The partitioning of
+ * responsibility between the IOMMU and the IO bridge for recording the
+ * incoming MSI in an MRIF and generating the associated notice MSI are
+ * implementation-specific."
+ *
+ * We're making a design decision to create the MSIx for MRIF in the
+ * IOMMU MSIx emulation.
+ */
+#define RISCV_IOMMU_PCI_MSIX_VECTORS 5
+
+/*
+ * 4 vectors that can be used by civ, fiv, pmiv and piv. Number of
+ * vectors is represented by 2^N, where N = number of writable bits
+ * in each cause. For 4 vectors we'll write 0b11 (3) in each reg.
+ */
+#define RISCV_IOMMU_PCI_ICVEC_VECTORS 0x3333
+
+typedef struct RISCVIOMMUStatePci {
+ PCIDevice pci; /* Parent PCIe device state */
+ uint16_t vendor_id;
+ uint16_t device_id;
+ uint8_t revision;
+ MemoryRegion bar0; /* PCI BAR (including MSI-x config) */
+ RISCVIOMMUState iommu; /* common IOMMU state */
+} RISCVIOMMUStatePci;
+
+/* interrupt delivery callback */
+static void riscv_iommu_pci_notify(RISCVIOMMUState *iommu, unsigned vector)
+{
+ RISCVIOMMUStatePci *s = container_of(iommu, RISCVIOMMUStatePci, iommu);
+
+ if (msix_enabled(&(s->pci))) {
+ msix_notify(&(s->pci), vector);
+ }
+}
+
+static void riscv_iommu_pci_realize(PCIDevice *dev, Error **errp)
+{
+ RISCVIOMMUStatePci *s = DO_UPCAST(RISCVIOMMUStatePci, pci, dev);
+ RISCVIOMMUState *iommu = &s->iommu;
+ uint8_t *pci_conf = dev->config;
+ Error *err = NULL;
+
+ pci_set_word(pci_conf + PCI_VENDOR_ID, s->vendor_id);
+ pci_set_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID, s->vendor_id);
+ pci_set_word(pci_conf + PCI_DEVICE_ID, s->device_id);
+ pci_set_word(pci_conf + PCI_SUBSYSTEM_ID, s->device_id);
+ pci_set_byte(pci_conf + PCI_REVISION_ID, s->revision);
+
+ /* Set device id for trace / debug */
+ DEVICE(iommu)->id = g_strdup_printf("%02x:%02x.%01x",
+ pci_dev_bus_num(dev), PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
+ qdev_realize(DEVICE(iommu), NULL, errp);
+
+ memory_region_init(&s->bar0, OBJECT(s), "riscv-iommu-bar0",
+ QEMU_ALIGN_UP(memory_region_size(&iommu->regs_mr), TARGET_PAGE_SIZE));
+ memory_region_add_subregion(&s->bar0, 0, &iommu->regs_mr);
+
+ pcie_endpoint_cap_init(dev, 0);
+
+ pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar0);
+
+ int ret = msix_init(dev, RISCV_IOMMU_PCI_MSIX_VECTORS,
+ &s->bar0, 0, RISCV_IOMMU_REG_MSI_CONFIG,
+ &s->bar0, 0, RISCV_IOMMU_REG_MSI_CONFIG + 256, 0, &err);
+
+ if (ret == -ENOTSUP) {
+ /*
+ * MSI-x is not supported by the platform.
+ * Driver should use timer/polling based notification handlers.
+ */
+ warn_report_err(err);
+ } else if (ret < 0) {
+ error_propagate(errp, err);
+ return;
+ } else {
+ /* Mark all ICVEC MSIx vectors as used */
+ for (int i = 0; i < RISCV_IOMMU_PCI_MSIX_VECTORS; i++) {
+ msix_vector_use(dev, i);
+ }
+
+ iommu->notify = riscv_iommu_pci_notify;
+ }
+
+ PCIBus *bus = pci_device_root_bus(dev);
+ if (!bus) {
+ error_setg(errp, "can't find PCIe root port for %02x:%02x.%x",
+ pci_bus_num(pci_get_bus(dev)), PCI_SLOT(dev->devfn),
+ PCI_FUNC(dev->devfn));
+ return;
+ }
+
+ riscv_iommu_pci_setup_iommu(iommu, bus, errp);
+}
+
+static void riscv_iommu_pci_exit(PCIDevice *pci_dev)
+{
+ pci_setup_iommu(pci_device_root_bus(pci_dev), NULL, NULL);
+}
+
+static const VMStateDescription riscv_iommu_vmstate = {
+ .name = "riscv-iommu",
+ .unmigratable = 1
+};
+
+static void riscv_iommu_pci_init(Object *obj)
+{
+ RISCVIOMMUStatePci *s = RISCV_IOMMU_PCI(obj);
+ RISCVIOMMUState *iommu = &s->iommu;
+
+ object_initialize_child(obj, "iommu", iommu, TYPE_RISCV_IOMMU);
+ qdev_alias_all_properties(DEVICE(iommu), obj);
+
+ iommu->icvec_avail_vectors = RISCV_IOMMU_PCI_ICVEC_VECTORS;
+}
+
+static Property riscv_iommu_pci_properties[] = {
+ DEFINE_PROP_UINT16("vendor-id", RISCVIOMMUStatePci, vendor_id,
+ PCI_VENDOR_ID_REDHAT),
+ DEFINE_PROP_UINT16("device-id", RISCVIOMMUStatePci, device_id,
+ PCI_DEVICE_ID_REDHAT_RISCV_IOMMU),
+ DEFINE_PROP_UINT8("revision", RISCVIOMMUStatePci, revision, 0x01),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void riscv_iommu_pci_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+ k->realize = riscv_iommu_pci_realize;
+ k->exit = riscv_iommu_pci_exit;
+ k->class_id = RISCV_PCI_CLASS_SYSTEM_IOMMU;
+ dc->desc = "RISCV-IOMMU DMA Remapping device";
+ dc->vmsd = &riscv_iommu_vmstate;
+ dc->hotpluggable = false;
+ dc->user_creatable = true;
+ set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+ device_class_set_props(dc, riscv_iommu_pci_properties);
+}
+
+static const TypeInfo riscv_iommu_pci = {
+ .name = TYPE_RISCV_IOMMU_PCI,
+ .parent = TYPE_PCI_DEVICE,
+ .class_init = riscv_iommu_pci_class_init,
+ .instance_init = riscv_iommu_pci_init,
+ .instance_size = sizeof(RISCVIOMMUStatePci),
+ .interfaces = (InterfaceInfo[]) {
+ { INTERFACE_PCIE_DEVICE },
+ { },
+ },
+};
+
+static void riscv_iommu_register_pci_types(void)
+{
+ type_register_static(&riscv_iommu_pci);
+}
+
+type_init(riscv_iommu_register_pci_types);
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index cbc99c6e8e..adbef8a9b2 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -10,6 +10,6 @@ riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
riscv_ss.add(when: 'CONFIG_ACPI', if_true: files('virt-acpi-build.c'))
-riscv_ss.add(when: 'CONFIG_RISCV_IOMMU', if_true: files('riscv-iommu.c'))
+riscv_ss.add(when: 'CONFIG_RISCV_IOMMU', if_true: files('riscv-iommu.c', 'riscv-iommu-pci.c'))
hw_arch += {'riscv': riscv_ss}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 19/47] hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (17 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 18/47] hw/riscv: add riscv-iommu-pci reference device Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 20/47] test/qtest: add riscv-iommu-pci tests Alistair Francis
` (28 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
Generate device tree entry for riscv-iommu PCI device, along with
mapping all PCI device identifiers to the single IOMMU device instance.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/virt.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index cef41c150a..63d553ac98 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -32,6 +32,7 @@
#include "hw/core/sysbus-fdt.h"
#include "target/riscv/pmu.h"
#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/iommu.h"
#include "hw/riscv/virt.h"
#include "hw/riscv/boot.h"
#include "hw/riscv/numa.h"
@@ -1032,6 +1033,30 @@ static void create_fdt_virtio_iommu(RISCVVirtState *s, uint16_t bdf)
bdf + 1, iommu_phandle, bdf + 1, 0xffff - bdf);
}
+static void create_fdt_iommu(RISCVVirtState *s, uint16_t bdf)
+{
+ const char comp[] = "riscv,pci-iommu";
+ void *fdt = MACHINE(s)->fdt;
+ uint32_t iommu_phandle;
+ g_autofree char *iommu_node = NULL;
+ g_autofree char *pci_node = NULL;
+
+ pci_node = g_strdup_printf("/soc/pci@%lx",
+ (long) virt_memmap[VIRT_PCIE_ECAM].base);
+ iommu_node = g_strdup_printf("%s/iommu@%x", pci_node, bdf);
+ iommu_phandle = qemu_fdt_alloc_phandle(fdt);
+ qemu_fdt_add_subnode(fdt, iommu_node);
+
+ qemu_fdt_setprop(fdt, iommu_node, "compatible", comp, sizeof(comp));
+ qemu_fdt_setprop_cell(fdt, iommu_node, "#iommu-cells", 1);
+ qemu_fdt_setprop_cell(fdt, iommu_node, "phandle", iommu_phandle);
+ qemu_fdt_setprop_cells(fdt, iommu_node, "reg",
+ bdf << 8, 0, 0, 0, 0);
+ qemu_fdt_setprop_cells(fdt, pci_node, "iommu-map",
+ 0, iommu_phandle, 0, bdf,
+ bdf + 1, iommu_phandle, bdf + 1, 0xffff - bdf);
+}
+
static void finalize_fdt(RISCVVirtState *s)
{
uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
@@ -1738,9 +1763,11 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
MachineClass *mc = MACHINE_GET_CLASS(machine);
if (device_is_dynamic_sysbus(mc, dev) ||
- object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+ object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
+ object_dynamic_cast(OBJECT(dev), TYPE_RISCV_IOMMU_PCI)) {
return HOTPLUG_HANDLER(machine);
}
+
return NULL;
}
@@ -1761,6 +1788,10 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
create_fdt_virtio_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
}
+
+ if (object_dynamic_cast(OBJECT(dev), TYPE_RISCV_IOMMU_PCI)) {
+ create_fdt_iommu(s, pci_get_bdf(PCI_DEVICE(dev)));
+ }
}
static void virt_machine_class_init(ObjectClass *oc, void *data)
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 20/47] test/qtest: add riscv-iommu-pci tests
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (18 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 19/47] hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 21/47] hw/riscv/riscv-iommu: add Address Translation Cache (IOATC) Alistair Francis
` (27 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
To test the RISC-V IOMMU emulation we'll use its PCI representation.
Create a new 'riscv-iommu-pci' libqos device that will be present with
CONFIG_RISCV_IOMMU. This config is only available for RISC-V, so this
device will only be consumed by the RISC-V libqos machine.
Start with basic tests: a PCI sanity check and a reset state register
test. The reset test was taken from the RISC-V IOMMU spec chapter 5.2,
"Reset behavior".
More tests will be added later.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
tests/qtest/libqos/riscv-iommu.h | 71 ++++++++++++++++++++++++
tests/qtest/libqos/riscv-iommu.c | 76 ++++++++++++++++++++++++++
tests/qtest/riscv-iommu-test.c | 93 ++++++++++++++++++++++++++++++++
tests/qtest/libqos/meson.build | 4 ++
tests/qtest/meson.build | 1 +
5 files changed, 245 insertions(+)
create mode 100644 tests/qtest/libqos/riscv-iommu.h
create mode 100644 tests/qtest/libqos/riscv-iommu.c
create mode 100644 tests/qtest/riscv-iommu-test.c
diff --git a/tests/qtest/libqos/riscv-iommu.h b/tests/qtest/libqos/riscv-iommu.h
new file mode 100644
index 0000000000..d123efb41f
--- /dev/null
+++ b/tests/qtest/libqos/riscv-iommu.h
@@ -0,0 +1,71 @@
+/*
+ * libqos driver riscv-iommu-pci framework
+ *
+ * Copyright (c) 2024 Ventana Micro Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version. See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef TESTS_LIBQOS_RISCV_IOMMU_H
+#define TESTS_LIBQOS_RISCV_IOMMU_H
+
+#include "qgraph.h"
+#include "pci.h"
+#include "qemu/bitops.h"
+
+#ifndef GENMASK_ULL
+#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
+#endif
+
+/*
+ * RISC-V IOMMU uses PCI_VENDOR_ID_REDHAT 0x1b36 and
+ * PCI_DEVICE_ID_REDHAT_RISCV_IOMMU 0x0014.
+ */
+#define RISCV_IOMMU_PCI_VENDOR_ID 0x1b36
+#define RISCV_IOMMU_PCI_DEVICE_ID 0x0014
+#define RISCV_IOMMU_PCI_DEVICE_CLASS 0x0806
+
+/* Common field positions */
+#define RISCV_IOMMU_QUEUE_ENABLE BIT(0)
+#define RISCV_IOMMU_QUEUE_INTR_ENABLE BIT(1)
+#define RISCV_IOMMU_QUEUE_MEM_FAULT BIT(8)
+#define RISCV_IOMMU_QUEUE_ACTIVE BIT(16)
+#define RISCV_IOMMU_QUEUE_BUSY BIT(17)
+
+#define RISCV_IOMMU_REG_CAP 0x0000
+#define RISCV_IOMMU_CAP_VERSION GENMASK_ULL(7, 0)
+
+#define RISCV_IOMMU_REG_DDTP 0x0010
+#define RISCV_IOMMU_DDTP_BUSY BIT_ULL(4)
+#define RISCV_IOMMU_DDTP_MODE GENMASK_ULL(3, 0)
+#define RISCV_IOMMU_DDTP_MODE_OFF 0
+
+#define RISCV_IOMMU_REG_CQCSR 0x0048
+#define RISCV_IOMMU_CQCSR_CQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_CQCSR_CIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_CQCSR_CQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_CQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+#define RISCV_IOMMU_REG_FQCSR 0x004C
+#define RISCV_IOMMU_FQCSR_FQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_FQCSR_FIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_FQCSR_FQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_FQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+#define RISCV_IOMMU_REG_PQCSR 0x0050
+#define RISCV_IOMMU_PQCSR_PQEN RISCV_IOMMU_QUEUE_ENABLE
+#define RISCV_IOMMU_PQCSR_PIE RISCV_IOMMU_QUEUE_INTR_ENABLE
+#define RISCV_IOMMU_PQCSR_PQON RISCV_IOMMU_QUEUE_ACTIVE
+#define RISCV_IOMMU_PQCSR_BUSY RISCV_IOMMU_QUEUE_BUSY
+
+#define RISCV_IOMMU_REG_IPSR 0x0054
+
+typedef struct QRISCVIOMMU {
+ QOSGraphObject obj;
+ QPCIDevice dev;
+ QPCIBar reg_bar;
+} QRISCVIOMMU;
+
+#endif
diff --git a/tests/qtest/libqos/riscv-iommu.c b/tests/qtest/libqos/riscv-iommu.c
new file mode 100644
index 0000000000..01e3b31c0b
--- /dev/null
+++ b/tests/qtest/libqos/riscv-iommu.c
@@ -0,0 +1,76 @@
+/*
+ * libqos driver riscv-iommu-pci framework
+ *
+ * Copyright (c) 2024 Ventana Micro Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version. See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "../libqtest.h"
+#include "qemu/module.h"
+#include "qgraph.h"
+#include "pci.h"
+#include "riscv-iommu.h"
+
+static void *riscv_iommu_pci_get_driver(void *obj, const char *interface)
+{
+ QRISCVIOMMU *r_iommu_pci = obj;
+
+ if (!g_strcmp0(interface, "pci-device")) {
+ return &r_iommu_pci->dev;
+ }
+
+ fprintf(stderr, "%s not present in riscv_iommu_pci\n", interface);
+ g_assert_not_reached();
+}
+
+static void riscv_iommu_pci_start_hw(QOSGraphObject *obj)
+{
+ QRISCVIOMMU *pci = (QRISCVIOMMU *)obj;
+ qpci_device_enable(&pci->dev);
+}
+
+static void riscv_iommu_pci_destructor(QOSGraphObject *obj)
+{
+ QRISCVIOMMU *pci = (QRISCVIOMMU *)obj;
+ qpci_iounmap(&pci->dev, pci->reg_bar);
+}
+
+static void *riscv_iommu_pci_create(void *pci_bus, QGuestAllocator *alloc,
+ void *addr)
+{
+ QRISCVIOMMU *r_iommu_pci = g_new0(QRISCVIOMMU, 1);
+ QPCIBus *bus = pci_bus;
+
+ qpci_device_init(&r_iommu_pci->dev, bus, addr);
+ r_iommu_pci->reg_bar = qpci_iomap(&r_iommu_pci->dev, 0, NULL);
+
+ r_iommu_pci->obj.get_driver = riscv_iommu_pci_get_driver;
+ r_iommu_pci->obj.start_hw = riscv_iommu_pci_start_hw;
+ r_iommu_pci->obj.destructor = riscv_iommu_pci_destructor;
+ return &r_iommu_pci->obj;
+}
+
+static void riscv_iommu_pci_register_nodes(void)
+{
+ QPCIAddress addr = {
+ .vendor_id = RISCV_IOMMU_PCI_VENDOR_ID,
+ .device_id = RISCV_IOMMU_PCI_DEVICE_ID,
+ .devfn = QPCI_DEVFN(1, 0),
+ };
+
+ QOSGraphEdgeOptions opts = {
+ .extra_device_opts = "addr=01.0",
+ };
+
+ add_qpci_address(&opts, &addr);
+
+ qos_node_create_driver("riscv-iommu-pci", riscv_iommu_pci_create);
+ qos_node_produces("riscv-iommu-pci", "pci-device");
+ qos_node_consumes("riscv-iommu-pci", "pci-bus", &opts);
+}
+
+libqos_init(riscv_iommu_pci_register_nodes);
diff --git a/tests/qtest/riscv-iommu-test.c b/tests/qtest/riscv-iommu-test.c
new file mode 100644
index 0000000000..7f0dbd0211
--- /dev/null
+++ b/tests/qtest/riscv-iommu-test.c
@@ -0,0 +1,93 @@
+/*
+ * QTest testcase for RISC-V IOMMU
+ *
+ * Copyright (c) 2024 Ventana Micro Systems Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version. See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+#include "qemu/module.h"
+#include "libqos/qgraph.h"
+#include "libqos/riscv-iommu.h"
+#include "hw/pci/pci_regs.h"
+
+static uint32_t riscv_iommu_read_reg32(QRISCVIOMMU *r_iommu, int reg_offset)
+{
+ uint32_t reg;
+
+ qpci_memread(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
+ ®, sizeof(reg));
+ return reg;
+}
+
+static uint64_t riscv_iommu_read_reg64(QRISCVIOMMU *r_iommu, int reg_offset)
+{
+ uint64_t reg;
+
+ qpci_memread(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
+ ®, sizeof(reg));
+ return reg;
+}
+
+static void test_pci_config(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+ QRISCVIOMMU *r_iommu = obj;
+ QPCIDevice *dev = &r_iommu->dev;
+ uint16_t vendorid, deviceid, classid;
+
+ vendorid = qpci_config_readw(dev, PCI_VENDOR_ID);
+ deviceid = qpci_config_readw(dev, PCI_DEVICE_ID);
+ classid = qpci_config_readw(dev, PCI_CLASS_DEVICE);
+
+ g_assert_cmpuint(vendorid, ==, RISCV_IOMMU_PCI_VENDOR_ID);
+ g_assert_cmpuint(deviceid, ==, RISCV_IOMMU_PCI_DEVICE_ID);
+ g_assert_cmpuint(classid, ==, RISCV_IOMMU_PCI_DEVICE_CLASS);
+}
+
+static void test_reg_reset(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+ QRISCVIOMMU *r_iommu = obj;
+ uint64_t cap;
+ uint32_t reg;
+
+ cap = riscv_iommu_read_reg64(r_iommu, RISCV_IOMMU_REG_CAP);
+ g_assert_cmpuint(cap & RISCV_IOMMU_CAP_VERSION, ==, 0x10);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_CQCSR);
+ g_assert_cmpuint(reg & RISCV_IOMMU_CQCSR_CQEN, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_CQCSR_CIE, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_CQCSR_CQON, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_CQCSR_BUSY, ==, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_FQCSR);
+ g_assert_cmpuint(reg & RISCV_IOMMU_FQCSR_FQEN, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_FQCSR_FIE, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_FQCSR_FQON, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_FQCSR_BUSY, ==, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_PQCSR);
+ g_assert_cmpuint(reg & RISCV_IOMMU_PQCSR_PQEN, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_PQCSR_PIE, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_PQCSR_PQON, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_PQCSR_BUSY, ==, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_DDTP);
+ g_assert_cmpuint(reg & RISCV_IOMMU_DDTP_BUSY, ==, 0);
+ g_assert_cmpuint(reg & RISCV_IOMMU_DDTP_MODE, ==,
+ RISCV_IOMMU_DDTP_MODE_OFF);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_IPSR);
+ g_assert_cmpuint(reg, ==, 0);
+}
+
+static void register_riscv_iommu_test(void)
+{
+ qos_add_test("pci_config", "riscv-iommu-pci", test_pci_config, NULL);
+ qos_add_test("reg_reset", "riscv-iommu-pci", test_reg_reset, NULL);
+}
+
+libqos_init(register_riscv_iommu_test);
diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
index 1b2b2dbb22..586fcacdc8 100644
--- a/tests/qtest/libqos/meson.build
+++ b/tests/qtest/libqos/meson.build
@@ -68,6 +68,10 @@ if have_virtfs
libqos_srcs += files('virtio-9p.c', 'virtio-9p-client.c')
endif
+if config_all_devices.has_key('CONFIG_RISCV_IOMMU')
+ libqos_srcs += files('riscv-iommu.c')
+endif
+
libqos = static_library('qos', libqos_srcs + genh,
build_by_default: false)
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 310865e49c..e5a54f342a 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -301,6 +301,7 @@ qos_test_ss.add(
'vmxnet3-test.c',
'igb-test.c',
'ufs-test.c',
+ 'riscv-iommu-test.c',
)
if config_all_devices.has_key('CONFIG_VIRTIO_SERIAL')
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 21/47] hw/riscv/riscv-iommu: add Address Translation Cache (IOATC)
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (19 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 20/47] test/qtest: add riscv-iommu-pci tests Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 22/47] hw/riscv/riscv-iommu: add ATS support Alistair Francis
` (26 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
The RISC-V IOMMU spec predicts that the IOMMU can use translation caches
to hold entries from the DDT. This includes implementation for all cache
commands that are marked as 'not implemented'.
There are some artifacts included in the cache that predicts s-stage and
g-stage elements, although we don't support it yet. We'll introduce them
next.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/riscv-iommu.h | 3 +
hw/riscv/riscv-iommu.c | 205 ++++++++++++++++++++++++++++++++++++++++-
2 files changed, 204 insertions(+), 4 deletions(-)
diff --git a/hw/riscv/riscv-iommu.h b/hw/riscv/riscv-iommu.h
index 95b4ce8d50..ddf5d50cf9 100644
--- a/hw/riscv/riscv-iommu.h
+++ b/hw/riscv/riscv-iommu.h
@@ -72,6 +72,9 @@ struct RISCVIOMMUState {
GHashTable *ctx_cache; /* Device translation Context Cache */
QemuMutex ctx_lock; /* Device translation Cache update lock */
+ GHashTable *iot_cache; /* IO Translated Address Cache */
+ QemuMutex iot_lock; /* IO TLB Cache update lock */
+ unsigned iot_limit; /* IO Translation Cache size limit */
/* MMIO Hardware Interface */
MemoryRegion regs_mr;
diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
index 061a5efe19..fe81072b1c 100644
--- a/hw/riscv/riscv-iommu.c
+++ b/hw/riscv/riscv-iommu.c
@@ -65,6 +65,16 @@ struct RISCVIOMMUContext {
uint64_t msiptp; /* MSI redirection page table pointer */
};
+/* Address translation cache entry */
+struct RISCVIOMMUEntry {
+ uint64_t iova:44; /* IOVA Page Number */
+ uint64_t pscid:20; /* Process Soft-Context identifier */
+ uint64_t phys:44; /* Physical Page Number */
+ uint64_t gscid:16; /* Guest Soft-Context identifier */
+ uint64_t perm:2; /* IOMMU_RW flags */
+ uint64_t __rfu:2;
+};
+
/* IOMMU index for transactions without process_id specified. */
#define RISCV_IOMMU_NOPROCID 0
@@ -1156,13 +1166,130 @@ static AddressSpace *riscv_iommu_space(RISCVIOMMUState *s, uint32_t devid)
return &as->iova_as;
}
+/* Translation Object cache support */
+static gboolean __iot_equal(gconstpointer v1, gconstpointer v2)
+{
+ RISCVIOMMUEntry *t1 = (RISCVIOMMUEntry *) v1;
+ RISCVIOMMUEntry *t2 = (RISCVIOMMUEntry *) v2;
+ return t1->gscid == t2->gscid && t1->pscid == t2->pscid &&
+ t1->iova == t2->iova;
+}
+
+static guint __iot_hash(gconstpointer v)
+{
+ RISCVIOMMUEntry *t = (RISCVIOMMUEntry *) v;
+ return (guint)t->iova;
+}
+
+/* GV: 1 PSCV: 1 AV: 1 */
+static void __iot_inval_pscid_iova(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUEntry *iot = (RISCVIOMMUEntry *) value;
+ RISCVIOMMUEntry *arg = (RISCVIOMMUEntry *) data;
+ if (iot->gscid == arg->gscid &&
+ iot->pscid == arg->pscid &&
+ iot->iova == arg->iova) {
+ iot->perm = IOMMU_NONE;
+ }
+}
+
+/* GV: 1 PSCV: 1 AV: 0 */
+static void __iot_inval_pscid(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUEntry *iot = (RISCVIOMMUEntry *) value;
+ RISCVIOMMUEntry *arg = (RISCVIOMMUEntry *) data;
+ if (iot->gscid == arg->gscid &&
+ iot->pscid == arg->pscid) {
+ iot->perm = IOMMU_NONE;
+ }
+}
+
+/* GV: 1 GVMA: 1 */
+static void __iot_inval_gscid_gpa(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUEntry *iot = (RISCVIOMMUEntry *) value;
+ RISCVIOMMUEntry *arg = (RISCVIOMMUEntry *) data;
+ if (iot->gscid == arg->gscid) {
+ /* simplified cache, no GPA matching */
+ iot->perm = IOMMU_NONE;
+ }
+}
+
+/* GV: 1 GVMA: 0 */
+static void __iot_inval_gscid(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUEntry *iot = (RISCVIOMMUEntry *) value;
+ RISCVIOMMUEntry *arg = (RISCVIOMMUEntry *) data;
+ if (iot->gscid == arg->gscid) {
+ iot->perm = IOMMU_NONE;
+ }
+}
+
+/* GV: 0 */
+static void __iot_inval_all(gpointer key, gpointer value, gpointer data)
+{
+ RISCVIOMMUEntry *iot = (RISCVIOMMUEntry *) value;
+ iot->perm = IOMMU_NONE;
+}
+
+/* caller should keep ref-count for iot_cache object */
+static RISCVIOMMUEntry *riscv_iommu_iot_lookup(RISCVIOMMUContext *ctx,
+ GHashTable *iot_cache, hwaddr iova)
+{
+ RISCVIOMMUEntry key = {
+ .gscid = get_field(ctx->gatp, RISCV_IOMMU_DC_IOHGATP_GSCID),
+ .pscid = get_field(ctx->ta, RISCV_IOMMU_DC_TA_PSCID),
+ .iova = PPN_DOWN(iova),
+ };
+ return g_hash_table_lookup(iot_cache, &key);
+}
+
+/* caller should keep ref-count for iot_cache object */
+static void riscv_iommu_iot_update(RISCVIOMMUState *s,
+ GHashTable *iot_cache, RISCVIOMMUEntry *iot)
+{
+ if (!s->iot_limit) {
+ return;
+ }
+
+ qemu_mutex_lock(&s->iot_lock);
+ if (g_hash_table_size(s->iot_cache) >= s->iot_limit) {
+ iot_cache = g_hash_table_new_full(__iot_hash, __iot_equal,
+ g_free, NULL);
+ g_hash_table_unref(qatomic_xchg(&s->iot_cache, iot_cache));
+ }
+ g_hash_table_add(iot_cache, iot);
+ qemu_mutex_unlock(&s->iot_lock);
+}
+
+static void riscv_iommu_iot_inval(RISCVIOMMUState *s, GHFunc func,
+ uint32_t gscid, uint32_t pscid, hwaddr iova)
+{
+ GHashTable *iot_cache;
+ RISCVIOMMUEntry key = {
+ .gscid = gscid,
+ .pscid = pscid,
+ .iova = PPN_DOWN(iova),
+ };
+
+ iot_cache = g_hash_table_ref(s->iot_cache);
+ qemu_mutex_lock(&s->iot_lock);
+ g_hash_table_foreach(iot_cache, func, &key);
+ qemu_mutex_unlock(&s->iot_lock);
+ g_hash_table_unref(iot_cache);
+}
+
static int riscv_iommu_translate(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
- IOMMUTLBEntry *iotlb)
+ IOMMUTLBEntry *iotlb, bool enable_cache)
{
+ RISCVIOMMUEntry *iot;
+ IOMMUAccessFlags perm;
bool enable_pid;
bool enable_pri;
+ GHashTable *iot_cache;
int fault;
+ iot_cache = g_hash_table_ref(s->iot_cache);
/*
* TC[32] is reserved for custom extensions, used here to temporarily
* enable automatic page-request generation for ATS queries.
@@ -1170,9 +1297,45 @@ static int riscv_iommu_translate(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
enable_pri = (iotlb->perm == IOMMU_NONE) && (ctx->tc & BIT_ULL(32));
enable_pid = (ctx->tc & RISCV_IOMMU_DC_TC_PDTV);
+ qemu_mutex_lock(&s->iot_lock);
+ iot = riscv_iommu_iot_lookup(ctx, iot_cache, iotlb->iova);
+ qemu_mutex_unlock(&s->iot_lock);
+ perm = iot ? iot->perm : IOMMU_NONE;
+ if (perm != IOMMU_NONE) {
+ iotlb->translated_addr = PPN_PHYS(iot->phys);
+ iotlb->addr_mask = ~TARGET_PAGE_MASK;
+ iotlb->perm = perm;
+ fault = 0;
+ goto done;
+ }
+
/* Translate using device directory / page table information. */
fault = riscv_iommu_spa_fetch(s, ctx, iotlb);
+ if (!fault && iotlb->target_as == &s->trap_as) {
+ /* Do not cache trapped MSI translations */
+ goto done;
+ }
+
+ /*
+ * We made an implementation choice to not cache identity-mapped
+ * translations, as allowed by the specification, to avoid
+ * translation cache evictions for other devices sharing the
+ * IOMMU hardware model.
+ */
+ if (!fault && iotlb->translated_addr != iotlb->iova && enable_cache) {
+ iot = g_new0(RISCVIOMMUEntry, 1);
+ iot->iova = PPN_DOWN(iotlb->iova);
+ iot->phys = PPN_DOWN(iotlb->translated_addr);
+ iot->gscid = get_field(ctx->gatp, RISCV_IOMMU_DC_IOHGATP_GSCID);
+ iot->pscid = get_field(ctx->ta, RISCV_IOMMU_DC_TA_PSCID);
+ iot->perm = iotlb->perm;
+ riscv_iommu_iot_update(s, iot_cache, iot);
+ }
+
+done:
+ g_hash_table_unref(iot_cache);
+
if (enable_pri && fault) {
struct riscv_iommu_pq_record pr = {0};
if (enable_pid) {
@@ -1312,13 +1475,40 @@ static void riscv_iommu_process_cq_tail(RISCVIOMMUState *s)
if (cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_PSCV) {
/* illegal command arguments IOTINVAL.GVMA & PSCV == 1 */
goto cmd_ill;
+ } else if (!(cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_GV)) {
+ /* invalidate all cache mappings */
+ func = __iot_inval_all;
+ } else if (!(cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_AV)) {
+ /* invalidate cache matching GSCID */
+ func = __iot_inval_gscid;
+ } else {
+ /* invalidate cache matching GSCID and ADDR (GPA) */
+ func = __iot_inval_gscid_gpa;
}
- /* translation cache not implemented yet */
+ riscv_iommu_iot_inval(s, func,
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IOTINVAL_GSCID), 0,
+ cmd.dword1 & TARGET_PAGE_MASK);
break;
case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IOTINVAL_FUNC_VMA,
RISCV_IOMMU_CMD_IOTINVAL_OPCODE):
- /* translation cache not implemented yet */
+ if (!(cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_GV)) {
+ /* invalidate all cache mappings, simplified model */
+ func = __iot_inval_all;
+ } else if (!(cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_PSCV)) {
+ /* invalidate cache matching GSCID, simplified model */
+ func = __iot_inval_gscid;
+ } else if (!(cmd.dword0 & RISCV_IOMMU_CMD_IOTINVAL_AV)) {
+ /* invalidate cache matching GSCID and PSCID */
+ func = __iot_inval_pscid;
+ } else {
+ /* invalidate cache matching GSCID and PSCID and ADDR (IOVA) */
+ func = __iot_inval_pscid_iova;
+ }
+ riscv_iommu_iot_inval(s, func,
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IOTINVAL_GSCID),
+ get_field(cmd.dword0, RISCV_IOMMU_CMD_IOTINVAL_PSCID),
+ cmd.dword1 & TARGET_PAGE_MASK);
break;
case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_IODIR_FUNC_INVAL_DDT,
@@ -1857,6 +2047,10 @@ static void riscv_iommu_realize(DeviceState *dev, Error **errp)
g_free, NULL);
qemu_mutex_init(&s->ctx_lock);
+ s->iot_cache = g_hash_table_new_full(__iot_hash, __iot_equal,
+ g_free, NULL);
+ qemu_mutex_init(&s->iot_lock);
+
s->iommus.le_next = NULL;
s->iommus.le_prev = NULL;
QLIST_INIT(&s->spaces);
@@ -1869,6 +2063,7 @@ static void riscv_iommu_unrealize(DeviceState *dev)
RISCVIOMMUState *s = RISCV_IOMMU(dev);
qemu_mutex_destroy(&s->core_lock);
+ g_hash_table_unref(s->iot_cache);
g_hash_table_unref(s->ctx_cache);
}
@@ -1876,6 +2071,8 @@ static Property riscv_iommu_properties[] = {
DEFINE_PROP_UINT32("version", RISCVIOMMUState, version,
RISCV_IOMMU_SPEC_DOT_VER),
DEFINE_PROP_UINT32("bus", RISCVIOMMUState, bus, 0x0),
+ DEFINE_PROP_UINT32("ioatc-limit", RISCVIOMMUState, iot_limit,
+ LIMIT_CACHE_IOT),
DEFINE_PROP_BOOL("intremap", RISCVIOMMUState, enable_msi, TRUE),
DEFINE_PROP_BOOL("off", RISCVIOMMUState, enable_off, TRUE),
DEFINE_PROP_BOOL("s-stage", RISCVIOMMUState, enable_s_stage, TRUE),
@@ -1930,7 +2127,7 @@ static IOMMUTLBEntry riscv_iommu_memory_region_translate(
/* Translation disabled or invalid. */
iotlb.addr_mask = 0;
iotlb.perm = IOMMU_NONE;
- } else if (riscv_iommu_translate(as->iommu, ctx, &iotlb)) {
+ } else if (riscv_iommu_translate(as->iommu, ctx, &iotlb, true)) {
/* Translation disabled or fault reported. */
iotlb.addr_mask = 0;
iotlb.perm = IOMMU_NONE;
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 22/47] hw/riscv/riscv-iommu: add ATS support
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (20 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 21/47] hw/riscv/riscv-iommu: add Address Translation Cache (IOATC) Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 23/47] hw/riscv/riscv-iommu: add DBG support Alistair Francis
` (25 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
Add PCIe Address Translation Services (ATS) capabilities to the IOMMU.
This will add support for ATS translation requests in Fault/Event
queues, Page-request queue and IOATC invalidations.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/riscv-iommu-bits.h | 43 +++++++++++-
hw/riscv/riscv-iommu.h | 1 +
hw/riscv/riscv-iommu.c | 129 +++++++++++++++++++++++++++++++++++-
hw/riscv/trace-events | 3 +
4 files changed, 173 insertions(+), 3 deletions(-)
diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
index b1c477f5c3..96a994b9aa 100644
--- a/hw/riscv/riscv-iommu-bits.h
+++ b/hw/riscv/riscv-iommu-bits.h
@@ -79,6 +79,7 @@ struct riscv_iommu_pq_record {
#define RISCV_IOMMU_CAP_SV57X4 BIT_ULL(19)
#define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
#define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
+#define RISCV_IOMMU_CAP_ATS BIT_ULL(25)
#define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
#define RISCV_IOMMU_CAP_IGS GENMASK_ULL(29, 28)
#define RISCV_IOMMU_CAP_PAS GENMASK_ULL(37, 32)
@@ -212,6 +213,7 @@ struct riscv_iommu_dc {
/* Translation control fields */
#define RISCV_IOMMU_DC_TC_V BIT_ULL(0)
+#define RISCV_IOMMU_DC_TC_EN_ATS BIT_ULL(1)
#define RISCV_IOMMU_DC_TC_EN_PRI BIT_ULL(2)
#define RISCV_IOMMU_DC_TC_T2GPA BIT_ULL(3)
#define RISCV_IOMMU_DC_TC_DTF BIT_ULL(4)
@@ -273,6 +275,20 @@ struct riscv_iommu_command {
#define RISCV_IOMMU_CMD_IODIR_DV BIT_ULL(33)
#define RISCV_IOMMU_CMD_IODIR_DID GENMASK_ULL(63, 40)
+/* 3.1.4 I/O MMU PCIe ATS */
+#define RISCV_IOMMU_CMD_ATS_OPCODE 4
+#define RISCV_IOMMU_CMD_ATS_FUNC_INVAL 0
+#define RISCV_IOMMU_CMD_ATS_FUNC_PRGR 1
+#define RISCV_IOMMU_CMD_ATS_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_CMD_ATS_PV BIT_ULL(32)
+#define RISCV_IOMMU_CMD_ATS_DSV BIT_ULL(33)
+#define RISCV_IOMMU_CMD_ATS_RID GENMASK_ULL(55, 40)
+#define RISCV_IOMMU_CMD_ATS_DSEG GENMASK_ULL(63, 56)
+/* dword1 is the ATS payload, two different payload types for INVAL and PRGR */
+
+/* ATS.PRGR payload */
+#define RISCV_IOMMU_CMD_ATS_PRGR_RESP_CODE GENMASK_ULL(47, 44)
+
enum riscv_iommu_dc_fsc_atp_modes {
RISCV_IOMMU_DC_FSC_MODE_BARE = 0,
RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV32 = 8,
@@ -339,7 +355,32 @@ enum riscv_iommu_fq_ttypes {
RISCV_IOMMU_FQ_TTYPE_TADDR_INST_FETCH = 5,
RISCV_IOMMU_FQ_TTYPE_TADDR_RD = 6,
RISCV_IOMMU_FQ_TTYPE_TADDR_WR = 7,
- RISCV_IOMMU_FW_TTYPE_PCIE_MSG_REQ = 8,
+ RISCV_IOMMU_FQ_TTYPE_PCIE_ATS_REQ = 8,
+ RISCV_IOMMU_FW_TTYPE_PCIE_MSG_REQ = 9,
+};
+
+/* Header fields */
+#define RISCV_IOMMU_PREQ_HDR_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_PREQ_HDR_PV BIT_ULL(32)
+#define RISCV_IOMMU_PREQ_HDR_PRIV BIT_ULL(33)
+#define RISCV_IOMMU_PREQ_HDR_EXEC BIT_ULL(34)
+#define RISCV_IOMMU_PREQ_HDR_DID GENMASK_ULL(63, 40)
+
+/* Payload fields */
+#define RISCV_IOMMU_PREQ_PAYLOAD_R BIT_ULL(0)
+#define RISCV_IOMMU_PREQ_PAYLOAD_W BIT_ULL(1)
+#define RISCV_IOMMU_PREQ_PAYLOAD_L BIT_ULL(2)
+#define RISCV_IOMMU_PREQ_PAYLOAD_M GENMASK_ULL(2, 0)
+#define RISCV_IOMMU_PREQ_PRG_INDEX GENMASK_ULL(11, 3)
+#define RISCV_IOMMU_PREQ_UADDR GENMASK_ULL(63, 12)
+
+
+/*
+ * struct riscv_iommu_msi_pte - MSI Page Table Entry
+ */
+struct riscv_iommu_msi_pte {
+ uint64_t pte;
+ uint64_t mrif_info;
};
/* Fields on pte */
diff --git a/hw/riscv/riscv-iommu.h b/hw/riscv/riscv-iommu.h
index ddf5d50cf9..9f15c2b139 100644
--- a/hw/riscv/riscv-iommu.h
+++ b/hw/riscv/riscv-iommu.h
@@ -39,6 +39,7 @@ struct RISCVIOMMUState {
bool enable_off; /* Enable out-of-reset OFF mode (DMA disabled) */
bool enable_msi; /* Enable MSI remapping */
+ bool enable_ats; /* Enable ATS support */
bool enable_s_stage; /* Enable S/VS-Stage translation */
bool enable_g_stage; /* Enable G-Stage translation */
diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
index fe81072b1c..50740442bb 100644
--- a/hw/riscv/riscv-iommu.c
+++ b/hw/riscv/riscv-iommu.c
@@ -665,6 +665,20 @@ static bool riscv_iommu_validate_device_ctx(RISCVIOMMUState *s,
RISCVIOMMUContext *ctx)
{
uint32_t fsc_mode, msi_mode;
+ uint64_t gatp;
+
+ if (!(s->cap & RISCV_IOMMU_CAP_ATS) &&
+ (ctx->tc & RISCV_IOMMU_DC_TC_EN_ATS ||
+ ctx->tc & RISCV_IOMMU_DC_TC_EN_PRI ||
+ ctx->tc & RISCV_IOMMU_DC_TC_PRPR)) {
+ return false;
+ }
+
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_EN_ATS) &&
+ (ctx->tc & RISCV_IOMMU_DC_TC_T2GPA ||
+ ctx->tc & RISCV_IOMMU_DC_TC_EN_PRI)) {
+ return false;
+ }
if (!(ctx->tc & RISCV_IOMMU_DC_TC_EN_PRI) &&
ctx->tc & RISCV_IOMMU_DC_TC_PRPR) {
@@ -685,6 +699,12 @@ static bool riscv_iommu_validate_device_ctx(RISCVIOMMUState *s,
}
}
+ gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
+ if (ctx->tc & RISCV_IOMMU_DC_TC_T2GPA &&
+ gatp == RISCV_IOMMU_DC_IOHGATP_MODE_BARE) {
+ return false;
+ }
+
fsc_mode = get_field(ctx->satp, RISCV_IOMMU_DC_FSC_MODE);
if (ctx->tc & RISCV_IOMMU_DC_TC_PDTV) {
@@ -835,7 +855,12 @@ static int riscv_iommu_ctx_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx)
RISCV_IOMMU_DC_IOHGATP_MODE_BARE);
ctx->satp = set_field(0, RISCV_IOMMU_ATP_MODE_FIELD,
RISCV_IOMMU_DC_FSC_MODE_BARE);
+
ctx->tc = RISCV_IOMMU_DC_TC_V;
+ if (s->enable_ats) {
+ ctx->tc |= RISCV_IOMMU_DC_TC_EN_ATS;
+ }
+
ctx->ta = 0;
ctx->msiptp = 0;
return 0;
@@ -1297,6 +1322,16 @@ static int riscv_iommu_translate(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
enable_pri = (iotlb->perm == IOMMU_NONE) && (ctx->tc & BIT_ULL(32));
enable_pid = (ctx->tc & RISCV_IOMMU_DC_TC_PDTV);
+ /* Check for ATS request. */
+ if (iotlb->perm == IOMMU_NONE) {
+ /* Check if ATS is disabled. */
+ if (!(ctx->tc & RISCV_IOMMU_DC_TC_EN_ATS)) {
+ enable_pri = false;
+ fault = RISCV_IOMMU_FQ_CAUSE_TTYPE_BLOCKED;
+ goto done;
+ }
+ }
+
qemu_mutex_lock(&s->iot_lock);
iot = riscv_iommu_iot_lookup(ctx, iot_cache, iotlb->iova);
qemu_mutex_unlock(&s->iot_lock);
@@ -1350,11 +1385,11 @@ done:
}
if (fault) {
- unsigned ttype;
+ unsigned ttype = RISCV_IOMMU_FQ_TTYPE_PCIE_ATS_REQ;
if (iotlb->perm & IOMMU_RW) {
ttype = RISCV_IOMMU_FQ_TTYPE_UADDR_WR;
- } else {
+ } else if (iotlb->perm & IOMMU_RO) {
ttype = RISCV_IOMMU_FQ_TTYPE_UADDR_RD;
}
@@ -1382,6 +1417,73 @@ static MemTxResult riscv_iommu_iofence(RISCVIOMMUState *s, bool notify,
MEMTXATTRS_UNSPECIFIED);
}
+static void riscv_iommu_ats(RISCVIOMMUState *s,
+ struct riscv_iommu_command *cmd, IOMMUNotifierFlag flag,
+ IOMMUAccessFlags perm,
+ void (*trace_fn)(const char *id))
+{
+ RISCVIOMMUSpace *as = NULL;
+ IOMMUNotifier *n;
+ IOMMUTLBEvent event;
+ uint32_t pid;
+ uint32_t devid;
+ const bool pv = cmd->dword0 & RISCV_IOMMU_CMD_ATS_PV;
+
+ if (cmd->dword0 & RISCV_IOMMU_CMD_ATS_DSV) {
+ /* Use device segment and requester id */
+ devid = get_field(cmd->dword0,
+ RISCV_IOMMU_CMD_ATS_DSEG | RISCV_IOMMU_CMD_ATS_RID);
+ } else {
+ devid = get_field(cmd->dword0, RISCV_IOMMU_CMD_ATS_RID);
+ }
+
+ pid = get_field(cmd->dword0, RISCV_IOMMU_CMD_ATS_PID);
+
+ qemu_mutex_lock(&s->core_lock);
+ QLIST_FOREACH(as, &s->spaces, list) {
+ if (as->devid == devid) {
+ break;
+ }
+ }
+ qemu_mutex_unlock(&s->core_lock);
+
+ if (!as || !as->notifier) {
+ return;
+ }
+
+ event.type = flag;
+ event.entry.perm = perm;
+ event.entry.target_as = s->target_as;
+
+ IOMMU_NOTIFIER_FOREACH(n, &as->iova_mr) {
+ if (!pv || n->iommu_idx == pid) {
+ event.entry.iova = n->start;
+ event.entry.addr_mask = n->end - n->start;
+ trace_fn(as->iova_mr.parent_obj.name);
+ memory_region_notify_iommu_one(n, &event);
+ }
+ }
+}
+
+static void riscv_iommu_ats_inval(RISCVIOMMUState *s,
+ struct riscv_iommu_command *cmd)
+{
+ return riscv_iommu_ats(s, cmd, IOMMU_NOTIFIER_DEVIOTLB_UNMAP, IOMMU_NONE,
+ trace_riscv_iommu_ats_inval);
+}
+
+static void riscv_iommu_ats_prgr(RISCVIOMMUState *s,
+ struct riscv_iommu_command *cmd)
+{
+ unsigned resp_code = get_field(cmd->dword1,
+ RISCV_IOMMU_CMD_ATS_PRGR_RESP_CODE);
+
+ /* Using the access flag to carry response code information */
+ IOMMUAccessFlags perm = resp_code ? IOMMU_NONE : IOMMU_RW;
+ return riscv_iommu_ats(s, cmd, IOMMU_NOTIFIER_MAP, perm,
+ trace_riscv_iommu_ats_prgr);
+}
+
static void riscv_iommu_process_ddtp(RISCVIOMMUState *s)
{
uint64_t old_ddtp = s->ddtp;
@@ -1537,6 +1639,25 @@ static void riscv_iommu_process_cq_tail(RISCVIOMMUState *s)
get_field(cmd.dword0, RISCV_IOMMU_CMD_IODIR_PID));
break;
+ /* ATS commands */
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_ATS_FUNC_INVAL,
+ RISCV_IOMMU_CMD_ATS_OPCODE):
+ if (!s->enable_ats) {
+ goto cmd_ill;
+ }
+
+ riscv_iommu_ats_inval(s, &cmd);
+ break;
+
+ case RISCV_IOMMU_CMD(RISCV_IOMMU_CMD_ATS_FUNC_PRGR,
+ RISCV_IOMMU_CMD_ATS_OPCODE):
+ if (!s->enable_ats) {
+ goto cmd_ill;
+ }
+
+ riscv_iommu_ats_prgr(s, &cmd);
+ break;
+
default:
cmd_ill:
/* Invalid instruction, do not advance instruction index. */
@@ -1962,6 +2083,9 @@ static void riscv_iommu_realize(DeviceState *dev, Error **errp)
if (s->enable_msi) {
s->cap |= RISCV_IOMMU_CAP_MSI_FLAT | RISCV_IOMMU_CAP_MSI_MRIF;
}
+ if (s->enable_ats) {
+ s->cap |= RISCV_IOMMU_CAP_ATS;
+ }
if (s->enable_s_stage) {
s->cap |= RISCV_IOMMU_CAP_SV32 | RISCV_IOMMU_CAP_SV39 |
RISCV_IOMMU_CAP_SV48 | RISCV_IOMMU_CAP_SV57;
@@ -2074,6 +2198,7 @@ static Property riscv_iommu_properties[] = {
DEFINE_PROP_UINT32("ioatc-limit", RISCVIOMMUState, iot_limit,
LIMIT_CACHE_IOT),
DEFINE_PROP_BOOL("intremap", RISCVIOMMUState, enable_msi, TRUE),
+ DEFINE_PROP_BOOL("ats", RISCVIOMMUState, enable_ats, TRUE),
DEFINE_PROP_BOOL("off", RISCVIOMMUState, enable_off, TRUE),
DEFINE_PROP_BOOL("s-stage", RISCVIOMMUState, enable_s_stage, TRUE),
DEFINE_PROP_BOOL("g-stage", RISCVIOMMUState, enable_g_stage, TRUE),
diff --git a/hw/riscv/trace-events b/hw/riscv/trace-events
index 3d5c33102d..0527c56c91 100644
--- a/hw/riscv/trace-events
+++ b/hw/riscv/trace-events
@@ -12,3 +12,6 @@ riscv_iommu_notifier_add(const char *id) "%s: dev-iotlb notifier added"
riscv_iommu_notifier_del(const char *id) "%s: dev-iotlb notifier removed"
riscv_iommu_notify_int_vector(uint32_t cause, uint32_t vector) "Interrupt cause 0x%x sent via vector 0x%x"
riscv_iommu_icvec_write(uint32_t orig, uint32_t actual) "ICVEC write: incoming 0x%x actual 0x%x"
+riscv_iommu_ats(const char *id, unsigned b, unsigned d, unsigned f, uint64_t iova) "%s: translate request %04x:%02x.%u iova: 0x%"PRIx64
+riscv_iommu_ats_inval(const char *id) "%s: dev-iotlb invalidate"
+riscv_iommu_ats_prgr(const char *id) "%s: dev-iotlb page request group response"
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 23/47] hw/riscv/riscv-iommu: add DBG support
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (21 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 22/47] hw/riscv/riscv-iommu: add ATS support Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 24/47] qtest/riscv-iommu-test: add init queues test Alistair Francis
` (24 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Tomasz Jeznach, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
DBG support adds three additional registers: tr_req_iova, tr_req_ctl and
tr_response.
The DBG cap is always enabled. No on/off toggle is provided for it.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/riscv/riscv-iommu-bits.h | 17 +++++++++++
hw/riscv/riscv-iommu.c | 59 +++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
index 96a994b9aa..6359ae0353 100644
--- a/hw/riscv/riscv-iommu-bits.h
+++ b/hw/riscv/riscv-iommu-bits.h
@@ -82,6 +82,7 @@ struct riscv_iommu_pq_record {
#define RISCV_IOMMU_CAP_ATS BIT_ULL(25)
#define RISCV_IOMMU_CAP_T2GPA BIT_ULL(26)
#define RISCV_IOMMU_CAP_IGS GENMASK_ULL(29, 28)
+#define RISCV_IOMMU_CAP_DBG BIT_ULL(31)
#define RISCV_IOMMU_CAP_PAS GENMASK_ULL(37, 32)
#define RISCV_IOMMU_CAP_PD8 BIT_ULL(38)
#define RISCV_IOMMU_CAP_PD17 BIT_ULL(39)
@@ -184,6 +185,22 @@ enum {
RISCV_IOMMU_INTR_COUNT
};
+/* 5.24 Translation request IOVA (64bits) */
+#define RISCV_IOMMU_REG_TR_REQ_IOVA 0x0258
+
+/* 5.25 Translation request control (64bits) */
+#define RISCV_IOMMU_REG_TR_REQ_CTL 0x0260
+#define RISCV_IOMMU_TR_REQ_CTL_GO_BUSY BIT_ULL(0)
+#define RISCV_IOMMU_TR_REQ_CTL_NW BIT_ULL(3)
+#define RISCV_IOMMU_TR_REQ_CTL_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_TR_REQ_CTL_DID GENMASK_ULL(63, 40)
+
+/* 5.26 Translation request response (64bits) */
+#define RISCV_IOMMU_REG_TR_RESPONSE 0x0268
+#define RISCV_IOMMU_TR_RESPONSE_FAULT BIT_ULL(0)
+#define RISCV_IOMMU_TR_RESPONSE_S BIT_ULL(9)
+#define RISCV_IOMMU_TR_RESPONSE_PPN RISCV_IOMMU_PPN_FIELD
+
/* 5.27 Interrupt cause to vector (64bits) */
#define RISCV_IOMMU_REG_ICVEC 0x02F8
#define RISCV_IOMMU_ICVEC_CIV GENMASK_ULL(3, 0)
diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
index 50740442bb..135f461ccf 100644
--- a/hw/riscv/riscv-iommu.c
+++ b/hw/riscv/riscv-iommu.c
@@ -1769,6 +1769,50 @@ static void riscv_iommu_process_pq_control(RISCVIOMMUState *s)
riscv_iommu_reg_mod32(s, RISCV_IOMMU_REG_PQCSR, ctrl_set, ctrl_clr);
}
+static void riscv_iommu_process_dbg(RISCVIOMMUState *s)
+{
+ uint64_t iova = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_TR_REQ_IOVA);
+ uint64_t ctrl = riscv_iommu_reg_get64(s, RISCV_IOMMU_REG_TR_REQ_CTL);
+ unsigned devid = get_field(ctrl, RISCV_IOMMU_TR_REQ_CTL_DID);
+ unsigned pid = get_field(ctrl, RISCV_IOMMU_TR_REQ_CTL_PID);
+ RISCVIOMMUContext *ctx;
+ void *ref;
+
+ if (!(ctrl & RISCV_IOMMU_TR_REQ_CTL_GO_BUSY)) {
+ return;
+ }
+
+ ctx = riscv_iommu_ctx(s, devid, pid, &ref);
+ if (ctx == NULL) {
+ riscv_iommu_reg_set64(s, RISCV_IOMMU_REG_TR_RESPONSE,
+ RISCV_IOMMU_TR_RESPONSE_FAULT |
+ (RISCV_IOMMU_FQ_CAUSE_DMA_DISABLED << 10));
+ } else {
+ IOMMUTLBEntry iotlb = {
+ .iova = iova,
+ .perm = ctrl & RISCV_IOMMU_TR_REQ_CTL_NW ? IOMMU_RO : IOMMU_RW,
+ .addr_mask = ~0,
+ .target_as = NULL,
+ };
+ int fault = riscv_iommu_translate(s, ctx, &iotlb, false);
+ if (fault) {
+ iova = RISCV_IOMMU_TR_RESPONSE_FAULT | (((uint64_t) fault) << 10);
+ } else {
+ iova = iotlb.translated_addr & ~iotlb.addr_mask;
+ iova >>= TARGET_PAGE_BITS;
+ iova &= RISCV_IOMMU_TR_RESPONSE_PPN;
+
+ /* We do not support superpages (> 4kbs) for now */
+ iova &= ~RISCV_IOMMU_TR_RESPONSE_S;
+ }
+ riscv_iommu_reg_set64(s, RISCV_IOMMU_REG_TR_RESPONSE, iova);
+ }
+
+ riscv_iommu_reg_mod64(s, RISCV_IOMMU_REG_TR_REQ_CTL, 0,
+ RISCV_IOMMU_TR_REQ_CTL_GO_BUSY);
+ riscv_iommu_ctx_put(s, ref);
+}
+
typedef void riscv_iommu_process_fn(RISCVIOMMUState *s);
static void riscv_iommu_update_icvec(RISCVIOMMUState *s, uint64_t data)
@@ -1922,6 +1966,12 @@ static MemTxResult riscv_iommu_mmio_write(void *opaque, hwaddr addr,
return MEMTX_OK;
+ case RISCV_IOMMU_REG_TR_REQ_CTL:
+ process_fn = riscv_iommu_process_dbg;
+ regb = RISCV_IOMMU_REG_TR_REQ_CTL;
+ busy = RISCV_IOMMU_TR_REQ_CTL_GO_BUSY;
+ break;
+
default:
break;
}
@@ -2094,6 +2144,9 @@ static void riscv_iommu_realize(DeviceState *dev, Error **errp)
s->cap |= RISCV_IOMMU_CAP_SV32X4 | RISCV_IOMMU_CAP_SV39X4 |
RISCV_IOMMU_CAP_SV48X4 | RISCV_IOMMU_CAP_SV57X4;
}
+ /* Enable translation debug interface */
+ s->cap |= RISCV_IOMMU_CAP_DBG;
+
/* Report QEMU target physical address space limits */
s->cap = set_field(s->cap, RISCV_IOMMU_CAP_PAS,
TARGET_PHYS_ADDR_SPACE_BITS);
@@ -2150,6 +2203,12 @@ static void riscv_iommu_realize(DeviceState *dev, Error **errp)
stl_le_p(&s->regs_wc[RISCV_IOMMU_REG_IPSR], ~0);
stl_le_p(&s->regs_ro[RISCV_IOMMU_REG_ICVEC], 0);
stq_le_p(&s->regs_rw[RISCV_IOMMU_REG_DDTP], s->ddtp);
+ /* If debug registers enabled. */
+ if (s->cap & RISCV_IOMMU_CAP_DBG) {
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_TR_REQ_IOVA], 0);
+ stq_le_p(&s->regs_ro[RISCV_IOMMU_REG_TR_REQ_CTL],
+ RISCV_IOMMU_TR_REQ_CTL_GO_BUSY);
+ }
/* Memory region for downstream access, if specified. */
if (s->target_mr) {
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 24/47] qtest/riscv-iommu-test: add init queues test
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (22 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 23/47] hw/riscv/riscv-iommu: add DBG support Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 25/47] docs/specs: add riscv-iommu Alistair Francis
` (23 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Daniel Henrique Barboza, Frank Chang,
Alistair Francis
From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Add an additional test to further exercise the IOMMU where we attempt to
initialize the command, fault and page-request queues.
These steps are taken from chapter 6.2 of the RISC-V IOMMU spec,
"Guidelines for initialization". It emulates what we expect from the
software/OS when initializing the IOMMU.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
tests/qtest/libqos/riscv-iommu.h | 30 ++++++++
tests/qtest/riscv-iommu-test.c | 127 +++++++++++++++++++++++++++++++
2 files changed, 157 insertions(+)
diff --git a/tests/qtest/libqos/riscv-iommu.h b/tests/qtest/libqos/riscv-iommu.h
index d123efb41f..318db13799 100644
--- a/tests/qtest/libqos/riscv-iommu.h
+++ b/tests/qtest/libqos/riscv-iommu.h
@@ -62,6 +62,36 @@
#define RISCV_IOMMU_REG_IPSR 0x0054
+#define RISCV_IOMMU_REG_IVEC 0x02F8
+#define RISCV_IOMMU_REG_IVEC_CIV GENMASK_ULL(3, 0)
+#define RISCV_IOMMU_REG_IVEC_FIV GENMASK_ULL(7, 4)
+#define RISCV_IOMMU_REG_IVEC_PMIV GENMASK_ULL(11, 8)
+#define RISCV_IOMMU_REG_IVEC_PIV GENMASK_ULL(15, 12)
+
+#define RISCV_IOMMU_REG_CQB 0x0018
+#define RISCV_IOMMU_CQB_PPN_START 10
+#define RISCV_IOMMU_CQB_PPN_LEN 44
+#define RISCV_IOMMU_CQB_LOG2SZ_START 0
+#define RISCV_IOMMU_CQB_LOG2SZ_LEN 5
+
+#define RISCV_IOMMU_REG_CQT 0x0024
+
+#define RISCV_IOMMU_REG_FQB 0x0028
+#define RISCV_IOMMU_FQB_PPN_START 10
+#define RISCV_IOMMU_FQB_PPN_LEN 44
+#define RISCV_IOMMU_FQB_LOG2SZ_START 0
+#define RISCV_IOMMU_FQB_LOG2SZ_LEN 5
+
+#define RISCV_IOMMU_REG_FQT 0x0034
+
+#define RISCV_IOMMU_REG_PQB 0x0038
+#define RISCV_IOMMU_PQB_PPN_START 10
+#define RISCV_IOMMU_PQB_PPN_LEN 44
+#define RISCV_IOMMU_PQB_LOG2SZ_START 0
+#define RISCV_IOMMU_PQB_LOG2SZ_LEN 5
+
+#define RISCV_IOMMU_REG_PQT 0x0044
+
typedef struct QRISCVIOMMU {
QOSGraphObject obj;
QPCIDevice dev;
diff --git a/tests/qtest/riscv-iommu-test.c b/tests/qtest/riscv-iommu-test.c
index 7f0dbd0211..c38a0a160d 100644
--- a/tests/qtest/riscv-iommu-test.c
+++ b/tests/qtest/riscv-iommu-test.c
@@ -33,6 +33,20 @@ static uint64_t riscv_iommu_read_reg64(QRISCVIOMMU *r_iommu, int reg_offset)
return reg;
}
+static void riscv_iommu_write_reg32(QRISCVIOMMU *r_iommu, int reg_offset,
+ uint32_t val)
+{
+ qpci_memwrite(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
+ &val, sizeof(val));
+}
+
+static void riscv_iommu_write_reg64(QRISCVIOMMU *r_iommu, int reg_offset,
+ uint64_t val)
+{
+ qpci_memwrite(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
+ &val, sizeof(val));
+}
+
static void test_pci_config(void *obj, void *data, QGuestAllocator *t_alloc)
{
QRISCVIOMMU *r_iommu = obj;
@@ -84,10 +98,123 @@ static void test_reg_reset(void *obj, void *data, QGuestAllocator *t_alloc)
g_assert_cmpuint(reg, ==, 0);
}
+/*
+ * Common timeout-based poll for CQCSR, FQCSR and PQCSR. All
+ * their ON bits are mapped as RISCV_IOMMU_QUEUE_ACTIVE (16),
+ */
+static void qtest_wait_for_queue_active(QRISCVIOMMU *r_iommu,
+ uint32_t queue_csr)
+{
+ QTestState *qts = global_qtest;
+ guint64 timeout_us = 2 * 1000 * 1000;
+ gint64 start_time = g_get_monotonic_time();
+ uint32_t reg;
+
+ for (;;) {
+ qtest_clock_step(qts, 100);
+
+ reg = riscv_iommu_read_reg32(r_iommu, queue_csr);
+ if (reg & RISCV_IOMMU_QUEUE_ACTIVE) {
+ break;
+ }
+ g_assert(g_get_monotonic_time() - start_time <= timeout_us);
+ }
+}
+
+/*
+ * Goes through the queue activation procedures of chapter 6.2,
+ * "Guidelines for initialization", of the RISCV-IOMMU spec.
+ */
+static void test_iommu_init_queues(void *obj, void *data,
+ QGuestAllocator *t_alloc)
+{
+ QRISCVIOMMU *r_iommu = obj;
+ uint64_t reg64, q_addr;
+ uint32_t reg;
+ int k = 2;
+
+ reg64 = riscv_iommu_read_reg64(r_iommu, RISCV_IOMMU_REG_CAP);
+ g_assert_cmpuint(reg64 & RISCV_IOMMU_CAP_VERSION, ==, 0x10);
+
+ /*
+ * Program the command queue. Write 0xF to civ, fiv, pmiv and
+ * piv. With the current PCI device impl we expect 2 writable
+ * bits for each (k = 2) since we have N = 4 total vectors (2^k).
+ */
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_IVEC, 0xFFFF);
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_IVEC);
+ g_assert_cmpuint(reg & RISCV_IOMMU_REG_IVEC_CIV, ==, 0x3);
+ g_assert_cmpuint(reg & RISCV_IOMMU_REG_IVEC_FIV, ==, 0x30);
+ g_assert_cmpuint(reg & RISCV_IOMMU_REG_IVEC_PMIV, ==, 0x300);
+ g_assert_cmpuint(reg & RISCV_IOMMU_REG_IVEC_PIV, ==, 0x3000);
+
+ /* Alloc a 4*16 bytes buffer and use it to set cqb */
+ q_addr = guest_alloc(t_alloc, 4 * 16);
+ reg64 = 0;
+ deposit64(reg64, RISCV_IOMMU_CQB_PPN_START,
+ RISCV_IOMMU_CQB_PPN_LEN, q_addr);
+ deposit64(reg64, RISCV_IOMMU_CQB_LOG2SZ_START,
+ RISCV_IOMMU_CQB_LOG2SZ_LEN, k - 1);
+ riscv_iommu_write_reg64(r_iommu, RISCV_IOMMU_REG_CQB, reg64);
+
+ /* cqt = 0, cqcsr.cqen = 1, poll cqcsr.cqon until it reads 1 */
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_CQT, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_CQCSR);
+ reg |= RISCV_IOMMU_CQCSR_CQEN;
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_CQCSR, reg);
+
+ qtest_wait_for_queue_active(r_iommu, RISCV_IOMMU_REG_CQCSR);
+
+ /*
+ * Program the fault queue. Alloc a 4*32 bytes (instead of 4*16)
+ * buffer and use it to set fqb.
+ */
+ q_addr = guest_alloc(t_alloc, 4 * 32);
+ reg64 = 0;
+ deposit64(reg64, RISCV_IOMMU_FQB_PPN_START,
+ RISCV_IOMMU_FQB_PPN_LEN, q_addr);
+ deposit64(reg64, RISCV_IOMMU_FQB_LOG2SZ_START,
+ RISCV_IOMMU_FQB_LOG2SZ_LEN, k - 1);
+ riscv_iommu_write_reg64(r_iommu, RISCV_IOMMU_REG_FQB, reg64);
+
+ /* fqt = 0, fqcsr.fqen = 1, poll fqcsr.fqon until it reads 1 */
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_FQT, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_FQCSR);
+ reg |= RISCV_IOMMU_FQCSR_FQEN;
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_FQCSR, reg);
+
+ qtest_wait_for_queue_active(r_iommu, RISCV_IOMMU_REG_FQCSR);
+
+ /*
+ * Program the page-request queue. Alloc a 4*16 bytes buffer
+ * and use it to set pqb.
+ */
+ q_addr = guest_alloc(t_alloc, 4 * 16);
+ reg64 = 0;
+ deposit64(reg64, RISCV_IOMMU_PQB_PPN_START,
+ RISCV_IOMMU_PQB_PPN_LEN, q_addr);
+ deposit64(reg64, RISCV_IOMMU_PQB_LOG2SZ_START,
+ RISCV_IOMMU_PQB_LOG2SZ_LEN, k - 1);
+ riscv_iommu_write_reg64(r_iommu, RISCV_IOMMU_REG_PQB, reg64);
+
+ /* pqt = 0, pqcsr.pqen = 1, poll pqcsr.pqon until it reads 1 */
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_PQT, 0);
+
+ reg = riscv_iommu_read_reg32(r_iommu, RISCV_IOMMU_REG_PQCSR);
+ reg |= RISCV_IOMMU_PQCSR_PQEN;
+ riscv_iommu_write_reg32(r_iommu, RISCV_IOMMU_REG_PQCSR, reg);
+
+ qtest_wait_for_queue_active(r_iommu, RISCV_IOMMU_REG_PQCSR);
+}
+
static void register_riscv_iommu_test(void)
{
qos_add_test("pci_config", "riscv-iommu-pci", test_pci_config, NULL);
qos_add_test("reg_reset", "riscv-iommu-pci", test_reg_reset, NULL);
+ qos_add_test("iommu_init_queues", "riscv-iommu-pci",
+ test_iommu_init_queues, NULL);
}
libqos_init(register_riscv_iommu_test);
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 25/47] docs/specs: add riscv-iommu
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (23 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 24/47] qtest/riscv-iommu-test: add init queues test Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 26/47] hw/riscv: Respect firmware ELF entry point Alistair Francis
` (22 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Daniel Henrique Barboza, Alistair Francis
From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Add a simple guideline to use the existing RISC-V IOMMU support we just
added.
This doc will be updated once we add the riscv-iommu-sys device.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240903201633.93182-13-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
docs/specs/index.rst | 1 +
docs/specs/riscv-iommu.rst | 90 ++++++++++++++++++++++++++++++++++++++
docs/system/riscv/virt.rst | 13 ++++++
3 files changed, 104 insertions(+)
create mode 100644 docs/specs/riscv-iommu.rst
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index 6495ed5ed9..ff5a1f03da 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -36,3 +36,4 @@ guest hardware that is specific to QEMU.
vmgenid
rapl-msr
rocker
+ riscv-iommu
diff --git a/docs/specs/riscv-iommu.rst b/docs/specs/riscv-iommu.rst
new file mode 100644
index 0000000000..463f4cffb6
--- /dev/null
+++ b/docs/specs/riscv-iommu.rst
@@ -0,0 +1,90 @@
+.. _riscv-iommu:
+
+RISC-V IOMMU support for RISC-V machines
+========================================
+
+QEMU implements a RISC-V IOMMU emulation based on the RISC-V IOMMU spec
+version 1.0 `iommu1.0`_.
+
+The emulation includes a PCI reference device, riscv-iommu-pci, that QEMU
+RISC-V boards can use. The 'virt' RISC-V machine is compatible with this
+device.
+
+riscv-iommu-pci reference device
+--------------------------------
+
+This device implements the RISC-V IOMMU emulation as recommended by the section
+"Integrating an IOMMU as a PCIe device" of `iommu1.0`_: a PCI device with base
+class 08h, sub-class 06h and programming interface 00h.
+
+As a reference device it doesn't implement anything outside of the specification,
+so it uses a generic default PCI ID given by QEMU: 1b36:0014.
+
+To include the device in the 'virt' machine:
+
+.. code-block:: bash
+
+ $ qemu-system-riscv64 -M virt -device riscv-iommu-pci,[optional_pci_opts] (...)
+
+This will add a RISC-V IOMMU PCI device in the board following any additional
+PCI parameters (like PCI bus address). The behavior of the RISC-V IOMMU is
+defined by the spec but its operation is OS dependent.
+
+As of this writing the existing Linux kernel support `linux-v8`_, not yet merged,
+does not have support for features like VFIO passthrough. The IOMMU emulation
+was tested using a public Ventana Micro Systems kernel repository in
+`ventana-linux`_. This kernel is based on `linux-v8`_ with additional patches that
+enable features like KVM VFIO passthrough with irqbypass. Until the kernel support
+is feature complete feel free to use the kernel available in the Ventana Micro Systems
+mirror.
+
+The current Linux kernel support will use the IOMMU device to create IOMMU groups
+with any eligible cards available in the system, regardless of factors such as the
+order in which the devices are added in the command line.
+
+This means that these command lines are equivalent as far as the current
+IOMMU kernel driver behaves:
+
+.. code-block:: bash
+
+ $ qemu-system-riscv64 \
+ -M virt,aia=aplic-imsic,aia-guests=5 \
+ -device riscv-iommu-pci,addr=1.0,vendor-id=0x1efd,device-id=0xedf1 \
+ -device e1000e,netdev=net1 -netdev user,id=net1,net=192.168.0.0/24 \
+ -device e1000e,netdev=net2 -netdev user,id=net2,net=192.168.200.0/24 \
+ (...)
+
+ $ qemu-system-riscv64 \
+ -M virt,aia=aplic-imsic,aia-guests=5 \
+ -device e1000e,netdev=net1 -netdev user,id=net1,net=192.168.0.0/24 \
+ -device e1000e,netdev=net2 -netdev user,id=net2,net=192.168.200.0/24 \
+ -device riscv-iommu-pci,addr=1.0,vendor-id=0x1efd,device-id=0xedf1 \
+ (...)
+
+Both will create iommu groups for the two e1000e cards.
+
+Another thing to notice on `linux-v8`_ and `ventana-linux`_ is that the kernel driver
+considers an IOMMU identified as a Rivos device, i.e. it uses Rivos vendor ID. To
+use the riscv-iommu-pci device with the existing kernel support we need to emulate
+a Rivos PCI IOMMU by setting 'vendor-id' and 'device-id':
+
+.. code-block:: bash
+
+ $ qemu-system-riscv64 -M virt \
+ -device riscv-iommu-pci,vendor-id=0x1efd,device-id=0xedf1 (...)
+
+Several options are available to control the capabilities of the device, namely:
+
+- "bus": the bus that the IOMMU device uses
+- "ioatc-limit": size of the Address Translation Cache (default to 2Mb)
+- "intremap": enable/disable MSI support
+- "ats": enable ATS support
+- "off" (Out-of-reset translation mode: 'on' for DMA disabled, 'off' for 'BARE' (passthrough))
+- "s-stage": enable s-stage support
+- "g-stage": enable g-stage support
+
+.. _iommu1.0: https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf
+
+.. _linux-v8: https://lore.kernel.org/linux-riscv/cover.1718388908.git.tjeznach@rivosinc.com/
+
+.. _ventana-linux: https://github.com/ventanamicro/linux/tree/dev-upstream
diff --git a/docs/system/riscv/virt.rst b/docs/system/riscv/virt.rst
index 9a06f95a34..8e9a2e4dda 100644
--- a/docs/system/riscv/virt.rst
+++ b/docs/system/riscv/virt.rst
@@ -84,6 +84,19 @@ none``, as in
Firmware images used for pflash must be exactly 32 MiB in size.
+riscv-iommu support
+-------------------
+
+The board has support for the riscv-iommu-pci device by using the following
+command line:
+
+.. code-block:: bash
+
+ $ qemu-system-riscv64 -M virt -device riscv-iommu-pci (...)
+
+Refer to :ref:`riscv-iommu` for more information on how the RISC-V IOMMU support
+works.
+
Machine-specific options
------------------------
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 26/47] hw/riscv: Respect firmware ELF entry point
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (24 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 25/47] docs/specs: add riscv-iommu Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 27/47] target: riscv: Add Svvptc extension support Alistair Francis
` (21 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Samuel Holland, Alistair Francis
From: Samuel Holland <samuel.holland@sifive.com>
When riscv_load_firmware() loads an ELF, the ELF segment addresses are
used, not the passed-in firmware_load_addr. The machine models assume
the firmware entry point is what they provided for firmware_load_addr,
and use that address to generate the boot ROM, so if the ELF is linked
at any other address, the boot ROM will jump to empty memory.
Pass back the ELF entry point to use when generating the boot ROM, so
the boot ROM can jump to firmware loaded anywhere in RAM. For example,
on the virt machine, this allows using an OpenSBI fw_dynamic.elf built
with FW_TEXT_START values other than 0x80000000.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240817002651.3209701-1-samuel.holland@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
include/hw/riscv/boot.h | 4 ++--
hw/riscv/boot.c | 11 ++++++-----
hw/riscv/microchip_pfsoc.c | 2 +-
hw/riscv/opentitan.c | 3 ++-
hw/riscv/shakti_c.c | 13 ++++++-------
hw/riscv/sifive_u.c | 4 ++--
hw/riscv/spike.c | 5 +++--
hw/riscv/virt.c | 4 ++--
8 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
index a2e4ae9cb0..18bfe9f7bf 100644
--- a/include/hw/riscv/boot.h
+++ b/include/hw/riscv/boot.h
@@ -35,13 +35,13 @@ target_ulong riscv_calc_kernel_start_addr(RISCVHartArrayState *harts,
target_ulong firmware_end_addr);
target_ulong riscv_find_and_load_firmware(MachineState *machine,
const char *default_machine_firmware,
- hwaddr firmware_load_addr,
+ hwaddr *firmware_load_addr,
symbol_fn_t sym_cb);
const char *riscv_default_firmware_name(RISCVHartArrayState *harts);
char *riscv_find_firmware(const char *firmware_filename,
const char *default_machine_firmware);
target_ulong riscv_load_firmware(const char *firmware_filename,
- hwaddr firmware_load_addr,
+ hwaddr *firmware_load_addr,
symbol_fn_t sym_cb);
target_ulong riscv_load_kernel(MachineState *machine,
RISCVHartArrayState *harts,
diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index 47281ca853..9115ecd91f 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -128,11 +128,11 @@ char *riscv_find_firmware(const char *firmware_filename,
target_ulong riscv_find_and_load_firmware(MachineState *machine,
const char *default_machine_firmware,
- hwaddr firmware_load_addr,
+ hwaddr *firmware_load_addr,
symbol_fn_t sym_cb)
{
char *firmware_filename;
- target_ulong firmware_end_addr = firmware_load_addr;
+ target_ulong firmware_end_addr = *firmware_load_addr;
firmware_filename = riscv_find_firmware(machine->firmware,
default_machine_firmware);
@@ -148,7 +148,7 @@ target_ulong riscv_find_and_load_firmware(MachineState *machine,
}
target_ulong riscv_load_firmware(const char *firmware_filename,
- hwaddr firmware_load_addr,
+ hwaddr *firmware_load_addr,
symbol_fn_t sym_cb)
{
uint64_t firmware_entry, firmware_end;
@@ -159,15 +159,16 @@ target_ulong riscv_load_firmware(const char *firmware_filename,
if (load_elf_ram_sym(firmware_filename, NULL, NULL, NULL,
&firmware_entry, NULL, &firmware_end, NULL,
0, EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
+ *firmware_load_addr = firmware_entry;
return firmware_end;
}
firmware_size = load_image_targphys_as(firmware_filename,
- firmware_load_addr,
+ *firmware_load_addr,
current_machine->ram_size, NULL);
if (firmware_size > 0) {
- return firmware_load_addr + firmware_size;
+ return *firmware_load_addr + firmware_size;
}
error_report("could not load firmware '%s'", firmware_filename);
diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
index 7725dfbde5..f9a3b43d2e 100644
--- a/hw/riscv/microchip_pfsoc.c
+++ b/hw/riscv/microchip_pfsoc.c
@@ -613,7 +613,7 @@ static void microchip_icicle_kit_machine_init(MachineState *machine)
/* Load the firmware */
firmware_end_addr = riscv_find_and_load_firmware(machine, firmware_name,
- firmware_load_addr, NULL);
+ &firmware_load_addr, NULL);
if (kernel_as_payload) {
kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index 436503f1ba..e2830e9dc2 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -98,7 +98,8 @@ static void opentitan_machine_init(MachineState *machine)
memmap[IBEX_DEV_RAM].base, machine->ram);
if (machine->firmware) {
- riscv_load_firmware(machine->firmware, memmap[IBEX_DEV_RAM].base, NULL);
+ hwaddr firmware_load_addr = memmap[IBEX_DEV_RAM].base;
+ riscv_load_firmware(machine->firmware, &firmware_load_addr, NULL);
}
if (machine->kernel_filename) {
diff --git a/hw/riscv/shakti_c.c b/hw/riscv/shakti_c.c
index 3888034c2b..2dccc1eff2 100644
--- a/hw/riscv/shakti_c.c
+++ b/hw/riscv/shakti_c.c
@@ -45,6 +45,7 @@ static void shakti_c_machine_state_init(MachineState *mstate)
{
ShaktiCMachineState *sms = RISCV_SHAKTI_MACHINE(mstate);
MemoryRegion *system_memory = get_system_memory();
+ hwaddr firmware_load_addr = shakti_c_memmap[SHAKTI_C_RAM].base;
/* Initialize SoC */
object_initialize_child(OBJECT(mstate), "soc", &sms->soc,
@@ -56,16 +57,14 @@ static void shakti_c_machine_state_init(MachineState *mstate)
shakti_c_memmap[SHAKTI_C_RAM].base,
mstate->ram);
+ if (mstate->firmware) {
+ riscv_load_firmware(mstate->firmware, &firmware_load_addr, NULL);
+ }
+
/* ROM reset vector */
- riscv_setup_rom_reset_vec(mstate, &sms->soc.cpus,
- shakti_c_memmap[SHAKTI_C_RAM].base,
+ riscv_setup_rom_reset_vec(mstate, &sms->soc.cpus, firmware_load_addr,
shakti_c_memmap[SHAKTI_C_ROM].base,
shakti_c_memmap[SHAKTI_C_ROM].size, 0, 0);
- if (mstate->firmware) {
- riscv_load_firmware(mstate->firmware,
- shakti_c_memmap[SHAKTI_C_RAM].base,
- NULL);
- }
}
static void shakti_c_machine_instance_init(Object *obj)
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index af5f923f54..35a689309d 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -515,7 +515,7 @@ static void sifive_u_machine_init(MachineState *machine)
SiFiveUState *s = RISCV_U_MACHINE(machine);
MemoryRegion *system_memory = get_system_memory();
MemoryRegion *flash0 = g_new(MemoryRegion, 1);
- target_ulong start_addr = memmap[SIFIVE_U_DEV_DRAM].base;
+ hwaddr start_addr = memmap[SIFIVE_U_DEV_DRAM].base;
target_ulong firmware_end_addr, kernel_start_addr;
const char *firmware_name;
uint32_t start_addr_hi32 = 0x00000000;
@@ -589,7 +589,7 @@ static void sifive_u_machine_init(MachineState *machine)
firmware_name = riscv_default_firmware_name(&s->soc.u_cpus);
firmware_end_addr = riscv_find_and_load_firmware(machine, firmware_name,
- start_addr, NULL);
+ &start_addr, NULL);
if (machine->kernel_filename) {
kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 64074395bc..fceb91d946 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -198,6 +198,7 @@ static void spike_board_init(MachineState *machine)
MemoryRegion *system_memory = get_system_memory();
MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
target_ulong firmware_end_addr = memmap[SPIKE_DRAM].base;
+ hwaddr firmware_load_addr = memmap[SPIKE_DRAM].base;
target_ulong kernel_start_addr;
char *firmware_name;
uint32_t fdt_load_addr;
@@ -290,7 +291,7 @@ static void spike_board_init(MachineState *machine)
/* Load firmware */
if (firmware_name) {
firmware_end_addr = riscv_load_firmware(firmware_name,
- memmap[SPIKE_DRAM].base,
+ &firmware_load_addr,
htif_symbol_callback);
g_free(firmware_name);
}
@@ -320,7 +321,7 @@ static void spike_board_init(MachineState *machine)
riscv_load_fdt(fdt_load_addr, machine->fdt);
/* load the reset vector */
- riscv_setup_rom_reset_vec(machine, &s->soc[0], memmap[SPIKE_DRAM].base,
+ riscv_setup_rom_reset_vec(machine, &s->soc[0], firmware_load_addr,
memmap[SPIKE_MROM].base,
memmap[SPIKE_MROM].size, kernel_entry,
fdt_load_addr);
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 63d553ac98..1d32b4b0f3 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1360,7 +1360,7 @@ static void virt_machine_done(Notifier *notifier, void *data)
machine_done);
const MemMapEntry *memmap = virt_memmap;
MachineState *machine = MACHINE(s);
- target_ulong start_addr = memmap[VIRT_DRAM].base;
+ hwaddr start_addr = memmap[VIRT_DRAM].base;
target_ulong firmware_end_addr, kernel_start_addr;
const char *firmware_name = riscv_default_firmware_name(&s->soc[0]);
uint64_t fdt_load_addr;
@@ -1392,7 +1392,7 @@ static void virt_machine_done(Notifier *notifier, void *data)
}
firmware_end_addr = riscv_find_and_load_firmware(machine, firmware_name,
- start_addr, NULL);
+ &start_addr, NULL);
pflash_blk0 = pflash_cfi01_get_blk(s->flash[0]);
if (pflash_blk0) {
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 27/47] target: riscv: Add Svvptc extension support
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (25 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 26/47] hw/riscv: Respect firmware ELF entry point Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 28/47] target/riscv32: Fix masking of physical address Alistair Francis
` (20 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Alexandre Ghiti, Alistair Francis
From: Alexandre Ghiti <alexghiti@rivosinc.com>
The Svvptc extension describes a uarch that does not cache invalid TLB
entries: that's the case for qemu so there is nothing particular to
implement other than the introduction of this extension.
Since qemu already exposes Svvptc behaviour, let's enable it by default
since it allows to drastically reduce the number of sfence.vma emitted
by S-mode.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240828083651.203861-1-alexghiti@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu_cfg.h | 1 +
target/riscv/cpu.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 96fe26d4ea..355afedfd3 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -81,6 +81,7 @@ struct RISCVCPUConfig {
bool ext_svinval;
bool ext_svnapot;
bool ext_svpbmt;
+ bool ext_svvptc;
bool ext_zdinx;
bool ext_zaamo;
bool ext_zacas;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 89bc3955ee..658bdb4ae1 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -197,6 +197,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
ISA_EXT_DATA_ENTRY(svnapot, PRIV_VERSION_1_12_0, ext_svnapot),
ISA_EXT_DATA_ENTRY(svpbmt, PRIV_VERSION_1_12_0, ext_svpbmt),
+ ISA_EXT_DATA_ENTRY(svvptc, PRIV_VERSION_1_13_0, ext_svvptc),
ISA_EXT_DATA_ENTRY(xtheadba, PRIV_VERSION_1_11_0, ext_xtheadba),
ISA_EXT_DATA_ENTRY(xtheadbb, PRIV_VERSION_1_11_0, ext_xtheadbb),
ISA_EXT_DATA_ENTRY(xtheadbs, PRIV_VERSION_1_11_0, ext_xtheadbs),
@@ -1494,6 +1495,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
MULTI_EXT_CFG_BOOL("svinval", ext_svinval, false),
MULTI_EXT_CFG_BOOL("svnapot", ext_svnapot, false),
MULTI_EXT_CFG_BOOL("svpbmt", ext_svpbmt, false),
+ MULTI_EXT_CFG_BOOL("svvptc", ext_svvptc, true),
MULTI_EXT_CFG_BOOL("zicntr", ext_zicntr, true),
MULTI_EXT_CFG_BOOL("zihpm", ext_zihpm, true),
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 28/47] target/riscv32: Fix masking of physical address
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (26 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 27/47] target: riscv: Add Svvptc extension support Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 29/47] target/riscv/cpu_helper: Fix linking problem with semihosting disabled Alistair Francis
` (19 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Andrew Jones, Alistair Francis, Richard Henderson
From: Andrew Jones <ajones@ventanamicro.com>
C doesn't extend the sign bit for unsigned types since there isn't a
sign bit to extend. This means a promotion of a u32 to a u64 results
in the upper 32 bits of the u64 being zero. If that result is then
used as a mask on another u64 the upper 32 bits will be cleared. rv32
physical addresses may be up to 34 bits wide, so we don't want to
clear the high bits while page aligning the address. The fix is to
use hwaddr for the mask, which, even on rv32, is 64-bits wide.
Fixes: af3fc195e3c8 ("target/riscv: Change the TLB page size depends on PMP entries.")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240909083241.43836-2-ajones@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu_helper.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 395a1d9140..4b2c72780c 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1323,7 +1323,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
int ret = TRANSLATE_FAIL;
int mode = mmuidx_priv(mmu_idx);
/* default TLB page size */
- target_ulong tlb_size = TARGET_PAGE_SIZE;
+ hwaddr tlb_size = TARGET_PAGE_SIZE;
env->guest_phys_fault_addr = 0;
@@ -1375,7 +1375,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
qemu_log_mask(CPU_LOG_MMU,
"%s PMP address=" HWADDR_FMT_plx " ret %d prot"
- " %d tlb_size " TARGET_FMT_lu "\n",
+ " %d tlb_size %" HWADDR_PRIu "\n",
__func__, pa, ret, prot_pmp, tlb_size);
prot &= prot_pmp;
@@ -1409,7 +1409,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
qemu_log_mask(CPU_LOG_MMU,
"%s PMP address=" HWADDR_FMT_plx " ret %d prot"
- " %d tlb_size " TARGET_FMT_lu "\n",
+ " %d tlb_size %" HWADDR_PRIu "\n",
__func__, pa, ret, prot_pmp, tlb_size);
prot &= prot_pmp;
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 29/47] target/riscv/cpu_helper: Fix linking problem with semihosting disabled
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (27 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 28/47] target/riscv32: Fix masking of physical address Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 30/47] hw/intc: riscv-imsic: Fix interrupt state updates Alistair Francis
` (18 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Thomas Huth, Alistair Francis
From: Thomas Huth <thuth@redhat.com>
If QEMU has been configured with "--without-default-devices", the build
is currently failing with:
/usr/bin/ld: libqemu-riscv32-softmmu.a.p/target_riscv_cpu_helper.c.o:
in function `riscv_cpu_do_interrupt':
.../qemu/target/riscv/cpu_helper.c:1678:(.text+0x2214): undefined
reference to `do_common_semihosting'
We always want semihosting to be enabled if TCG is available, so change
the "imply" statements in the Kconfig file to "select", and make sure to
avoid calling into do_common_semihosting() if TCG is not available.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240906094858.718105-1-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
target/riscv/cpu_helper.c | 2 ++
target/riscv/Kconfig | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 4b2c72780c..a935377b4a 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1674,10 +1674,12 @@ void riscv_cpu_do_interrupt(CPUState *cs)
if (!async) {
/* set tval to badaddr for traps with address information */
switch (cause) {
+#ifdef CONFIG_TCG
case RISCV_EXCP_SEMIHOST:
do_common_semihosting(cs);
env->pc += 4;
return;
+#endif
case RISCV_EXCP_LOAD_GUEST_ACCESS_FAULT:
case RISCV_EXCP_STORE_GUEST_AMO_ACCESS_FAULT:
case RISCV_EXCP_LOAD_ADDR_MIS:
diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
index c332616d36..11bc09b414 100644
--- a/target/riscv/Kconfig
+++ b/target/riscv/Kconfig
@@ -1,9 +1,9 @@
config RISCV32
bool
- imply ARM_COMPATIBLE_SEMIHOSTING if TCG
+ select ARM_COMPATIBLE_SEMIHOSTING if TCG
select DEVICE_TREE # needed by boot.c
config RISCV64
bool
- imply ARM_COMPATIBLE_SEMIHOSTING if TCG
+ select ARM_COMPATIBLE_SEMIHOSTING if TCG
select DEVICE_TREE # needed by boot.c
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 30/47] hw/intc: riscv-imsic: Fix interrupt state updates.
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (28 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 29/47] target/riscv/cpu_helper: Fix linking problem with semihosting disabled Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 31/47] bsd-user: Implement RISC-V CPU initialization and main loop Alistair Francis
` (17 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel; +Cc: alistair23, Tomasz Jeznach, Alistair Francis
From: Tomasz Jeznach <tjeznach@rivosinc.com>
The IMSIC state variable eistate[] is modified by CSR instructions
within a range dedicated to the local CPU and by MMIO writes from any CPU.
Access to eistate from MMIO accessors is protected by the BQL, but
read-modify-write (RMW) sequences from CSRRW do not acquire the BQL,
making the RMW sequence vulnerable to a race condition with MMIO access
from a remote CPU.
This race can manifest as missing IPI or MSI in multi-CPU systems, eg:
[ 43.008092] watchdog: BUG: soft lockup - CPU#2 stuck for 27s! [kworker/u19:1:52]
[ 43.011723] CPU: 2 UID: 0 PID: 52 Comm: kworker/u19:1 Not tainted 6.11.0-rc6
[ 43.013070] Workqueue: events_unbound deferred_probe_work_func
[ 43.018776] [<ffffffff800b4a86>] smp_call_function_many_cond+0x190/0x5c2
[ 43.019205] [<ffffffff800b4f28>] on_each_cpu_cond_mask+0x20/0x32
[ 43.019447] [<ffffffff8001069a>] __flush_tlb_range+0xf2/0x190
[ 43.019683] [<ffffffff80010914>] flush_tlb_kernel_range+0x20/0x28
The interrupt line raise/lower sequence was changed to prevent a race
between the evaluation of the eistate and the execution of the qemu_irq
raise/lower, ensuring that the interrupt line is not incorrectly
deactivated based on a stale topei check result. To avoid holding BQL
all modifications of eistate are converted to atomic operations.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <a7604e4d61068ca4d384ae2a1377e1521d4d0235.1725651699.git.tjeznach@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
hw/intc/riscv_imsic.c | 50 +++++++++++++++++++++++++++----------------
1 file changed, 32 insertions(+), 18 deletions(-)
diff --git a/hw/intc/riscv_imsic.c b/hw/intc/riscv_imsic.c
index b90f0d731d..9ef65d4012 100644
--- a/hw/intc/riscv_imsic.c
+++ b/hw/intc/riscv_imsic.c
@@ -55,7 +55,7 @@ static uint32_t riscv_imsic_topei(RISCVIMSICState *imsic, uint32_t page)
(imsic->eithreshold[page] <= imsic->num_irqs)) ?
imsic->eithreshold[page] : imsic->num_irqs;
for (i = 1; i < max_irq; i++) {
- if ((imsic->eistate[base + i] & IMSIC_EISTATE_ENPEND) ==
+ if ((qatomic_read(&imsic->eistate[base + i]) & IMSIC_EISTATE_ENPEND) ==
IMSIC_EISTATE_ENPEND) {
return (i << IMSIC_TOPEI_IID_SHIFT) | i;
}
@@ -66,10 +66,24 @@ static uint32_t riscv_imsic_topei(RISCVIMSICState *imsic, uint32_t page)
static void riscv_imsic_update(RISCVIMSICState *imsic, uint32_t page)
{
+ uint32_t base = page * imsic->num_irqs;
+
+ /*
+ * Lower the interrupt line if necessary, then evaluate the current
+ * IMSIC state.
+ * This sequence ensures that any race between evaluating the eistate and
+ * updating the interrupt line will not result in an incorrectly
+ * deactivated connected CPU IRQ line.
+ * If multiple interrupts are pending, this sequence functions identically
+ * to qemu_irq_pulse.
+ */
+
+ if (qatomic_fetch_and(&imsic->eistate[base], ~IMSIC_EISTATE_ENPEND)) {
+ qemu_irq_lower(imsic->external_irqs[page]);
+ }
if (imsic->eidelivery[page] && riscv_imsic_topei(imsic, page)) {
qemu_irq_raise(imsic->external_irqs[page]);
- } else {
- qemu_irq_lower(imsic->external_irqs[page]);
+ qatomic_or(&imsic->eistate[base], IMSIC_EISTATE_ENPEND);
}
}
@@ -125,12 +139,11 @@ static int riscv_imsic_topei_rmw(RISCVIMSICState *imsic, uint32_t page,
topei >>= IMSIC_TOPEI_IID_SHIFT;
base = page * imsic->num_irqs;
if (topei) {
- imsic->eistate[base + topei] &= ~IMSIC_EISTATE_PENDING;
+ qatomic_and(&imsic->eistate[base + topei], ~IMSIC_EISTATE_PENDING);
}
-
- riscv_imsic_update(imsic, page);
}
+ riscv_imsic_update(imsic, page);
return 0;
}
@@ -139,7 +152,7 @@ static int riscv_imsic_eix_rmw(RISCVIMSICState *imsic,
uint32_t num, bool pend, target_ulong *val,
target_ulong new_val, target_ulong wr_mask)
{
- uint32_t i, base;
+ uint32_t i, base, prev;
target_ulong mask;
uint32_t state = (pend) ? IMSIC_EISTATE_PENDING : IMSIC_EISTATE_ENABLED;
@@ -157,10 +170,6 @@ static int riscv_imsic_eix_rmw(RISCVIMSICState *imsic,
if (val) {
*val = 0;
- for (i = 0; i < xlen; i++) {
- mask = (target_ulong)1 << i;
- *val |= (imsic->eistate[base + i] & state) ? mask : 0;
- }
}
for (i = 0; i < xlen; i++) {
@@ -172,10 +181,15 @@ static int riscv_imsic_eix_rmw(RISCVIMSICState *imsic,
mask = (target_ulong)1 << i;
if (wr_mask & mask) {
if (new_val & mask) {
- imsic->eistate[base + i] |= state;
+ prev = qatomic_fetch_or(&imsic->eistate[base + i], state);
} else {
- imsic->eistate[base + i] &= ~state;
+ prev = qatomic_fetch_and(&imsic->eistate[base + i], ~state);
}
+ } else {
+ prev = qatomic_read(&imsic->eistate[base + i]);
+ }
+ if (val && (prev & state)) {
+ *val |= mask;
}
}
@@ -302,14 +316,14 @@ static void riscv_imsic_write(void *opaque, hwaddr addr, uint64_t value,
page = addr >> IMSIC_MMIO_PAGE_SHIFT;
if ((addr & (IMSIC_MMIO_PAGE_SZ - 1)) == IMSIC_MMIO_PAGE_LE) {
if (value && (value < imsic->num_irqs)) {
- imsic->eistate[(page * imsic->num_irqs) + value] |=
- IMSIC_EISTATE_PENDING;
+ qatomic_or(&imsic->eistate[(page * imsic->num_irqs) + value],
+ IMSIC_EISTATE_PENDING);
+
+ /* Update CPU external interrupt status */
+ riscv_imsic_update(imsic, page);
}
}
- /* Update CPU external interrupt status */
- riscv_imsic_update(imsic, page);
-
return;
err:
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 31/47] bsd-user: Implement RISC-V CPU initialization and main loop
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (29 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 30/47] hw/intc: riscv-imsic: Fix interrupt state updates Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 32/47] bsd-user: Add RISC-V CPU execution loop and syscall handling Alistair Francis
` (16 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Jessica Clarke,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added the initial implementation for RISC-V CPU initialization and main
loop. This includes setting up the general-purpose registers and
program counter based on the provided target architecture definitions.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-2-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_cpu.h | 40 ++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_cpu.h
diff --git a/bsd-user/riscv/target_arch_cpu.h b/bsd-user/riscv/target_arch_cpu.h
new file mode 100644
index 0000000000..f8d85e01ad
--- /dev/null
+++ b/bsd-user/riscv/target_arch_cpu.h
@@ -0,0 +1,40 @@
+/*
+ * RISC-V CPU init and loop
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_CPU_H
+#define TARGET_ARCH_CPU_H
+
+#include "target_arch.h"
+#include "signal-common.h"
+
+#define TARGET_DEFAULT_CPU_MODEL "max"
+
+static inline void target_cpu_init(CPURISCVState *env,
+ struct target_pt_regs *regs)
+{
+ int i;
+
+ for (i = 1; i < 32; i++) {
+ env->gpr[i] = regs->regs[i];
+ }
+
+ env->pc = regs->sepc;
+}
+
+#endif /* TARGET_ARCH_CPU_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 32/47] bsd-user: Add RISC-V CPU execution loop and syscall handling
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (30 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 31/47] bsd-user: Implement RISC-V CPU initialization and main loop Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 33/47] bsd-user: Implement RISC-V CPU register cloning and reset functions Alistair Francis
` (15 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Jessica Clarke, Kyle Evans,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Implemented the RISC-V CPU execution loop, including handling various
exceptions and system calls. The loop continuously executes CPU
instructions,processes exceptions, and handles system calls by invoking
FreeBSD syscall handlers.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-3-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_cpu.h | 94 ++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
diff --git a/bsd-user/riscv/target_arch_cpu.h b/bsd-user/riscv/target_arch_cpu.h
index f8d85e01ad..9c31d9dc4c 100644
--- a/bsd-user/riscv/target_arch_cpu.h
+++ b/bsd-user/riscv/target_arch_cpu.h
@@ -37,4 +37,98 @@ static inline void target_cpu_init(CPURISCVState *env,
env->pc = regs->sepc;
}
+static inline void target_cpu_loop(CPURISCVState *env)
+{
+ CPUState *cs = env_cpu(env);
+ int trapnr;
+ abi_long ret;
+ unsigned int syscall_num;
+ int32_t signo, code;
+
+ for (;;) {
+ cpu_exec_start(cs);
+ trapnr = cpu_exec(cs);
+ cpu_exec_end(cs);
+ process_queued_cpu_work(cs);
+
+ signo = 0;
+
+ switch (trapnr) {
+ case EXCP_INTERRUPT:
+ /* just indicate that signals should be handled asap */
+ break;
+ case EXCP_ATOMIC:
+ cpu_exec_step_atomic(cs);
+ break;
+ case RISCV_EXCP_U_ECALL:
+ syscall_num = env->gpr[xT0];
+ env->pc += TARGET_INSN_SIZE;
+ /* Compare to cpu_fetch_syscall_args() in riscv/riscv/trap.c */
+ if (TARGET_FREEBSD_NR___syscall == syscall_num ||
+ TARGET_FREEBSD_NR_syscall == syscall_num) {
+ ret = do_freebsd_syscall(env,
+ env->gpr[xA0],
+ env->gpr[xA1],
+ env->gpr[xA2],
+ env->gpr[xA3],
+ env->gpr[xA4],
+ env->gpr[xA5],
+ env->gpr[xA6],
+ env->gpr[xA7],
+ 0);
+ } else {
+ ret = do_freebsd_syscall(env,
+ syscall_num,
+ env->gpr[xA0],
+ env->gpr[xA1],
+ env->gpr[xA2],
+ env->gpr[xA3],
+ env->gpr[xA4],
+ env->gpr[xA5],
+ env->gpr[xA6],
+ env->gpr[xA7]
+ );
+ }
+
+ /*
+ * Compare to cpu_set_syscall_retval() in
+ * riscv/riscv/vm_machdep.c
+ */
+ if (ret >= 0) {
+ env->gpr[xA0] = ret;
+ env->gpr[xT0] = 0;
+ } else if (ret == -TARGET_ERESTART) {
+ env->pc -= TARGET_INSN_SIZE;
+ } else if (ret != -TARGET_EJUSTRETURN) {
+ env->gpr[xA0] = -ret;
+ env->gpr[xT0] = 1;
+ }
+ break;
+ case RISCV_EXCP_ILLEGAL_INST:
+ signo = TARGET_SIGILL;
+ code = TARGET_ILL_ILLOPC;
+ break;
+ case RISCV_EXCP_BREAKPOINT:
+ signo = TARGET_SIGTRAP;
+ code = TARGET_TRAP_BRKPT;
+ break;
+ case EXCP_DEBUG:
+ signo = TARGET_SIGTRAP;
+ code = TARGET_TRAP_BRKPT;
+ break;
+ default:
+ fprintf(stderr, "qemu: unhandled CPU exception "
+ "0x%x - aborting\n", trapnr);
+ cpu_dump_state(cs, stderr, 0);
+ abort();
+ }
+
+ if (signo) {
+ force_sig_fault(signo, code, env->pc);
+ }
+
+ process_pending_signals(env);
+ }
+}
+
#endif /* TARGET_ARCH_CPU_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 33/47] bsd-user: Implement RISC-V CPU register cloning and reset functions
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (31 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 32/47] bsd-user: Add RISC-V CPU execution loop and syscall handling Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 34/47] bsd-user: Implement RISC-V TLS register setup Alistair Francis
` (14 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added functions for cloning CPU registers and resetting the CPU state
for RISC-V architecture.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-4-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_cpu.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/bsd-user/riscv/target_arch_cpu.h b/bsd-user/riscv/target_arch_cpu.h
index 9c31d9dc4c..a93ea3915a 100644
--- a/bsd-user/riscv/target_arch_cpu.h
+++ b/bsd-user/riscv/target_arch_cpu.h
@@ -131,4 +131,18 @@ static inline void target_cpu_loop(CPURISCVState *env)
}
}
+static inline void target_cpu_clone_regs(CPURISCVState *env, target_ulong newsp)
+{
+ if (newsp) {
+ env->gpr[xSP] = newsp;
+ }
+
+ env->gpr[xA0] = 0;
+ env->gpr[xT0] = 0;
+}
+
+static inline void target_cpu_reset(CPUArchState *env)
+{
+}
+
#endif /* TARGET_ARCH_CPU_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 34/47] bsd-user: Implement RISC-V TLS register setup
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (32 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 33/47] bsd-user: Implement RISC-V CPU register cloning and reset functions Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 35/47] bsd-user: Add RISC-V ELF definitions and hardware capability detection Alistair Francis
` (13 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Included the prototype for the 'target_cpu_set_tls' function in the
'target_arch.h' header file. This function is responsible for setting
the Thread Local Storage (TLS) register for RISC-V architecture.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-5-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch.h | 27 +++++++++++++++++++++++++++
bsd-user/riscv/target_arch_cpu.c | 29 +++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)
create mode 100644 bsd-user/riscv/target_arch.h
create mode 100644 bsd-user/riscv/target_arch_cpu.c
diff --git a/bsd-user/riscv/target_arch.h b/bsd-user/riscv/target_arch.h
new file mode 100644
index 0000000000..26ce07f343
--- /dev/null
+++ b/bsd-user/riscv/target_arch.h
@@ -0,0 +1,27 @@
+/*
+ * RISC-V specific prototypes
+ *
+ * Copyright (c) 2019 Mark Corbin <mark.corbin@embecsom.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_H
+#define TARGET_ARCH_H
+
+#include "qemu.h"
+
+void target_cpu_set_tls(CPURISCVState *env, target_ulong newtls);
+
+#endif /* TARGET_ARCH_H */
diff --git a/bsd-user/riscv/target_arch_cpu.c b/bsd-user/riscv/target_arch_cpu.c
new file mode 100644
index 0000000000..44e25d2ddf
--- /dev/null
+++ b/bsd-user/riscv/target_arch_cpu.c
@@ -0,0 +1,29 @@
+/*
+ * RISC-V CPU related code
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu/osdep.h"
+
+#include "target_arch.h"
+
+#define TP_OFFSET 16
+
+/* Compare with cpu_set_user_tls() in riscv/riscv/vm_machdep.c */
+void target_cpu_set_tls(CPURISCVState *env, target_ulong newtls)
+{
+ env->gpr[xTP] = newtls + TP_OFFSET;
+}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 35/47] bsd-user: Add RISC-V ELF definitions and hardware capability detection
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (33 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 34/47] bsd-user: Implement RISC-V TLS register setup Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 36/47] bsd-user: Define RISC-V register structures and register copying Alistair Francis
` (12 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Kyle Evans,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Introduced RISC-V specific ELF definitions and hardware capability
detection.
Additionally, a function to retrieve hardware capabilities
('get_elf_hwcap') is implemented, which returns the common bits set in
each CPU's ISA strings.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-6-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_elf.h | 42 ++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_elf.h
diff --git a/bsd-user/riscv/target_arch_elf.h b/bsd-user/riscv/target_arch_elf.h
new file mode 100644
index 0000000000..4eb915e61e
--- /dev/null
+++ b/bsd-user/riscv/target_arch_elf.h
@@ -0,0 +1,42 @@
+/*
+ * RISC-V ELF definitions
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_ELF_H
+#define TARGET_ARCH_ELF_H
+
+#define elf_check_arch(x) ((x) == EM_RISCV)
+#define ELF_START_MMAP 0x80000000
+#define ELF_ET_DYN_LOAD_ADDR 0x100000
+#define ELF_CLASS ELFCLASS64
+
+#define ELF_DATA ELFDATA2LSB
+#define ELF_ARCH EM_RISCV
+
+#define ELF_HWCAP get_elf_hwcap()
+static uint32_t get_elf_hwcap(void)
+{
+ RISCVCPU *cpu = RISCV_CPU(thread_cpu);
+
+ return cpu->env.misa_ext_mask;
+}
+
+#define USE_ELF_CORE_DUMP
+#define ELF_EXEC_PAGESIZE 4096
+
+#endif /* TARGET_ARCH_ELF_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 36/47] bsd-user: Define RISC-V register structures and register copying
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (34 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 35/47] bsd-user: Add RISC-V ELF definitions and hardware capability detection Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 37/47] bsd-user: Add RISC-V signal trampoline setup function Alistair Francis
` (11 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added definitions for RISC-V register structures, including
general-purpose registers and floating-point registers, in
'target_arch_reg.h'. Implemented the 'target_copy_regs' function to
copy register values from the CPU state to the target register
structure, ensuring proper endianness handling using 'tswapreg'.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-7-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_reg.h | 88 ++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_reg.h
diff --git a/bsd-user/riscv/target_arch_reg.h b/bsd-user/riscv/target_arch_reg.h
new file mode 100644
index 0000000000..12b1c96b61
--- /dev/null
+++ b/bsd-user/riscv/target_arch_reg.h
@@ -0,0 +1,88 @@
+/*
+ * RISC-V register structures
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_REG_H
+#define TARGET_ARCH_REG_H
+
+/* Compare with riscv/include/reg.h */
+typedef struct target_reg {
+ uint64_t ra; /* return address */
+ uint64_t sp; /* stack pointer */
+ uint64_t gp; /* global pointer */
+ uint64_t tp; /* thread pointer */
+ uint64_t t[7]; /* temporaries */
+ uint64_t s[12]; /* saved registers */
+ uint64_t a[8]; /* function arguments */
+ uint64_t sepc; /* exception program counter */
+ uint64_t sstatus; /* status register */
+} target_reg_t;
+
+typedef struct target_fpreg {
+ uint64_t fp_x[32][2]; /* Floating point registers */
+ uint64_t fp_fcsr; /* Floating point control reg */
+} target_fpreg_t;
+
+#define tswapreg(ptr) tswapal(ptr)
+
+/* Compare with struct trapframe in riscv/include/frame.h */
+static inline void target_copy_regs(target_reg_t *regs,
+ const CPURISCVState *env)
+{
+
+ regs->ra = tswapreg(env->gpr[1]);
+ regs->sp = tswapreg(env->gpr[2]);
+ regs->gp = tswapreg(env->gpr[3]);
+ regs->tp = tswapreg(env->gpr[4]);
+
+ regs->t[0] = tswapreg(env->gpr[5]);
+ regs->t[1] = tswapreg(env->gpr[6]);
+ regs->t[2] = tswapreg(env->gpr[7]);
+ regs->t[3] = tswapreg(env->gpr[28]);
+ regs->t[4] = tswapreg(env->gpr[29]);
+ regs->t[5] = tswapreg(env->gpr[30]);
+ regs->t[6] = tswapreg(env->gpr[31]);
+
+ regs->s[0] = tswapreg(env->gpr[8]);
+ regs->s[1] = tswapreg(env->gpr[9]);
+ regs->s[2] = tswapreg(env->gpr[18]);
+ regs->s[3] = tswapreg(env->gpr[19]);
+ regs->s[4] = tswapreg(env->gpr[20]);
+ regs->s[5] = tswapreg(env->gpr[21]);
+ regs->s[6] = tswapreg(env->gpr[22]);
+ regs->s[7] = tswapreg(env->gpr[23]);
+ regs->s[8] = tswapreg(env->gpr[24]);
+ regs->s[9] = tswapreg(env->gpr[25]);
+ regs->s[10] = tswapreg(env->gpr[26]);
+ regs->s[11] = tswapreg(env->gpr[27]);
+
+ regs->a[0] = tswapreg(env->gpr[10]);
+ regs->a[1] = tswapreg(env->gpr[11]);
+ regs->a[2] = tswapreg(env->gpr[12]);
+ regs->a[3] = tswapreg(env->gpr[13]);
+ regs->a[4] = tswapreg(env->gpr[14]);
+ regs->a[5] = tswapreg(env->gpr[15]);
+ regs->a[6] = tswapreg(env->gpr[16]);
+ regs->a[7] = tswapreg(env->gpr[17]);
+
+ regs->sepc = tswapreg(env->pc);
+}
+
+#undef tswapreg
+
+#endif /* TARGET_ARCH_REG_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 37/47] bsd-user: Add RISC-V signal trampoline setup function
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (35 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 36/47] bsd-user: Define RISC-V register structures and register copying Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 38/47] bsd-user: Implement RISC-V sysarch system call emulation Alistair Francis
` (10 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Implemented the 'setup_sigtramp' function for setting up the signal
trampoline code in the RISC-V architecture.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-8-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_sigtramp.h | 41 +++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_sigtramp.h
diff --git a/bsd-user/riscv/target_arch_sigtramp.h b/bsd-user/riscv/target_arch_sigtramp.h
new file mode 100644
index 0000000000..dfe5076739
--- /dev/null
+++ b/bsd-user/riscv/target_arch_sigtramp.h
@@ -0,0 +1,41 @@
+/*
+ * RISC-V sigcode
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_SIGTRAMP_H
+#define TARGET_ARCH_SIGTRAMP_H
+
+/* Compare with sigcode() in riscv/riscv/locore.S */
+static inline abi_long setup_sigtramp(abi_ulong offset, unsigned sigf_uc,
+ unsigned sys_sigreturn)
+{
+ uint32_t sys_exit = TARGET_FREEBSD_NR_exit;
+
+ uint32_t sigtramp_code[] = {
+ /*1*/ const_le32(0x00010513), /*mv a0, sp*/
+ /*2*/ const_le32(0x00050513 + (sigf_uc << 20)), /*addi a0,a0,sigf_uc*/
+ /*3*/ const_le32(0x00000293 + (sys_sigreturn << 20)),/*li t0,sys_sigreturn*/
+ /*4*/ const_le32(0x00000073), /*ecall*/
+ /*5*/ const_le32(0x00000293 + (sys_exit << 20)), /*li t0,sys_exit*/
+ /*6*/ const_le32(0x00000073), /*ecall*/
+ /*7*/ const_le32(0xFF1FF06F) /*b -16*/
+ };
+
+ return memcpy_to_target(offset, sigtramp_code, TARGET_SZSIGCODE);
+}
+#endif /* TARGET_ARCH_SIGTRAMP_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 38/47] bsd-user: Implement RISC-V sysarch system call emulation
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (36 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 37/47] bsd-user: Add RISC-V signal trampoline setup function Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 39/47] bsd-user: Add RISC-V thread setup and initialization support Alistair Francis
` (9 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added the 'do_freebsd_arch_sysarch' function to emulate the 'sysarch'
system call for the RISC-V architecture.
Currently, this function returns '-TARGET_EOPNOTSUPP' to indicate that
the operation is not supported.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-9-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_sysarch.h | 41 ++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_sysarch.h
diff --git a/bsd-user/riscv/target_arch_sysarch.h b/bsd-user/riscv/target_arch_sysarch.h
new file mode 100644
index 0000000000..9af42331b4
--- /dev/null
+++ b/bsd-user/riscv/target_arch_sysarch.h
@@ -0,0 +1,41 @@
+/*
+ * RISC-V sysarch() system call emulation
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_SYSARCH_H
+#define TARGET_ARCH_SYSARCH_H
+
+#include "target_syscall.h"
+#include "target_arch.h"
+
+static inline abi_long do_freebsd_arch_sysarch(CPURISCVState *env, int op,
+ abi_ulong parms)
+{
+
+ return -TARGET_EOPNOTSUPP;
+}
+
+static inline void do_freebsd_arch_print_sysarch(
+ const struct syscallname *name, abi_long arg1, abi_long arg2,
+ abi_long arg3, abi_long arg4, abi_long arg5, abi_long arg6)
+{
+
+ gemu_log("UNKNOWN OP: %d, " TARGET_ABI_FMT_lx ")", (int)arg1, arg2);
+}
+
+#endif /* TARGET_ARCH_SYSARCH_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 39/47] bsd-user: Add RISC-V thread setup and initialization support
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (37 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 38/47] bsd-user: Implement RISC-V sysarch system call emulation Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 40/47] bsd-user: Define RISC-V VM parameters and helper functions Alistair Francis
` (8 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Jessica Clarke, Kyle Evans,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Implemented functions for setting up and initializing threads in the
RISC-V architecture.
The 'target_thread_set_upcall' function sets up the stack pointer,
program counter, and function argument for new threads.
The 'target_thread_init' function initializes thread registers based on
the provided image information.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-10-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_thread.h | 47 +++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_thread.h
diff --git a/bsd-user/riscv/target_arch_thread.h b/bsd-user/riscv/target_arch_thread.h
new file mode 100644
index 0000000000..95cd0b6ad7
--- /dev/null
+++ b/bsd-user/riscv/target_arch_thread.h
@@ -0,0 +1,47 @@
+/*
+ * RISC-V thread support
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_THREAD_H
+#define TARGET_ARCH_THREAD_H
+
+/* Compare with cpu_set_upcall() in riscv/riscv/vm_machdep.c */
+static inline void target_thread_set_upcall(CPURISCVState *regs,
+ abi_ulong entry, abi_ulong arg, abi_ulong stack_base,
+ abi_ulong stack_size)
+{
+ abi_ulong sp;
+
+ sp = ROUND_DOWN(stack_base + stack_size, 16);
+
+ regs->gpr[xSP] = sp;
+ regs->pc = entry;
+ regs->gpr[xA0] = arg;
+}
+
+/* Compare with exec_setregs() in riscv/riscv/machdep.c */
+static inline void target_thread_init(struct target_pt_regs *regs,
+ struct image_info *infop)
+{
+ regs->sepc = infop->entry;
+ regs->regs[xRA] = infop->entry;
+ regs->regs[xA0] = infop->start_stack;
+ regs->regs[xSP] = ROUND_DOWN(infop->start_stack, 16);
+}
+
+#endif /* TARGET_ARCH_THREAD_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 40/47] bsd-user: Define RISC-V VM parameters and helper functions
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (38 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 39/47] bsd-user: Add RISC-V thread setup and initialization support Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 41/47] bsd-user: Define RISC-V system call structures and constants Alistair Francis
` (7 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added definitions for RISC-V VM parameters, including maximum and
default sizes for text, data, and stack, as well as address space
limits.
Implemented helper functions for retrieving and setting specific
values in the CPU state, such as stack pointer and return values.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-11-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_vmparam.h | 53 ++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_vmparam.h
diff --git a/bsd-user/riscv/target_arch_vmparam.h b/bsd-user/riscv/target_arch_vmparam.h
new file mode 100644
index 0000000000..0f2486def1
--- /dev/null
+++ b/bsd-user/riscv/target_arch_vmparam.h
@@ -0,0 +1,53 @@
+/*
+ * RISC-V VM parameters definitions
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_VMPARAM_H
+#define TARGET_ARCH_VMPARAM_H
+
+#include "cpu.h"
+
+/* Compare with riscv/include/vmparam.h */
+#define TARGET_MAXTSIZ (1 * GiB) /* max text size */
+#define TARGET_DFLDSIZ (128 * MiB) /* initial data size limit */
+#define TARGET_MAXDSIZ (1 * GiB) /* max data size */
+#define TARGET_DFLSSIZ (128 * MiB) /* initial stack size limit */
+#define TARGET_MAXSSIZ (1 * GiB) /* max stack size */
+#define TARGET_SGROWSIZ (128 * KiB) /* amount to grow stack */
+
+#define TARGET_VM_MINUSER_ADDRESS (0x0000000000000000UL)
+#define TARGET_VM_MAXUSER_ADDRESS (0x0000004000000000UL)
+
+#define TARGET_USRSTACK (TARGET_VM_MAXUSER_ADDRESS - TARGET_PAGE_SIZE)
+
+static inline abi_ulong get_sp_from_cpustate(CPURISCVState *state)
+{
+ return state->gpr[xSP];
+}
+
+static inline void set_second_rval(CPURISCVState *state, abi_ulong retval2)
+{
+ state->gpr[xA1] = retval2;
+}
+
+static inline abi_ulong get_second_rval(CPURISCVState *state)
+{
+ return state->gpr[xA1];
+}
+
+#endif /* TARGET_ARCH_VMPARAM_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 41/47] bsd-user: Define RISC-V system call structures and constants
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (39 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 40/47] bsd-user: Define RISC-V VM parameters and helper functions Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 42/47] bsd-user: Add generic RISC-V64 target definitions Alistair Francis
` (6 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Jessica Clarke,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Introduced definitions for the RISC-V system call interface, including
the 'target_pt_regs' structure that outlines the register storage
layout during a system call.
Added constants for hardware machine identifiers.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-12-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_syscall.h | 38 +++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
create mode 100644 bsd-user/riscv/target_syscall.h
diff --git a/bsd-user/riscv/target_syscall.h b/bsd-user/riscv/target_syscall.h
new file mode 100644
index 0000000000..e7e5231309
--- /dev/null
+++ b/bsd-user/riscv/target_syscall.h
@@ -0,0 +1,38 @@
+/*
+ * RISC-V system call definitions
+ *
+ * Copyright (c) Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef BSD_USER_RISCV_TARGET_SYSCALL_H
+#define BSD_USER_RISCV_TARGET_SYSCALL_H
+
+/*
+ * struct target_pt_regs defines the way the registers are stored on the stack
+ * during a system call.
+ */
+
+struct target_pt_regs {
+ abi_ulong regs[32];
+ abi_ulong sepc;
+};
+
+#define UNAME_MACHINE "riscv64"
+
+#define TARGET_HW_MACHINE "riscv"
+#define TARGET_HW_MACHINE_ARCH UNAME_MACHINE
+
+#endif /* BSD_USER_RISCV_TARGET_SYSCALL_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 42/47] bsd-user: Add generic RISC-V64 target definitions
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (40 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 41/47] bsd-user: Define RISC-V system call structures and constants Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 43/47] bsd-user: Define RISC-V signal handling structures and constants Alistair Francis
` (5 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Warner Losh, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Warner Losh <imp@bsdimp.com>
Added a generic definition for RISC-V64 target-specific details.
Implemented the 'regpairs_aligned' function,which returns 'false'
to indicate that register pairs are not aligned in the RISC-V64 ABI.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-13-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
create mode 100644 bsd-user/riscv/target.h
diff --git a/bsd-user/riscv/target.h b/bsd-user/riscv/target.h
new file mode 100644
index 0000000000..036ddd185e
--- /dev/null
+++ b/bsd-user/riscv/target.h
@@ -0,0 +1,20 @@
+/*
+ * Riscv64 general target stuff that's common to all aarch details
+ *
+ * Copyright (c) 2022 M. Warner Losh <imp@bsdimp.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef TARGET_H
+#define TARGET_H
+
+/*
+ * riscv64 ABI does not 'lump' the registers for 64-bit args.
+ */
+static inline bool regpairs_aligned(void *cpu_env)
+{
+ return false;
+}
+
+#endif /* TARGET_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 43/47] bsd-user: Define RISC-V signal handling structures and constants
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (41 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 42/47] bsd-user: Add generic RISC-V64 target definitions Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 44/47] bsd-user: Implement RISC-V signal trampoline setup functions Alistair Francis
` (4 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Warner Losh,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added definitions for RISC-V signal handling, including structures
and constants for managing signal frames and context
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-14-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/target_arch_signal.h | 75 +++++++++++++++++++++++++++++
1 file changed, 75 insertions(+)
create mode 100644 bsd-user/riscv/target_arch_signal.h
diff --git a/bsd-user/riscv/target_arch_signal.h b/bsd-user/riscv/target_arch_signal.h
new file mode 100644
index 0000000000..1a634b865b
--- /dev/null
+++ b/bsd-user/riscv/target_arch_signal.h
@@ -0,0 +1,75 @@
+/*
+ * RISC-V signal definitions
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef TARGET_ARCH_SIGNAL_H
+#define TARGET_ARCH_SIGNAL_H
+
+#include "cpu.h"
+
+
+#define TARGET_INSN_SIZE 4 /* riscv instruction size */
+
+/* Size of the signal trampoline code placed on the stack. */
+#define TARGET_SZSIGCODE ((abi_ulong)(7 * TARGET_INSN_SIZE))
+
+/* Compare with riscv/include/_limits.h */
+#define TARGET_MINSIGSTKSZ (1024 * 4)
+#define TARGET_SIGSTKSZ (TARGET_MINSIGSTKSZ + 32768)
+
+struct target_gpregs {
+ uint64_t gp_ra;
+ uint64_t gp_sp;
+ uint64_t gp_gp;
+ uint64_t gp_tp;
+ uint64_t gp_t[7];
+ uint64_t gp_s[12];
+ uint64_t gp_a[8];
+ uint64_t gp_sepc;
+ uint64_t gp_sstatus;
+};
+
+struct target_fpregs {
+ uint64_t fp_x[32][2];
+ uint64_t fp_fcsr;
+ uint32_t fp_flags;
+ uint32_t pad;
+};
+
+typedef struct target_mcontext {
+ struct target_gpregs mc_gpregs;
+ struct target_fpregs mc_fpregs;
+ uint32_t mc_flags;
+#define TARGET_MC_FP_VALID 0x01
+ uint32_t mc_pad;
+ uint64_t mc_spare[8];
+} target_mcontext_t;
+
+#define TARGET_MCONTEXT_SIZE 864
+#define TARGET_UCONTEXT_SIZE 936
+
+#include "target_os_ucontext.h"
+
+struct target_sigframe {
+ target_ucontext_t sf_uc; /* = *sf_uncontext */
+ target_siginfo_t sf_si; /* = *sf_siginfo (SA_SIGINFO case)*/
+};
+
+#define TARGET_SIGSTACK_ALIGN 16
+
+#endif /* TARGET_ARCH_SIGNAL_H */
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 44/47] bsd-user: Implement RISC-V signal trampoline setup functions
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (42 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 43/47] bsd-user: Define RISC-V signal handling structures and constants Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 45/47] bsd-user: Implement 'get_mcontext' for RISC-V Alistair Francis
` (3 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Warner Losh,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added functions for setting up the RISC-V signal trampoline and signal
frame:
'set_sigtramp_args()': Configures the RISC-V CPU state with arguments
for the signal handler. It sets up the registers with the signal
number,pointers to the signal info and user context, the signal handler
address, and the signal frame pointer.
'setup_sigframe_arch()': Initializes the signal frame with the current
machine context.This function copies the context from the CPU state to
the signal frame, preparing it for the signal handler.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-15-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/signal.c | 63 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)
create mode 100644 bsd-user/riscv/signal.c
diff --git a/bsd-user/riscv/signal.c b/bsd-user/riscv/signal.c
new file mode 100644
index 0000000000..2597fec2fd
--- /dev/null
+++ b/bsd-user/riscv/signal.c
@@ -0,0 +1,63 @@
+/*
+ * RISC-V signal definitions
+ *
+ * Copyright (c) 2019 Mark Corbin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu/osdep.h"
+
+#include "qemu.h"
+
+/*
+ * Compare with sendsig() in riscv/riscv/exec_machdep.c
+ * Assumes that target stack frame memory is locked.
+ */
+abi_long
+set_sigtramp_args(CPURISCVState *regs, int sig, struct target_sigframe *frame,
+ abi_ulong frame_addr, struct target_sigaction *ka)
+{
+ /*
+ * Arguments to signal handler:
+ * a0 (10) = signal number
+ * a1 (11) = siginfo pointer
+ * a2 (12) = ucontext pointer
+ * pc = signal pointer handler
+ * sp (2) = sigframe pointer
+ * ra (1) = sigtramp at base of user stack
+ */
+
+ regs->gpr[xA0] = sig;
+ regs->gpr[xA1] = frame_addr +
+ offsetof(struct target_sigframe, sf_si);
+ regs->gpr[xA2] = frame_addr +
+ offsetof(struct target_sigframe, sf_uc);
+ regs->pc = ka->_sa_handler;
+ regs->gpr[xSP] = frame_addr;
+ regs->gpr[xRA] = TARGET_PS_STRINGS - TARGET_SZSIGCODE;
+ return 0;
+}
+
+/*
+ * Compare to riscv/riscv/exec_machdep.c sendsig()
+ * Assumes that the memory is locked if frame points to user memory.
+ */
+abi_long setup_sigframe_arch(CPURISCVState *env, abi_ulong frame_addr,
+ struct target_sigframe *frame, int flags)
+{
+ target_mcontext_t *mcp = &frame->sf_uc.uc_mcontext;
+
+ get_mcontext(env, mcp, flags);
+ return 0;
+}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 45/47] bsd-user: Implement 'get_mcontext' for RISC-V
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (43 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 44/47] bsd-user: Implement RISC-V signal trampoline setup functions Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 46/47] bsd-user: Implement set_mcontext and get_ucontext_sigreturn for RISCV Alistair Francis
` (2 subsequent siblings)
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Ajeet Singh, Warner Losh,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added the 'get_mcontext' function to extract and populate
the RISC-V machine context from the CPU state.
This function is used to gather the current state of the
general-purpose registers and store it in a 'target_mcontext_'
structure.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-16-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/signal.c | 53 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/bsd-user/riscv/signal.c b/bsd-user/riscv/signal.c
index 2597fec2fd..072ad821d2 100644
--- a/bsd-user/riscv/signal.c
+++ b/bsd-user/riscv/signal.c
@@ -61,3 +61,56 @@ abi_long setup_sigframe_arch(CPURISCVState *env, abi_ulong frame_addr,
get_mcontext(env, mcp, flags);
return 0;
}
+
+/*
+ * Compare with get_mcontext() in riscv/riscv/machdep.c
+ * Assumes that the memory is locked if mcp points to user memory.
+ */
+abi_long get_mcontext(CPURISCVState *regs, target_mcontext_t *mcp,
+ int flags)
+{
+
+ mcp->mc_gpregs.gp_t[0] = tswap64(regs->gpr[5]);
+ mcp->mc_gpregs.gp_t[1] = tswap64(regs->gpr[6]);
+ mcp->mc_gpregs.gp_t[2] = tswap64(regs->gpr[7]);
+ mcp->mc_gpregs.gp_t[3] = tswap64(regs->gpr[28]);
+ mcp->mc_gpregs.gp_t[4] = tswap64(regs->gpr[29]);
+ mcp->mc_gpregs.gp_t[5] = tswap64(regs->gpr[30]);
+ mcp->mc_gpregs.gp_t[6] = tswap64(regs->gpr[31]);
+
+ mcp->mc_gpregs.gp_s[0] = tswap64(regs->gpr[8]);
+ mcp->mc_gpregs.gp_s[1] = tswap64(regs->gpr[9]);
+ mcp->mc_gpregs.gp_s[2] = tswap64(regs->gpr[18]);
+ mcp->mc_gpregs.gp_s[3] = tswap64(regs->gpr[19]);
+ mcp->mc_gpregs.gp_s[4] = tswap64(regs->gpr[20]);
+ mcp->mc_gpregs.gp_s[5] = tswap64(regs->gpr[21]);
+ mcp->mc_gpregs.gp_s[6] = tswap64(regs->gpr[22]);
+ mcp->mc_gpregs.gp_s[7] = tswap64(regs->gpr[23]);
+ mcp->mc_gpregs.gp_s[8] = tswap64(regs->gpr[24]);
+ mcp->mc_gpregs.gp_s[9] = tswap64(regs->gpr[25]);
+ mcp->mc_gpregs.gp_s[10] = tswap64(regs->gpr[26]);
+ mcp->mc_gpregs.gp_s[11] = tswap64(regs->gpr[27]);
+
+ mcp->mc_gpregs.gp_a[0] = tswap64(regs->gpr[10]);
+ mcp->mc_gpregs.gp_a[1] = tswap64(regs->gpr[11]);
+ mcp->mc_gpregs.gp_a[2] = tswap64(regs->gpr[12]);
+ mcp->mc_gpregs.gp_a[3] = tswap64(regs->gpr[13]);
+ mcp->mc_gpregs.gp_a[4] = tswap64(regs->gpr[14]);
+ mcp->mc_gpregs.gp_a[5] = tswap64(regs->gpr[15]);
+ mcp->mc_gpregs.gp_a[6] = tswap64(regs->gpr[16]);
+ mcp->mc_gpregs.gp_a[7] = tswap64(regs->gpr[17]);
+
+ if (flags & TARGET_MC_GET_CLEAR_RET) {
+ mcp->mc_gpregs.gp_a[0] = 0; /* a0 */
+ mcp->mc_gpregs.gp_a[1] = 0; /* a1 */
+ mcp->mc_gpregs.gp_t[0] = 0; /* clear syscall error */
+ }
+
+ mcp->mc_gpregs.gp_ra = tswap64(regs->gpr[1]);
+ mcp->mc_gpregs.gp_sp = tswap64(regs->gpr[2]);
+ mcp->mc_gpregs.gp_gp = tswap64(regs->gpr[3]);
+ mcp->mc_gpregs.gp_tp = tswap64(regs->gpr[4]);
+ mcp->mc_gpregs.gp_sepc = tswap64(regs->pc);
+
+ return 0;
+}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 46/47] bsd-user: Implement set_mcontext and get_ucontext_sigreturn for RISCV
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (44 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 45/47] bsd-user: Implement 'get_mcontext' for RISC-V Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-24 22:17 ` [PULL v2 47/47] bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files Alistair Francis
2024-09-28 11:34 ` [PULL v2 00/47] riscv-to-apply queue Peter Maydell
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Mark Corbin, Warner Losh, Ajeet Singh,
Richard Henderson, Alistair Francis
From: Mark Corbin <mark@dibsco.co.uk>
Added implementations for 'set_mcontext' and 'get_ucontext_sigreturn'
functions for RISC-V architecture,
Both functions ensure that the CPU state and user context are properly
managed.
Signed-off-by: Mark Corbin <mark@dibsco.co.uk>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-17-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
bsd-user/riscv/signal.c | 54 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/bsd-user/riscv/signal.c b/bsd-user/riscv/signal.c
index 072ad821d2..10c940cd49 100644
--- a/bsd-user/riscv/signal.c
+++ b/bsd-user/riscv/signal.c
@@ -114,3 +114,57 @@ abi_long get_mcontext(CPURISCVState *regs, target_mcontext_t *mcp,
return 0;
}
+
+/* Compare with set_mcontext() in riscv/riscv/exec_machdep.c */
+abi_long set_mcontext(CPURISCVState *regs, target_mcontext_t *mcp,
+ int srflag)
+{
+
+ regs->gpr[5] = tswap64(mcp->mc_gpregs.gp_t[0]);
+ regs->gpr[6] = tswap64(mcp->mc_gpregs.gp_t[1]);
+ regs->gpr[7] = tswap64(mcp->mc_gpregs.gp_t[2]);
+ regs->gpr[28] = tswap64(mcp->mc_gpregs.gp_t[3]);
+ regs->gpr[29] = tswap64(mcp->mc_gpregs.gp_t[4]);
+ regs->gpr[30] = tswap64(mcp->mc_gpregs.gp_t[5]);
+ regs->gpr[31] = tswap64(mcp->mc_gpregs.gp_t[6]);
+
+ regs->gpr[8] = tswap64(mcp->mc_gpregs.gp_s[0]);
+ regs->gpr[9] = tswap64(mcp->mc_gpregs.gp_s[1]);
+ regs->gpr[18] = tswap64(mcp->mc_gpregs.gp_s[2]);
+ regs->gpr[19] = tswap64(mcp->mc_gpregs.gp_s[3]);
+ regs->gpr[20] = tswap64(mcp->mc_gpregs.gp_s[4]);
+ regs->gpr[21] = tswap64(mcp->mc_gpregs.gp_s[5]);
+ regs->gpr[22] = tswap64(mcp->mc_gpregs.gp_s[6]);
+ regs->gpr[23] = tswap64(mcp->mc_gpregs.gp_s[7]);
+ regs->gpr[24] = tswap64(mcp->mc_gpregs.gp_s[8]);
+ regs->gpr[25] = tswap64(mcp->mc_gpregs.gp_s[9]);
+ regs->gpr[26] = tswap64(mcp->mc_gpregs.gp_s[10]);
+ regs->gpr[27] = tswap64(mcp->mc_gpregs.gp_s[11]);
+
+ regs->gpr[10] = tswap64(mcp->mc_gpregs.gp_a[0]);
+ regs->gpr[11] = tswap64(mcp->mc_gpregs.gp_a[1]);
+ regs->gpr[12] = tswap64(mcp->mc_gpregs.gp_a[2]);
+ regs->gpr[13] = tswap64(mcp->mc_gpregs.gp_a[3]);
+ regs->gpr[14] = tswap64(mcp->mc_gpregs.gp_a[4]);
+ regs->gpr[15] = tswap64(mcp->mc_gpregs.gp_a[5]);
+ regs->gpr[16] = tswap64(mcp->mc_gpregs.gp_a[6]);
+ regs->gpr[17] = tswap64(mcp->mc_gpregs.gp_a[7]);
+
+
+ regs->gpr[1] = tswap64(mcp->mc_gpregs.gp_ra);
+ regs->gpr[2] = tswap64(mcp->mc_gpregs.gp_sp);
+ regs->gpr[3] = tswap64(mcp->mc_gpregs.gp_gp);
+ regs->gpr[4] = tswap64(mcp->mc_gpregs.gp_tp);
+ regs->pc = tswap64(mcp->mc_gpregs.gp_sepc);
+
+ return 0;
+}
+
+/* Compare with sys_sigreturn() in riscv/riscv/machdep.c */
+abi_long get_ucontext_sigreturn(CPURISCVState *regs,
+ abi_ulong target_sf, abi_ulong *target_uc)
+{
+
+ *target_uc = target_sf;
+ return 0;
+}
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PULL v2 47/47] bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (45 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 46/47] bsd-user: Implement set_mcontext and get_ucontext_sigreturn for RISCV Alistair Francis
@ 2024-09-24 22:17 ` Alistair Francis
2024-09-28 11:34 ` [PULL v2 00/47] riscv-to-apply queue Peter Maydell
47 siblings, 0 replies; 68+ messages in thread
From: Alistair Francis @ 2024-09-24 22:17 UTC (permalink / raw)
To: qemu-devel
Cc: alistair23, Warner Losh, Ajeet Singh, Richard Henderson,
Alistair Francis
From: Warner Losh <imp@bsdimp.com>
Added configuration for RISC-V 64-bit target to the build system.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240916155119.14610-18-itachis@FreeBSD.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
configs/targets/riscv64-bsd-user.mak | 4 ++++
1 file changed, 4 insertions(+)
create mode 100644 configs/targets/riscv64-bsd-user.mak
diff --git a/configs/targets/riscv64-bsd-user.mak b/configs/targets/riscv64-bsd-user.mak
new file mode 100644
index 0000000000..191c2c483f
--- /dev/null
+++ b/configs/targets/riscv64-bsd-user.mak
@@ -0,0 +1,4 @@
+TARGET_ARCH=riscv64
+TARGET_BASE_ARCH=riscv
+TARGET_ABI_DIR=riscv
+TARGET_XML_FILES= gdb-xml/riscv-64bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-64bit-virtual.xml
--
2.46.1
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-24 22:17 [PULL v2 00/47] riscv-to-apply queue Alistair Francis
` (46 preceding siblings ...)
2024-09-24 22:17 ` [PULL v2 47/47] bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files Alistair Francis
@ 2024-09-28 11:34 ` Peter Maydell
2024-09-28 20:23 ` Peter Maydell
2024-09-28 20:40 ` Daniel Henrique Barboza
47 siblings, 2 replies; 68+ messages in thread
From: Peter Maydell @ 2024-09-28 11:34 UTC (permalink / raw)
To: Alistair Francis; +Cc: qemu-devel, Alistair Francis
On Tue, 24 Sept 2024 at 23:18, Alistair Francis <alistair23@gmail.com> wrote:
>
> The following changes since commit 01dc65a3bc262ab1bec8fe89775e9bbfa627becb:
>
> Merge tag 'pull-target-arm-20240919' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-09-19 14:15:15 +0100)
>
> are available in the Git repository at:
>
> https://github.com/alistair23/qemu.git tags/pull-riscv-to-apply-20240925-1
>
> for you to fetch changes up to 6bfa92c5757fe7a9580e1f6e065076777cae650f:
>
> bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files (2024-09-24 12:53:16 +1000)
>
> ----------------------------------------------------------------
> RISC-V PR for 9.2
>
> * Add a property to set vl to ceil(AVL/2)
> * Enable numamem testing for RISC-V
> * Consider MISA bit choice in implied rule
> * Fix the za64rs priv spec requirements
> * Enable Bit Manip for OpenTitan Ibex CPU
> * Fix the group bit setting of AIA with KVM
> * Stop timer with infinite timecmp
> * Add 'fcsr' register to QEMU log as a part of F extension
> * Fix riscv64 build on musl libc
> * Add preliminary textra trigger CSR functions
> * RISC-V IOMMU support
> * RISC-V bsd-user support
> * Respect firmware ELF entry point
> * Add Svvptc extension support
> * Fix masking of rv32 physical address
> * Fix linking problem with semihosting disabled
> * Fix IMSIC interrupt state updates
>
This fails the riscv qos-tests on s390x. My guess is that the new
IOMMU support has endianness bugs and fails on bigendian hosts.
https://gitlab.com/qemu-project/qemu/-/jobs/7942189143
full test log (4MB) at
https://qemu-project.gitlab.io/-/qemu/-/jobs/7942189143/artifacts/build/meson-logs/testlog.txt
The assertion failure is
ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
but there are a lot of virtio errors before that so the
problem probably happened rather earlier.
thanks
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-28 11:34 ` [PULL v2 00/47] riscv-to-apply queue Peter Maydell
@ 2024-09-28 20:23 ` Peter Maydell
2024-09-28 20:40 ` Daniel Henrique Barboza
1 sibling, 0 replies; 68+ messages in thread
From: Peter Maydell @ 2024-09-28 20:23 UTC (permalink / raw)
To: Alistair Francis; +Cc: qemu-devel, Alistair Francis
On Sat, 28 Sept 2024 at 12:34, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 24 Sept 2024 at 23:18, Alistair Francis <alistair23@gmail.com> wrote:
> >
> > The following changes since commit 01dc65a3bc262ab1bec8fe89775e9bbfa627becb:
> >
> > Merge tag 'pull-target-arm-20240919' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-09-19 14:15:15 +0100)
> >
> > are available in the Git repository at:
> >
> > https://github.com/alistair23/qemu.git tags/pull-riscv-to-apply-20240925-1
> >
> > for you to fetch changes up to 6bfa92c5757fe7a9580e1f6e065076777cae650f:
> >
> > bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files (2024-09-24 12:53:16 +1000)
> >
> > ----------------------------------------------------------------
> > RISC-V PR for 9.2
> >
> > * Add a property to set vl to ceil(AVL/2)
> > * Enable numamem testing for RISC-V
> > * Consider MISA bit choice in implied rule
> > * Fix the za64rs priv spec requirements
> > * Enable Bit Manip for OpenTitan Ibex CPU
> > * Fix the group bit setting of AIA with KVM
> > * Stop timer with infinite timecmp
> > * Add 'fcsr' register to QEMU log as a part of F extension
> > * Fix riscv64 build on musl libc
> > * Add preliminary textra trigger CSR functions
> > * RISC-V IOMMU support
> > * RISC-V bsd-user support
> > * Respect firmware ELF entry point
> > * Add Svvptc extension support
> > * Fix masking of rv32 physical address
> > * Fix linking problem with semihosting disabled
> > * Fix IMSIC interrupt state updates
> >
>
> This fails the riscv qos-tests on s390x. My guess is that the new
> IOMMU support has endianness bugs and fails on bigendian hosts.
I have also noticed that the license text comments in the iommu
patches are confused. That must be fixed before we can accept them.
(see my reply to patch 16.)
thanks
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-28 11:34 ` [PULL v2 00/47] riscv-to-apply queue Peter Maydell
2024-09-28 20:23 ` Peter Maydell
@ 2024-09-28 20:40 ` Daniel Henrique Barboza
2024-09-29 15:38 ` Peter Maydell
2024-09-30 12:10 ` Ilya Leoshkevich
1 sibling, 2 replies; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-09-28 20:40 UTC (permalink / raw)
To: Peter Maydell, Alistair Francis; +Cc: qemu-devel, Alistair Francis
On 9/28/24 8:34 AM, Peter Maydell wrote:
> On Tue, 24 Sept 2024 at 23:18, Alistair Francis <alistair23@gmail.com> wrote:
>>
>> The following changes since commit 01dc65a3bc262ab1bec8fe89775e9bbfa627becb:
>>
>> Merge tag 'pull-target-arm-20240919' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-09-19 14:15:15 +0100)
>>
>> are available in the Git repository at:
>>
>> https://github.com/alistair23/qemu.git tags/pull-riscv-to-apply-20240925-1
>>
>> for you to fetch changes up to 6bfa92c5757fe7a9580e1f6e065076777cae650f:
>>
>> bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML Files (2024-09-24 12:53:16 +1000)
>>
>> ----------------------------------------------------------------
>> RISC-V PR for 9.2
>>
>> * Add a property to set vl to ceil(AVL/2)
>> * Enable numamem testing for RISC-V
>> * Consider MISA bit choice in implied rule
>> * Fix the za64rs priv spec requirements
>> * Enable Bit Manip for OpenTitan Ibex CPU
>> * Fix the group bit setting of AIA with KVM
>> * Stop timer with infinite timecmp
>> * Add 'fcsr' register to QEMU log as a part of F extension
>> * Fix riscv64 build on musl libc
>> * Add preliminary textra trigger CSR functions
>> * RISC-V IOMMU support
>> * RISC-V bsd-user support
>> * Respect firmware ELF entry point
>> * Add Svvptc extension support
>> * Fix masking of rv32 physical address
>> * Fix linking problem with semihosting disabled
>> * Fix IMSIC interrupt state updates
>>
>
> This fails the riscv qos-tests on s390x. My guess is that the new
> IOMMU support has endianness bugs and fails on bigendian hosts.
>
> https://gitlab.com/qemu-project/qemu/-/jobs/7942189143
>
> full test log (4MB) at
> https://qemu-project.gitlab.io/-/qemu/-/jobs/7942189143/artifacts/build/meson-logs/testlog.txt
>
> The assertion failure is
> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
The root cause is that the qtests I added aren't considering the endianess of the
host. The RISC-V IOMMU is being implemented as LE only and all regs are being
read/written in memory as LE. The qtest read/write helpers must take the qtest
endianess into account. We make this type of handling in other qtest archs like
ppc64.
I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-all-system
job to verify it, even after setting Cirrus like Thomas taught me a week ago. In
fact I have no 'ubuntu-22-*' jobs available to run.
If there's a way to run these ubuntu s390x tests, let me know. Otherwise I'm inclined
to remove the IOMMU qtests for now until I'm able to verify that they'll work in a
BE host (I'll install a BE VM to verify).
I also saw that you made comments in one of the patches w.r.t license text and other
changes in the base code. I'll get to it.
Thanks,
Daniel
>
> but there are a lot of virtio errors before that so the
> problem probably happened rather earlier.
>
> thanks
> -- PMM
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-28 20:40 ` Daniel Henrique Barboza
@ 2024-09-29 15:38 ` Peter Maydell
2024-09-29 20:53 ` Daniel Henrique Barboza
2024-09-30 12:10 ` Ilya Leoshkevich
1 sibling, 1 reply; 68+ messages in thread
From: Peter Maydell @ 2024-09-29 15:38 UTC (permalink / raw)
To: Daniel Henrique Barboza; +Cc: Alistair Francis, qemu-devel, Alistair Francis
On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 9/28/24 8:34 AM, Peter Maydell wrote:
> > The assertion failure is
> > ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
> > failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>
> The root cause is that the qtests I added aren't considering the endianess of the
> host. The RISC-V IOMMU is being implemented as LE only and all regs are being
> read/written in memory as LE. The qtest read/write helpers must take the qtest
> endianess into account. We make this type of handling in other qtest archs like
> ppc64.
>
> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-all-system
> job to verify it, even after setting Cirrus like Thomas taught me a week ago. In
> fact I have no 'ubuntu-22-*' jobs available to run.
It's on the private s390 VM we have, so it's set up only to
be available on the main CI run (there's not enough capacity
on the machine to do any more than that). If you want to point
me at a gitlab branch I can do a quick "make check" on that
if you like.
thanks
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-29 15:38 ` Peter Maydell
@ 2024-09-29 20:53 ` Daniel Henrique Barboza
2024-09-30 10:48 ` Peter Maydell
2024-09-30 10:58 ` Thomas Huth
0 siblings, 2 replies; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-09-29 20:53 UTC (permalink / raw)
To: Peter Maydell; +Cc: Alistair Francis, qemu-devel, Alistair Francis
On 9/29/24 12:38 PM, Peter Maydell wrote:
> On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
>>
>>
>>
>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>> The assertion failure is
>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>
>> The root cause is that the qtests I added aren't considering the endianess of the
>> host. The RISC-V IOMMU is being implemented as LE only and all regs are being
>> read/written in memory as LE. The qtest read/write helpers must take the qtest
>> endianess into account. We make this type of handling in other qtest archs like
>> ppc64.
>>
>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-all-system
>> job to verify it, even after setting Cirrus like Thomas taught me a week ago. In
>> fact I have no 'ubuntu-22-*' jobs available to run.
>
> It's on the private s390 VM we have, so it's set up only to
> be available on the main CI run (there's not enough capacity
> on the machine to do any more than that). If you want to point
> me at a gitlab branch I can do a quick "make check" on that
> if you like.
I appreciate it. This is the repo:
https://gitlab.com/danielhb/qemu/-/tree/pull_fix
If this is enough to fix the tests, I'll amend it in the new IOMMU version.
If we still failing then I'll need to set this s390 VM.
By the way, if you have any recipe/pointers to set this s390 VM to share,
that would be great.
Thanks,
Daniel
>
> thanks
> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-29 20:53 ` Daniel Henrique Barboza
@ 2024-09-30 10:48 ` Peter Maydell
2024-09-30 12:05 ` Daniel Henrique Barboza
2024-09-30 10:58 ` Thomas Huth
1 sibling, 1 reply; 68+ messages in thread
From: Peter Maydell @ 2024-09-30 10:48 UTC (permalink / raw)
To: Daniel Henrique Barboza; +Cc: Alistair Francis, qemu-devel, Alistair Francis
On Sun, 29 Sept 2024 at 21:53, Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 9/29/24 12:38 PM, Peter Maydell wrote:
> > On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
> > <dbarboza@ventanamicro.com> wrote:
> >>
> >>
> >>
> >> On 9/28/24 8:34 AM, Peter Maydell wrote:
> >>> The assertion failure is
> >>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
> >>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
> >>
> >> The root cause is that the qtests I added aren't considering the endianess of the
> >> host. The RISC-V IOMMU is being implemented as LE only and all regs are being
> >> read/written in memory as LE. The qtest read/write helpers must take the qtest
> >> endianess into account. We make this type of handling in other qtest archs like
> >> ppc64.
> >>
> >> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-all-system
> >> job to verify it, even after setting Cirrus like Thomas taught me a week ago. In
> >> fact I have no 'ubuntu-22-*' jobs available to run.
> >
> > It's on the private s390 VM we have, so it's set up only to
> > be available on the main CI run (there's not enough capacity
> > on the machine to do any more than that). If you want to point
> > me at a gitlab branch I can do a quick "make check" on that
> > if you like.
>
> I appreciate it. This is the repo:
>
> https://gitlab.com/danielhb/qemu/-/tree/pull_fix
This doesn't fix the assertion. This is because the test (now) does:
qpci_memread(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
®, sizeof(reg));
if (riscv_iommu_qtest_big_endian()) {
reg = bswap32(reg);
}
where riscv_iommu_qtest_big_endian() is a wrapper for
qtest_big_endian(). But qtest_big_endian() queries the
endianness of the *guest*, and so for riscv it will
always return false and we will never bswap.
If you need to do swapping inline in a test you can use
reg = le32_to_cpu(reg);
which swaps an LE value read from the guest to the host
CPU's endianness ordering (and similarly with cpu_to_le32
on the write path).
But it turns out that libqos provides already functions
to read/write 32 and 64 bit values from PCI devices:
reg = qpci_io_readl(&r_iommu->dev, r_iommu->reg_bar, reg_offset);
which do the byteswap for you.
Similarly qpci_io_writel() etc. (The functions work for
both IO and MEM PCI BARs.)
> If this is enough to fix the tests, I'll amend it in the new IOMMU version.
> If we still failing then I'll need to set this s390 VM.
>
> By the way, if you have any recipe/pointers to set this s390 VM to share,
> that would be great.
It's a VM provided by IBM under their "Community Cloud"
umbrella: https://community.ibm.com/zsystems/l1cc/
thanks
-- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-30 10:48 ` Peter Maydell
@ 2024-09-30 12:05 ` Daniel Henrique Barboza
0 siblings, 0 replies; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-09-30 12:05 UTC (permalink / raw)
To: Peter Maydell; +Cc: Alistair Francis, qemu-devel, Alistair Francis, Thomas Huth
On 9/30/24 7:48 AM, Peter Maydell wrote:
> On Sun, 29 Sept 2024 at 21:53, Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
>>
>>
>>
>> On 9/29/24 12:38 PM, Peter Maydell wrote:
>>> On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
>>> <dbarboza@ventanamicro.com> wrote:
>>>>
>>>>
>>>>
>>>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>>>> The assertion failure is
>>>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
>>>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>>>
>>>> The root cause is that the qtests I added aren't considering the endianess of the
>>>> host. The RISC-V IOMMU is being implemented as LE only and all regs are being
>>>> read/written in memory as LE. The qtest read/write helpers must take the qtest
>>>> endianess into account. We make this type of handling in other qtest archs like
>>>> ppc64.
>>>>
>>>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-all-system
>>>> job to verify it, even after setting Cirrus like Thomas taught me a week ago. In
>>>> fact I have no 'ubuntu-22-*' jobs available to run.
>>>
>>> It's on the private s390 VM we have, so it's set up only to
>>> be available on the main CI run (there's not enough capacity
>>> on the machine to do any more than that). If you want to point
>>> me at a gitlab branch I can do a quick "make check" on that
>>> if you like.
>>
>> I appreciate it. This is the repo:
>>
>> https://gitlab.com/danielhb/qemu/-/tree/pull_fix
>
> This doesn't fix the assertion. This is because the test (now) does:
>
> qpci_memread(&r_iommu->dev, r_iommu->reg_bar, reg_offset,
> ®, sizeof(reg));
>
> if (riscv_iommu_qtest_big_endian()) {
> reg = bswap32(reg);
> }
>
> where riscv_iommu_qtest_big_endian() is a wrapper for
> qtest_big_endian(). But qtest_big_endian() queries the
> endianness of the *guest*, and so for riscv it will
> always return false and we will never bswap.
Ooops. My bad.
>
> If you need to do swapping inline in a test you can use
> reg = le32_to_cpu(reg);
> which swaps an LE value read from the guest to the host
> CPU's endianness ordering (and similarly with cpu_to_le32
> on the write path).
>
> But it turns out that libqos provides already functions
> to read/write 32 and 64 bit values from PCI devices:
> reg = qpci_io_readl(&r_iommu->dev, r_iommu->reg_bar, reg_offset);
> which do the byteswap for you.
> Similarly qpci_io_writel() etc. (The functions work for
> both IO and MEM PCI BARs.)
I'll convert the tests to use qcpi_io_readl/writel and friends. Hopefully this
will fix it.
>
>> If this is enough to fix the tests, I'll amend it in the new IOMMU version.
>> If we still failing then I'll need to set this s390 VM.
>>
>> By the way, if you have any recipe/pointers to set this s390 VM to share,
>> that would be great.
>
> It's a VM provided by IBM under their "Community Cloud"
> umbrella: https://community.ibm.com/zsystems/l1cc/
I'll see if I can get something going with the IBM community cloud, but I wonder
how hard it is to launch a s390x TCG guest with Ubuntu. The image is freely
available:
https://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04.5-live-server-s390x.iso
But I'm unsure of whether we need some secret sauce/paid key to do the install. Thomas,
do you have more info?
Thanks,
Daniel
>
> thanks
> -- PMM
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-29 20:53 ` Daniel Henrique Barboza
2024-09-30 10:48 ` Peter Maydell
@ 2024-09-30 10:58 ` Thomas Huth
2024-09-30 11:35 ` Thomas Huth
1 sibling, 1 reply; 68+ messages in thread
From: Thomas Huth @ 2024-09-30 10:58 UTC (permalink / raw)
To: Daniel Henrique Barboza, Peter Maydell
Cc: Alistair Francis, qemu-devel, Alistair Francis
On 29/09/2024 22.53, Daniel Henrique Barboza wrote:
>
>
> On 9/29/24 12:38 PM, Peter Maydell wrote:
>> On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
>> <dbarboza@ventanamicro.com> wrote:
>>>
>>>
>>>
>>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>>> The assertion failure is
>>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
>>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>>
>>> The root cause is that the qtests I added aren't considering the
>>> endianess of the
>>> host. The RISC-V IOMMU is being implemented as LE only and all regs are
>>> being
>>> read/written in memory as LE. The qtest read/write helpers must take the
>>> qtest
>>> endianess into account. We make this type of handling in other qtest
>>> archs like
>>> ppc64.
>>>
>>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-
>>> all-system
>>> job to verify it, even after setting Cirrus like Thomas taught me a week
>>> ago. In
>>> fact I have no 'ubuntu-22-*' jobs available to run.
>>
>> It's on the private s390 VM we have, so it's set up only to
>> be available on the main CI run (there's not enough capacity
>> on the machine to do any more than that). If you want to point
>> me at a gitlab branch I can do a quick "make check" on that
>> if you like.
>
> I appreciate it. This is the repo:
>
> https://gitlab.com/danielhb/qemu/-/tree/pull_fix
>
> If this is enough to fix the tests, I'll amend it in the new IOMMU version.
> If we still failing then I'll need to set this s390 VM.
>
> By the way, if you have any recipe/pointers to set this s390 VM to share,
> that would be great.
You can also use Travis-CI for testing QEMU on a s390x host, see e.g. my
runs here:
https://app.travis-ci.com/github/huth/qemu
You just need a github account and set up the Travis-CI for that repository
there - only caveat: in case you run out of open source credits, you've got
to ask the Travis support for granting you new credits (seems like one has
to do it a year from what I experienced in the past).
HTH,
Thomas
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-30 10:58 ` Thomas Huth
@ 2024-09-30 11:35 ` Thomas Huth
2024-10-21 16:17 ` Thomas Huth
0 siblings, 1 reply; 68+ messages in thread
From: Thomas Huth @ 2024-09-30 11:35 UTC (permalink / raw)
To: Daniel Henrique Barboza, Peter Maydell
Cc: Alistair Francis, qemu-devel, Alistair Francis
On 30/09/2024 12.58, Thomas Huth wrote:
> On 29/09/2024 22.53, Daniel Henrique Barboza wrote:
>>
>>
>> On 9/29/24 12:38 PM, Peter Maydell wrote:
>>> On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
>>> <dbarboza@ventanamicro.com> wrote:
>>>>
>>>>
>>>>
>>>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>>>> The assertion failure is
>>>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
>>>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>>>
>>>> The root cause is that the qtests I added aren't considering the
>>>> endianess of the
>>>> host. The RISC-V IOMMU is being implemented as LE only and all regs are
>>>> being
>>>> read/written in memory as LE. The qtest read/write helpers must take the
>>>> qtest
>>>> endianess into account. We make this type of handling in other qtest
>>>> archs like
>>>> ppc64.
>>>>
>>>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-s390x-
>>>> all-system
>>>> job to verify it, even after setting Cirrus like Thomas taught me a week
>>>> ago. In
>>>> fact I have no 'ubuntu-22-*' jobs available to run.
>>>
>>> It's on the private s390 VM we have, so it's set up only to
>>> be available on the main CI run (there's not enough capacity
>>> on the machine to do any more than that). If you want to point
>>> me at a gitlab branch I can do a quick "make check" on that
>>> if you like.
>>
>> I appreciate it. This is the repo:
>>
>> https://gitlab.com/danielhb/qemu/-/tree/pull_fix
>>
>> If this is enough to fix the tests, I'll amend it in the new IOMMU version.
>> If we still failing then I'll need to set this s390 VM.
>>
>> By the way, if you have any recipe/pointers to set this s390 VM to share,
>> that would be great.
>
> You can also use Travis-CI for testing QEMU on a s390x host, see e.g. my
> runs here:
>
> https://app.travis-ci.com/github/huth/qemu
... apart from the fact, that it currently seems to be completely broken
again :-( ... I guess that means: Never mind, sorry for the distraction!
Thomas
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-30 11:35 ` Thomas Huth
@ 2024-10-21 16:17 ` Thomas Huth
0 siblings, 0 replies; 68+ messages in thread
From: Thomas Huth @ 2024-10-21 16:17 UTC (permalink / raw)
To: Daniel Henrique Barboza, Peter Maydell
Cc: Alistair Francis, qemu-devel, Alistair Francis
On 30/09/2024 13.35, Thomas Huth wrote:
> On 30/09/2024 12.58, Thomas Huth wrote:
>> On 29/09/2024 22.53, Daniel Henrique Barboza wrote:
>>>
>>>
>>> On 9/29/24 12:38 PM, Peter Maydell wrote:
>>>> On Sat, 28 Sept 2024 at 21:40, Daniel Henrique Barboza
>>>> <dbarboza@ventanamicro.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>>>>> The assertion failure is
>>>>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset: assertion
>>>>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>>>>
>>>>> The root cause is that the qtests I added aren't considering the
>>>>> endianess of the
>>>>> host. The RISC-V IOMMU is being implemented as LE only and all regs are
>>>>> being
>>>>> read/written in memory as LE. The qtest read/write helpers must take
>>>>> the qtest
>>>>> endianess into account. We make this type of handling in other qtest
>>>>> archs like
>>>>> ppc64.
>>>>>
>>>>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-
>>>>> s390x- all-system
>>>>> job to verify it, even after setting Cirrus like Thomas taught me a
>>>>> week ago. In
>>>>> fact I have no 'ubuntu-22-*' jobs available to run.
>>>>
>>>> It's on the private s390 VM we have, so it's set up only to
>>>> be available on the main CI run (there's not enough capacity
>>>> on the machine to do any more than that). If you want to point
>>>> me at a gitlab branch I can do a quick "make check" on that
>>>> if you like.
>>>
>>> I appreciate it. This is the repo:
>>>
>>> https://gitlab.com/danielhb/qemu/-/tree/pull_fix
>>>
>>> If this is enough to fix the tests, I'll amend it in the new IOMMU version.
>>> If we still failing then I'll need to set this s390 VM.
>>>
>>> By the way, if you have any recipe/pointers to set this s390 VM to share,
>>> that would be great.
>>
>> You can also use Travis-CI for testing QEMU on a s390x host, see e.g. my
>> runs here:
>>
>> https://app.travis-ci.com/github/huth/qemu
>
> ... apart from the fact, that it currently seems to be completely broken
> again :-( ... I guess that means: Never mind, sorry for the distraction!
FWIW, it seems to be working again:
https://app.travis-ci.com/github/huth/qemu/builds/272812225
Thomas
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-28 20:40 ` Daniel Henrique Barboza
2024-09-29 15:38 ` Peter Maydell
@ 2024-09-30 12:10 ` Ilya Leoshkevich
2024-09-30 23:04 ` Daniel Henrique Barboza
1 sibling, 1 reply; 68+ messages in thread
From: Ilya Leoshkevich @ 2024-09-30 12:10 UTC (permalink / raw)
To: Daniel Henrique Barboza, Peter Maydell, Alistair Francis
Cc: qemu-devel, Alistair Francis
On Sat, 2024-09-28 at 17:40 -0300, Daniel Henrique Barboza wrote:
>
>
> On 9/28/24 8:34 AM, Peter Maydell wrote:
> > On Tue, 24 Sept 2024 at 23:18, Alistair Francis
> > <alistair23@gmail.com> wrote:
> > >
> > > The following changes since commit
> > > 01dc65a3bc262ab1bec8fe89775e9bbfa627becb:
> > >
> > > Merge tag 'pull-target-arm-20240919' of
> > > https://git.linaro.org/people/pmaydell/qemu-arm into staging
> > > (2024-09-19 14:15:15 +0100)
> > >
> > > are available in the Git repository at:
> > >
> > > https://github.com/alistair23/qemu.git tags/pull-riscv-to-
> > > apply-20240925-1
> > >
> > > for you to fetch changes up to
> > > 6bfa92c5757fe7a9580e1f6e065076777cae650f:
> > >
> > > bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML
> > > Files (2024-09-24 12:53:16 +1000)
> > >
> > > ----------------------------------------------------------------
> > > RISC-V PR for 9.2
> > >
> > > * Add a property to set vl to ceil(AVL/2)
> > > * Enable numamem testing for RISC-V
> > > * Consider MISA bit choice in implied rule
> > > * Fix the za64rs priv spec requirements
> > > * Enable Bit Manip for OpenTitan Ibex CPU
> > > * Fix the group bit setting of AIA with KVM
> > > * Stop timer with infinite timecmp
> > > * Add 'fcsr' register to QEMU log as a part of F extension
> > > * Fix riscv64 build on musl libc
> > > * Add preliminary textra trigger CSR functions
> > > * RISC-V IOMMU support
> > > * RISC-V bsd-user support
> > > * Respect firmware ELF entry point
> > > * Add Svvptc extension support
> > > * Fix masking of rv32 physical address
> > > * Fix linking problem with semihosting disabled
> > > * Fix IMSIC interrupt state updates
> > >
> >
> > This fails the riscv qos-tests on s390x. My guess is that the new
> > IOMMU support has endianness bugs and fails on bigendian hosts.
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/7942189143
> >
> > full test log (4MB) at
> > https://qemu-project.gitlab.io/-/qemu/-/jobs/7942189143/artifacts/build/meson-logs/testlog.txt
> >
> > The assertion failure is
> > ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset:
> > assertion
> > failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>
> The root cause is that the qtests I added aren't considering the
> endianess of the
> host. The RISC-V IOMMU is being implemented as LE only and all regs
> are being
> read/written in memory as LE. The qtest read/write helpers must take
> the qtest
> endianess into account. We make this type of handling in other qtest
> archs like
> ppc64.
>
> I have a fix for the tests but I'm unable to run the ubuntu-22.04-
> s390x-all-system
> job to verify it, even after setting Cirrus like Thomas taught me a
> week ago. In
> fact I have no 'ubuntu-22-*' jobs available to run.
>
> If there's a way to run these ubuntu s390x tests, let me know.
> Otherwise I'm inclined
> to remove the IOMMU qtests for now until I'm able to verify that
> they'll work in a
> BE host (I'll install a BE VM to verify).
You can get a free s390x VM here:
https://linuxone.cloud.marist.edu/#/register?flag=VM
Best regards,
Ilya
[...]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PULL v2 00/47] riscv-to-apply queue
2024-09-30 12:10 ` Ilya Leoshkevich
@ 2024-09-30 23:04 ` Daniel Henrique Barboza
0 siblings, 0 replies; 68+ messages in thread
From: Daniel Henrique Barboza @ 2024-09-30 23:04 UTC (permalink / raw)
To: Ilya Leoshkevich, Peter Maydell, Alistair Francis
Cc: qemu-devel, Alistair Francis
On 9/30/24 9:10 AM, Ilya Leoshkevich wrote:
> On Sat, 2024-09-28 at 17:40 -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 9/28/24 8:34 AM, Peter Maydell wrote:
>>> On Tue, 24 Sept 2024 at 23:18, Alistair Francis
>>> <alistair23@gmail.com> wrote:
>>>>
>>>> The following changes since commit
>>>> 01dc65a3bc262ab1bec8fe89775e9bbfa627becb:
>>>>
>>>> Merge tag 'pull-target-arm-20240919' of
>>>> https://git.linaro.org/people/pmaydell/qemu-arm into staging
>>>> (2024-09-19 14:15:15 +0100)
>>>>
>>>> are available in the Git repository at:
>>>>
>>>> https://github.com/alistair23/qemu.git tags/pull-riscv-to-
>>>> apply-20240925-1
>>>>
>>>> for you to fetch changes up to
>>>> 6bfa92c5757fe7a9580e1f6e065076777cae650f:
>>>>
>>>> bsd-user: Add RISC-V 64-bit Target Configuration and Debug XML
>>>> Files (2024-09-24 12:53:16 +1000)
>>>>
>>>> ----------------------------------------------------------------
>>>> RISC-V PR for 9.2
>>>>
>>>> * Add a property to set vl to ceil(AVL/2)
>>>> * Enable numamem testing for RISC-V
>>>> * Consider MISA bit choice in implied rule
>>>> * Fix the za64rs priv spec requirements
>>>> * Enable Bit Manip for OpenTitan Ibex CPU
>>>> * Fix the group bit setting of AIA with KVM
>>>> * Stop timer with infinite timecmp
>>>> * Add 'fcsr' register to QEMU log as a part of F extension
>>>> * Fix riscv64 build on musl libc
>>>> * Add preliminary textra trigger CSR functions
>>>> * RISC-V IOMMU support
>>>> * RISC-V bsd-user support
>>>> * Respect firmware ELF entry point
>>>> * Add Svvptc extension support
>>>> * Fix masking of rv32 physical address
>>>> * Fix linking problem with semihosting disabled
>>>> * Fix IMSIC interrupt state updates
>>>>
>>>
>>> This fails the riscv qos-tests on s390x. My guess is that the new
>>> IOMMU support has endianness bugs and fails on bigendian hosts.
>>>
>>> https://gitlab.com/qemu-project/qemu/-/jobs/7942189143
>>>
>>> full test log (4MB) at
>>> https://qemu-project.gitlab.io/-/qemu/-/jobs/7942189143/artifacts/build/meson-logs/testlog.txt
>>>
>>> The assertion failure is
>>> ERROR:../tests/qtest/riscv-iommu-test.c:72:test_reg_reset:
>>> assertion
>>> failed (cap & RISCV_IOMMU_CAP_VERSION == 0x10): (0 == 16)
>>
>> The root cause is that the qtests I added aren't considering the
>> endianess of the
>> host. The RISC-V IOMMU is being implemented as LE only and all regs
>> are being
>> read/written in memory as LE. The qtest read/write helpers must take
>> the qtest
>> endianess into account. We make this type of handling in other qtest
>> archs like
>> ppc64.
>>
>> I have a fix for the tests but I'm unable to run the ubuntu-22.04-
>> s390x-all-system
>> job to verify it, even after setting Cirrus like Thomas taught me a
>> week ago. In
>> fact I have no 'ubuntu-22-*' jobs available to run.
>>
>> If there's a way to run these ubuntu s390x tests, let me know.
>> Otherwise I'm inclined
>> to remove the IOMMU qtests for now until I'm able to verify that
>> they'll work in a
>> BE host (I'll install a BE VM to verify).
>
> You can get a free s390x VM here:
>
> https://linuxone.cloud.marist.edu/#/register?flag=VM
Thanks! This was surprisingly easy to set up and run. Please send my best regards
to the LinuxOne Community Cloud team.
Peter, in fact there were some endianness problems in the code like you hinted
before. I fixed them up and the tests are now (apparently) passing. I'll clean
stuff up and see if I can send a new version of the whole series in the next
few days.
Thanks,
Daniel
>
> Best regards,
> Ilya
>
> [...]
^ permalink raw reply [flat|nested] 68+ messages in thread