[PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
@ 2022-07-07 16:16 Cornelia Huck
  2022-07-07 16:16 ` [PATCH RFC v2 1/2] arm/kvm: add support for MTE Cornelia Huck
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Cornelia Huck @ 2022-07-07 16:16 UTC (permalink / raw)
  To: Peter Maydell, Thomas Huth, Laurent Vivier
  Cc: Eric Auger, Dr. David Alan Gilbert, Juan Quintela, qemu-arm,
	qemu-devel, kvm, Cornelia Huck

This series makes it possible to enable MTE for kvm guests, if the kernel
supports it. Again, tested on the simulator via patiently waiting for the
arm64/mte kselftests to finish successfully.

For tcg, turning on mte on the machine level (to get tag memory) stays a
requirement. If the new mte cpu feature is not explicitly specified, a tcg
vm will get mte depending on the presence of tag memory (just as today).

For kvm, mte stays off by default; this is because migration is not yet
supported (postcopy will need an extension of the kernel interface, possibly
an extension of the userfaultfd interface), and turning on mte will add a
migration blocker.

My biggest question going forward is actually concerning migration; I gather
that we should not bother adding something unless postcopy is working as well?
If I'm not misunderstanding things, we need a way to fault in a page together
with the tag; doing that in one go is probably the only way that we can be
sure that this is race-free on the QEMU side. Comments welcome :)

Changes v1->v2: [Thanks to Eric for the feedback!]
- add documentation
- switch the mte prop to OnOffAuto; this improves the interaction with the
  existing mte machine prop
- leave mte off for kvm by default
- improve tests; the poking in QDicts feels a bit ugly, but seems to work

Cornelia Huck (2):
  arm/kvm: add support for MTE
  qtests/arm: add some mte tests

 docs/system/arm/cpu-features.rst |  21 +++++
 target/arm/cpu.c                 |  18 ++---
 target/arm/cpu.h                 |   1 +
 target/arm/cpu64.c               | 132 +++++++++++++++++++++++++++++++
 target/arm/internals.h           |   1 +
 target/arm/kvm64.c               |   5 ++
 target/arm/kvm_arm.h             |  12 +++
 target/arm/monitor.c             |   1 +
 tests/qtest/arm-cpu-features.c   |  77 ++++++++++++++++++
 9 files changed, 256 insertions(+), 12 deletions(-)

-- 
2.35.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH RFC v2 1/2] arm/kvm: add support for MTE
  2022-07-07 16:16 [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Cornelia Huck
@ 2022-07-07 16:16 ` Cornelia Huck
  2022-07-07 16:16 ` [PATCH RFC v2 2/2] qtests/arm: add some mte tests Cornelia Huck
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Cornelia Huck @ 2022-07-07 16:16 UTC (permalink / raw)
  To: Peter Maydell, Thomas Huth, Laurent Vivier
  Cc: Eric Auger, Dr. David Alan Gilbert, Juan Quintela, qemu-arm,
	qemu-devel, kvm, Cornelia Huck

Introduce a new cpu feature flag to control MTE support. To preserve
backwards compatibility for tcg, MTE will continue to be enabled as
long as tag memory has been provided.

If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 docs/system/arm/cpu-features.rst |  21 +++++
 target/arm/cpu.c                 |  18 ++---
 target/arm/cpu.h                 |   1 +
 target/arm/cpu64.c               | 132 +++++++++++++++++++++++++++++++
 target/arm/internals.h           |   1 +
 target/arm/kvm64.c               |   5 ++
 target/arm/kvm_arm.h             |  12 +++
 target/arm/monitor.c             |   1 +
 8 files changed, 179 insertions(+), 12 deletions(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index 3fd76fa0b4fb..5ca573ddd553 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -443,3 +443,24 @@ As with ``sve-default-vector-length``, if the default length is larger
 than the maximum vector length enabled, the actual vector length will
 be reduced.  If this property is set to ``-1`` then the default vector
 length is set to the maximum possible length.
+
+MTE CPU Property
+================
+
+The ``mte`` property controls the Memory Tagging Extension. For TCG, it requires
+presence of tag memory (which can be turned on for the ``virt`` machine via
+``mte=on``). For KVM, it requires the ``KVM_CAP_ARM_MTE`` capability; until
+proper migration support is implemented, enabling MTE will install a migration
+blocker.
+
+If not specified explicitly via ``on`` or ``off``, MTE will be available
+according to the following rules:
+
+* When TCG is used, MTE will be available iff tag memory is available; i. e. it
+  preserves the behaviour prior to introduction of the feature.
+
+* When KVM is used, MTE will default to off, so that migration will not
+  unintentionally be blocked.
+
+* Other accelerators currently don't support MTE.
+
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ae6dca2f0106..3844c7d90ea2 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1442,6 +1442,11 @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
             error_propagate(errp, local_err);
             return;
         }
+        arm_cpu_mte_finalize(cpu, &local_err);
+        if (local_err != NULL) {
+            error_propagate(errp, local_err);
+            return;
+        }
     }
 #endif
 
@@ -1518,7 +1523,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         }
         if (cpu->tag_memory) {
             error_setg(errp,
-                       "Cannot enable %s when guest CPUs has MTE enabled",
+                       "Cannot enable %s when guest CPUs has tag memory enabled",
                        current_accel_name());
             return;
         }
@@ -1897,17 +1902,6 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
                                        ID_PFR1, VIRTUALIZATION, 0);
     }
 
-#ifndef CONFIG_USER_ONLY
-    if (cpu->tag_memory == NULL && cpu_isar_feature(aa64_mte, cpu)) {
-        /*
-         * Disable the MTE feature bits if we do not have tag-memory
-         * provided by the machine.
-         */
-        cpu->isar.id_aa64pfr1 =
-            FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0);
-    }
-#endif
-
     /* MPU can be configured out of a PMSA CPU either by setting has-mpu
      * to false or by setting pmsav7-dregion to 0.
      */
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4a4342f26220..1ce46649b1f0 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1028,6 +1028,7 @@ struct ArchCPU {
     bool prop_pauth;
     bool prop_pauth_impdef;
     bool prop_lpa2;
+    OnOffAuto prop_mte;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
     uint32_t dcz_blocksize;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 19188d6cc2ac..4cc8f724c054 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -35,7 +35,13 @@
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
 #include "internals.h"
+#include "migration/blocker.h"
+#include "qapi/qapi-visit-common.h"
+#include "hw/arm/virt.h"
 
+#ifdef CONFIG_KVM
+static Error *mte_migration_blocker;
+#endif
 
 static void aarch64_a57_initfn(Object *obj)
 {
@@ -900,6 +906,130 @@ void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp)
     cpu->isar.id_aa64mmfr0 = t;
 }
 
+static void aarch64_cpu_get_mte(Object *obj, Visitor *v, const char *name,
+                                void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    OnOffAuto mte = cpu->prop_mte;
+
+    visit_type_OnOffAuto(v, name, &mte, errp);
+}
+
+static void aarch64_cpu_set_mte(Object *obj, Visitor *v, const char *name,
+                                void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    visit_type_OnOffAuto(v, name, &cpu->prop_mte, errp);
+
+}
+
+static void aarch64_add_mte_properties(Object *obj)
+{
+    /*
+     * For tcg, "AUTO" means turn on mte if tag memory has been provided, and
+     * turn it off (without error) if not.
+     * For kvm, "AUTO" currently means mte off, as migration is not supported
+     * yet.
+     * For all others, "AUTO" means mte off.
+     */
+    object_property_add(obj, "mte", "OnOffAuto", aarch64_cpu_get_mte,
+                        aarch64_cpu_set_mte, NULL, NULL);
+}
+
+static inline bool arm_machine_has_tag_memory(void)
+{
+#ifndef CONFIG_USER_ONLY
+    Object *obj = object_dynamic_cast(qdev_get_machine(), TYPE_VIRT_MACHINE);
+
+    /* so far, only the virt machine has support for tag memory */
+    if (obj) {
+        VirtMachineState *vms = VIRT_MACHINE(obj);
+
+        return vms->mte;
+    }
+#endif
+    return false;
+}
+
+void arm_cpu_mte_finalize(ARMCPU *cpu, Error **errp)
+{
+    bool enable_mte;
+
+    switch (cpu->prop_mte) {
+    case ON_OFF_AUTO_OFF:
+        enable_mte = false;
+        break;
+    case ON_OFF_AUTO_ON:
+        if (!kvm_enabled()) {
+            if (cpu_isar_feature(aa64_mte, cpu)) {
+                if (!arm_machine_has_tag_memory()) {
+                    error_setg(errp, "mte=on requires tag memory");
+                    return;
+                }
+            } else {
+                error_setg(errp, "mte not provided");
+                return;
+            }
+        }
+#ifdef CONFIG_KVM
+        if (kvm_enabled() && !kvm_arm_mte_supported()) {
+            error_setg(errp, "mte not supported by kvm");
+            return;
+        }
+#endif
+        enable_mte = true;
+        break;
+    default: /* AUTO */
+        if (!kvm_enabled()) {
+            if (cpu_isar_feature(aa64_mte, cpu)) {
+                /*
+                 * Tie mte enablement to presence of tag memory, in order to
+                 * preserve pre-existing behaviour.
+                 */
+                enable_mte = arm_machine_has_tag_memory();
+            } else {
+                enable_mte = false;
+            }
+            break;
+        } else {
+            /*
+             * This cannot yet be
+             * enable_mte = kvm_arm_mte_supported();
+             * as we don't support migration yet.
+             */
+            enable_mte = false;
+        }
+    }
+
+    if (!enable_mte) {
+        /* Disable MTE feature bits. */
+        cpu->isar.id_aa64pfr1 =
+            FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0);
+        return;
+    }
+
+    /* accelerator-specific enablement */
+    if (kvm_enabled()) {
+#ifdef CONFIG_KVM
+        if (kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0)) {
+            error_setg(errp, "Failed to enable KVM_CAP_ARM_MTE");
+        } else {
+            /* TODO: add proper migration support with MTE enabled */
+            if (!mte_migration_blocker) {
+                error_setg(&mte_migration_blocker,
+                           "Live migration disabled due to MTE enabled");
+                if (migrate_add_blocker(mte_migration_blocker, NULL)) {
+                    error_setg(errp, "Failed to add MTE migration blocker");
+                    error_free(&mte_migration_blocker);
+                    mte_migration_blocker = NULL;
+                }
+            }
+        }
+#endif
+    }
+}
+
 static void aarch64_host_initfn(Object *obj)
 {
 #if defined(CONFIG_KVM)
@@ -908,6 +1038,7 @@ static void aarch64_host_initfn(Object *obj)
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         aarch64_add_sve_properties(obj);
         aarch64_add_pauth_properties(obj);
+        aarch64_add_mte_properties(obj);
     }
 #elif defined(CONFIG_HVF)
     ARMCPU *cpu = ARM_CPU(obj);
@@ -1089,6 +1220,7 @@ static void aarch64_max_initfn(Object *obj)
     object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
                         cpu_max_set_sve_max_vq, NULL, NULL);
     qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
+    aarch64_add_mte_properties(obj);
 }
 
 static void aarch64_a64fx_initfn(Object *obj)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index c66f74a0db12..1385f8d07cab 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1292,6 +1292,7 @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
 void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp);
 void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
 void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
+void arm_cpu_mte_finalize(ARMCPU *cpu, Error **errp);
 #endif
 
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index d16d4ea2500b..7d25d0f01cca 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -761,6 +761,11 @@ bool kvm_arm_steal_time_supported(void)
     return kvm_check_extension(kvm_state, KVM_CAP_STEAL_TIME);
 }
 
+bool kvm_arm_mte_supported(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
+}
+
 QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
 
 uint32_t kvm_arm_sve_get_vls(CPUState *cs)
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 99017b635ce4..762443f8a7c0 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -305,6 +305,13 @@ bool kvm_arm_pmu_supported(void);
  */
 bool kvm_arm_sve_supported(void);
 
+/**
+ * kvm_arm_mte_supported:
+ *
+ * Returns: true if KVM can enable MTE, and false otherwise.
+ */
+bool kvm_arm_mte_supported(void);
+
 /**
  * kvm_arm_get_max_vm_ipa_size:
  * @ms: Machine state handle
@@ -395,6 +402,11 @@ static inline bool kvm_arm_steal_time_supported(void)
     return false;
 }
 
+static inline bool kvm_arm_mte_supported(void)
+{
+    return false;
+}
+
 /*
  * These functions should never actually be called without KVM support.
  */
diff --git a/target/arm/monitor.c b/target/arm/monitor.c
index 80c64fa3556d..f13ff2664b67 100644
--- a/target/arm/monitor.c
+++ b/target/arm/monitor.c
@@ -96,6 +96,7 @@ static const char *cpu_model_advertised_features[] = {
     "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
     "kvm-no-adjvtime", "kvm-steal-time",
     "pauth", "pauth-impdef",
+    "mte",
     NULL
 };
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH RFC v2 2/2] qtests/arm: add some mte tests
  2022-07-07 16:16 [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Cornelia Huck
  2022-07-07 16:16 ` [PATCH RFC v2 1/2] arm/kvm: add support for MTE Cornelia Huck
@ 2022-07-07 16:16 ` Cornelia Huck
  2022-07-09  2:59 ` [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Richard Henderson
  2022-07-11 13:24 ` Dr. David Alan Gilbert
  3 siblings, 0 replies; 12+ messages in thread
From: Cornelia Huck @ 2022-07-07 16:16 UTC (permalink / raw)
  To: Peter Maydell, Thomas Huth, Laurent Vivier
  Cc: Eric Auger, Dr. David Alan Gilbert, Juan Quintela, qemu-arm,
	qemu-devel, kvm, Cornelia Huck

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 tests/qtest/arm-cpu-features.c | 77 ++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
index 5a145273860c..466be857d391 100644
--- a/tests/qtest/arm-cpu-features.c
+++ b/tests/qtest/arm-cpu-features.c
@@ -22,6 +22,7 @@
 
 #define MACHINE     "-machine virt,gic-version=max -accel tcg "
 #define MACHINE_KVM "-machine virt,gic-version=max -accel kvm -accel tcg "
+#define MACHINE_MTE "-machine virt,gic-version=max,mte=on -accel tcg "
 #define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
                     "  'arguments': { 'type': 'full', "
 #define QUERY_TAIL  "}}"
@@ -155,6 +156,18 @@ static bool resp_get_feature(QDict *resp, const char *feature)
     g_assert(qdict_get_bool(_props, feature) == (expected_value));     \
 })
 
+#define resp_assert_feature_str(resp, feature, expected_value)         \
+({                                                                     \
+    QDict *_props;                                                     \
+                                                                       \
+    g_assert(_resp);                                                   \
+    g_assert(resp_has_props(_resp));                                   \
+    _props = resp_get_props(_resp);                                    \
+    g_assert(qdict_get(_props, feature));                              \
+    g_assert_cmpstr(qdict_get_try_str(_props, feature), ==,            \
+                    expected_value);                                   \
+})
+
 #define assert_feature(qts, cpu_type, feature, expected_value)         \
 ({                                                                     \
     QDict *_resp;                                                      \
@@ -165,6 +178,16 @@ static bool resp_get_feature(QDict *resp, const char *feature)
     qobject_unref(_resp);                                              \
 })
 
+#define assert_feature_str(qts, cpu_type, feature, expected_value)     \
+({                                                                     \
+    QDict *_resp;                                                      \
+                                                                       \
+    _resp = do_query_no_props(qts, cpu_type);                          \
+    g_assert(_resp);                                                   \
+    resp_assert_feature_str(_resp, feature, expected_value);           \
+    qobject_unref(_resp);                                              \
+})
+
 #define assert_set_feature(qts, cpu_type, feature, value)              \
 ({                                                                     \
     const char *_fmt = (value) ? "{ %s: true }" : "{ %s: false }";     \
@@ -176,6 +199,16 @@ static bool resp_get_feature(QDict *resp, const char *feature)
     qobject_unref(_resp);                                              \
 })
 
+#define assert_set_feature_str(qts, cpu_type, feature, value, _fmt)    \
+({                                                                     \
+    QDict *_resp;                                                      \
+                                                                       \
+    _resp = do_query(qts, cpu_type, _fmt, feature);                    \
+    g_assert(_resp);                                                   \
+    resp_assert_feature_str(_resp, feature, value);                    \
+    qobject_unref(_resp);                                              \
+})
+
 #define assert_has_feature_enabled(qts, cpu_type, feature)             \
     assert_feature(qts, cpu_type, feature, true)
 
@@ -412,6 +445,24 @@ static void sve_tests_sve_off_kvm(const void *data)
     qtest_quit(qts);
 }
 
+static void mte_tests_tag_memory_on(const void *data)
+{
+    QTestState *qts;
+
+    qts = qtest_init(MACHINE_MTE "-cpu max");
+
+    /*
+     * With tag memory, "mte" should default to on, and explicitly specifying
+     * either on or off should be fine.
+     */
+    assert_has_feature(qts, "max", "mte");
+
+    assert_set_feature_str(qts, "max", "mte", "off", "{ 'mte': 'off' }");
+    assert_set_feature_str(qts, "max", "mte", "on", "{ 'mte': 'on' }");
+
+    qtest_quit(qts);
+}
+
 static void pauth_tests_default(QTestState *qts, const char *cpu_type)
 {
     assert_has_feature_enabled(qts, cpu_type, "pauth");
@@ -424,6 +475,21 @@ static void pauth_tests_default(QTestState *qts, const char *cpu_type)
                  "{ 'pauth': false, 'pauth-impdef': true }");
 }
 
+static void mte_tests_default(QTestState *qts, const char *cpu_type)
+{
+    assert_has_feature(qts, cpu_type, "mte");
+
+    /*
+     * Without tag memory, mte will be off under tcg.
+     * Explicitly enabling it yields an error.
+     */
+    assert_has_feature(qts, cpu_type, "mte");
+
+    assert_set_feature_str(qts, "max", "mte", "off", "{ 'mte': 'off' }");
+    assert_error(qts, cpu_type, "mte=on requires tag memory",
+                 "{ 'mte': 'on' }");
+}
+
 static void test_query_cpu_model_expansion(const void *data)
 {
     QTestState *qts;
@@ -473,6 +539,7 @@ static void test_query_cpu_model_expansion(const void *data)
 
         sve_tests_default(qts, "max");
         pauth_tests_default(qts, "max");
+        mte_tests_default(qts, "max");
 
         /* Test that features that depend on KVM generate errors without. */
         assert_error(qts, "max",
@@ -499,6 +566,7 @@ static void test_query_cpu_model_expansion_kvm(const void *data)
     if (g_str_equal(qtest_get_arch(), "aarch64")) {
         bool kvm_supports_steal_time;
         bool kvm_supports_sve;
+        bool kvm_supports_mte;
         char max_name[8], name[8];
         uint32_t max_vq, vq;
         uint64_t vls;
@@ -523,10 +591,12 @@ static void test_query_cpu_model_expansion_kvm(const void *data)
          */
         assert_has_feature(qts, "host", "kvm-steal-time");
         assert_has_feature(qts, "host", "sve");
+        assert_has_feature(qts, "host", "mte");
 
         resp = do_query_no_props(qts, "host");
         kvm_supports_steal_time = resp_get_feature(resp, "kvm-steal-time");
         kvm_supports_sve = resp_get_feature(resp, "sve");
+        kvm_supports_mte = resp_get_feature(resp, "mte");
         vls = resp_get_sve_vls(resp);
         qobject_unref(resp);
 
@@ -592,6 +662,11 @@ static void test_query_cpu_model_expansion_kvm(const void *data)
         } else {
             g_assert(vls == 0);
         }
+        if (kvm_supports_mte) {
+            /* If we have mte then we should be able to toggle it. */
+            assert_set_feature(qts, "host", "mte", false);
+            assert_set_feature(qts, "host", "mte", true);
+        }
     } else {
         assert_has_not_feature(qts, "host", "aarch64");
         assert_has_not_feature(qts, "host", "pmu");
@@ -630,6 +705,8 @@ int main(int argc, char **argv)
                             NULL, sve_tests_sve_off);
         qtest_add_data_func("/arm/kvm/query-cpu-model-expansion/sve-off",
                             NULL, sve_tests_sve_off_kvm);
+        qtest_add_data_func("/arm/max/query-cpu-model-expansion/tag-memory",
+                            NULL, mte_tests_tag_memory_on);
     }
 
     return g_test_run();
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-07 16:16 [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Cornelia Huck
  2022-07-07 16:16 ` [PATCH RFC v2 1/2] arm/kvm: add support for MTE Cornelia Huck
  2022-07-07 16:16 ` [PATCH RFC v2 2/2] qtests/arm: add some mte tests Cornelia Huck
@ 2022-07-09  2:59 ` Richard Henderson
  2022-07-11 13:24 ` Dr. David Alan Gilbert
  3 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2022-07-09  2:59 UTC (permalink / raw)
  To: Cornelia Huck, Peter Maydell, Thomas Huth, Laurent Vivier
  Cc: Eric Auger, Dr. David Alan Gilbert, Juan Quintela, qemu-arm,
	qemu-devel, kvm

On 7/7/22 21:46, Cornelia Huck wrote:
> If I'm not misunderstanding things, we need a way to fault in a page together
> with the tag; doing that in one go is probably the only way that we can be
> sure that this is race-free on the QEMU side.

That's my understanding as well.


r~


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-07 16:16 [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Cornelia Huck
                   ` (2 preceding siblings ...)
  2022-07-09  2:59 ` [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Richard Henderson
@ 2022-07-11 13:24 ` Dr. David Alan Gilbert
  2022-07-11 13:39   ` Peter Maydell
                     ` (2 more replies)
  3 siblings, 3 replies; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-11 13:24 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Maydell, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

* Cornelia Huck (cohuck@redhat.com) wrote:
> This series makes it possible to enable MTE for kvm guests, if the kernel
> supports it. Again, tested on the simulator via patiently waiting for the
> arm64/mte kselftests to finish successfully.
> 
> For tcg, turning on mte on the machine level (to get tag memory) stays a
> requirement. If the new mte cpu feature is not explicitly specified, a tcg
> vm will get mte depending on the presence of tag memory (just as today).
> 
> For kvm, mte stays off by default; this is because migration is not yet
> supported (postcopy will need an extension of the kernel interface, possibly
> an extension of the userfaultfd interface), and turning on mte will add a
> migration blocker.

My assumption was that a normal migration would need something as well
to retrieve and place the MTE flags; albeit not atomically.

> My biggest question going forward is actually concerning migration; I gather
> that we should not bother adding something unless postcopy is working as well?

I don't think that restriction is fair on you; just make sure
postcopy_ram_supported_by_host gains an arch call and fails cleanly;
that way if anyone tries to enable postcopy they'll find out with a
clean fail.

> If I'm not misunderstanding things, we need a way to fault in a page together
> with the tag; doing that in one go is probably the only way that we can be
> sure that this is race-free on the QEMU side. Comments welcome :)

I think it will.
But, ignoring postcopy for a minute, with KVM how do different types of
backing memory work - e.g. if I back a region of guest memory with
/dev/shm/something or a hugepage equivalent, where does the MTE memory
come from, and how do you set it?

Dave

> Changes v1->v2: [Thanks to Eric for the feedback!]
> - add documentation
> - switch the mte prop to OnOffAuto; this improves the interaction with the
>   existing mte machine prop
> - leave mte off for kvm by default
> - improve tests; the poking in QDicts feels a bit ugly, but seems to work
> 
> Cornelia Huck (2):
>   arm/kvm: add support for MTE
>   qtests/arm: add some mte tests
> 
>  docs/system/arm/cpu-features.rst |  21 +++++
>  target/arm/cpu.c                 |  18 ++---
>  target/arm/cpu.h                 |   1 +
>  target/arm/cpu64.c               | 132 +++++++++++++++++++++++++++++++
>  target/arm/internals.h           |   1 +
>  target/arm/kvm64.c               |   5 ++
>  target/arm/kvm_arm.h             |  12 +++
>  target/arm/monitor.c             |   1 +
>  tests/qtest/arm-cpu-features.c   |  77 ++++++++++++++++++
>  9 files changed, 256 insertions(+), 12 deletions(-)
> 
> -- 
> 2.35.3
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 13:24 ` Dr. David Alan Gilbert
@ 2022-07-11 13:39   ` Peter Maydell
  2022-07-11 14:26     ` Dr. David Alan Gilbert
  2022-07-11 13:55   ` Dr. David Alan Gilbert
  2022-07-11 15:08   ` Cornelia Huck
  2 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2022-07-11 13:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Cornelia Huck, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

On Mon, 11 Jul 2022 at 14:24, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> But, ignoring postcopy for a minute, with KVM how do different types of
> backing memory work - e.g. if I back a region of guest memory with
> /dev/shm/something or a hugepage equivalent, where does the MTE memory
> come from, and how do you set it?

Generally in an MTE system anything that's "plain old RAM" is expected
to support tags. (The architecture manual calls this "conventional
memory". This isn't quite the same as "anything that looks RAM-like",
e.g. the graphics card framebuffer doesn't have to support tags!)

One plausible implementation is that the firmware and memory controller
are in cahoots and arrange that the appropriate fraction of the DRAM is
reserved for holding tags (and inaccessible as normal RAM even by the OS);
but where the tags are stored is entirely impdef and an implementation
could choose to put the tags in their own entirely separate storage if
it liked. The only way to access the tag storage is via the instructions
for getting and setting tags.

-- PMM

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 13:24 ` Dr. David Alan Gilbert
  2022-07-11 13:39   ` Peter Maydell
@ 2022-07-11 13:55   ` Dr. David Alan Gilbert
  2022-07-11 15:08   ` Cornelia Huck
  2 siblings, 0 replies; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-11 13:55 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Maydell, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
> > This series makes it possible to enable MTE for kvm guests, if the kernel
> > supports it. Again, tested on the simulator via patiently waiting for the
> > arm64/mte kselftests to finish successfully.
> > 
> > For tcg, turning on mte on the machine level (to get tag memory) stays a
> > requirement. If the new mte cpu feature is not explicitly specified, a tcg
> > vm will get mte depending on the presence of tag memory (just as today).
> > 
> > For kvm, mte stays off by default; this is because migration is not yet
> > supported (postcopy will need an extension of the kernel interface, possibly
> > an extension of the userfaultfd interface), and turning on mte will add a
> > migration blocker.
> 
> My assumption was that a normal migration would need something as well
> to retrieve and place the MTE flags; albeit not atomically.
> 
> > My biggest question going forward is actually concerning migration; I gather
> > that we should not bother adding something unless postcopy is working as well?
> 
> I don't think that restriction is fair on you; just make sure
> postcopy_ram_supported_by_host gains an arch call and fails cleanly;
> that way if anyone tries to enable postcopy they'll find out with a
> clean fail.
> 
> > If I'm not misunderstanding things, we need a way to fault in a page together
> > with the tag; doing that in one go is probably the only way that we can be
> > sure that this is race-free on the QEMU side. Comments welcome :)
> 
> I think it will.
> But, ignoring postcopy for a minute, with KVM how do different types of
> backing memory work - e.g. if I back a region of guest memory with
> /dev/shm/something or a hugepage equivalent, where does the MTE memory
> come from, and how do you set it?

Another case that just came to mind, are the data content optimisations;
we special case all-zero pages, which I guess you still need to transmit
tags for, and the xbzrle page-difference code wouldn't notice
differences in tags.

Dave

> Dave
> 
> > Changes v1->v2: [Thanks to Eric for the feedback!]
> > - add documentation
> > - switch the mte prop to OnOffAuto; this improves the interaction with the
> >   existing mte machine prop
> > - leave mte off for kvm by default
> > - improve tests; the poking in QDicts feels a bit ugly, but seems to work
> > 
> > Cornelia Huck (2):
> >   arm/kvm: add support for MTE
> >   qtests/arm: add some mte tests
> > 
> >  docs/system/arm/cpu-features.rst |  21 +++++
> >  target/arm/cpu.c                 |  18 ++---
> >  target/arm/cpu.h                 |   1 +
> >  target/arm/cpu64.c               | 132 +++++++++++++++++++++++++++++++
> >  target/arm/internals.h           |   1 +
> >  target/arm/kvm64.c               |   5 ++
> >  target/arm/kvm_arm.h             |  12 +++
> >  target/arm/monitor.c             |   1 +
> >  tests/qtest/arm-cpu-features.c   |  77 ++++++++++++++++++
> >  9 files changed, 256 insertions(+), 12 deletions(-)
> > 
> > -- 
> > 2.35.3
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 13:39   ` Peter Maydell
@ 2022-07-11 14:26     ` Dr. David Alan Gilbert
  2022-07-11 14:56       ` Cornelia Huck
  0 siblings, 1 reply; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-11 14:26 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Cornelia Huck, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Mon, 11 Jul 2022 at 14:24, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> > But, ignoring postcopy for a minute, with KVM how do different types of
> > backing memory work - e.g. if I back a region of guest memory with
> > /dev/shm/something or a hugepage equivalent, where does the MTE memory
> > come from, and how do you set it?
> 
> Generally in an MTE system anything that's "plain old RAM" is expected
> to support tags. (The architecture manual calls this "conventional
> memory". This isn't quite the same as "anything that looks RAM-like",
> e.g. the graphics card framebuffer doesn't have to support tags!)

I guess things like non-volatile disks mapped as DAX are fun edge cases.

> One plausible implementation is that the firmware and memory controller
> are in cahoots and arrange that the appropriate fraction of the DRAM is
> reserved for holding tags (and inaccessible as normal RAM even by the OS);
> but where the tags are stored is entirely impdef and an implementation
> could choose to put the tags in their own entirely separate storage if
> it liked. The only way to access the tag storage is via the instructions
> for getting and setting tags.

Hmm OK;   In postcopy, at the moment, the call qemu uses is a call that
atomically places a page of data in memory and then tells the vCPUs to
continue.  I guess a variant that took an extra blob of MTE data would
do.
Note that other VMMs built on kvm work in different ways; the other
common way is to write into the backing file (i.e. the /dev/shm
whatever atomically somehow) and then do the userfault call to tell the
vcpus to continue.  It looks like this is the way things will work in
the split hugepage mechanism Google are currently adding.

Dave

> -- PMM
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 14:26     ` Dr. David Alan Gilbert
@ 2022-07-11 14:56       ` Cornelia Huck
  2022-07-11 15:30         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 12+ messages in thread
From: Cornelia Huck @ 2022-07-11 14:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Peter Maydell
  Cc: Thomas Huth, Laurent Vivier, Eric Auger, Juan Quintela, qemu-arm,
	qemu-devel, kvm

On Mon, Jul 11 2022, "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Peter Maydell (peter.maydell@linaro.org) wrote:
>> On Mon, 11 Jul 2022 at 14:24, Dr. David Alan Gilbert
>> <dgilbert@redhat.com> wrote:
>> > But, ignoring postcopy for a minute, with KVM how do different types of
>> > backing memory work - e.g. if I back a region of guest memory with
>> > /dev/shm/something or a hugepage equivalent, where does the MTE memory
>> > come from, and how do you set it?
>> 
>> Generally in an MTE system anything that's "plain old RAM" is expected
>> to support tags. (The architecture manual calls this "conventional
>> memory". This isn't quite the same as "anything that looks RAM-like",
>> e.g. the graphics card framebuffer doesn't have to support tags!)
>
> I guess things like non-volatile disks mapped as DAX are fun edge cases.
>
>> One plausible implementation is that the firmware and memory controller
>> are in cahoots and arrange that the appropriate fraction of the DRAM is
>> reserved for holding tags (and inaccessible as normal RAM even by the OS);
>> but where the tags are stored is entirely impdef and an implementation
>> could choose to put the tags in their own entirely separate storage if
>> it liked. The only way to access the tag storage is via the instructions
>> for getting and setting tags.
>
> Hmm OK;   In postcopy, at the moment, the call qemu uses is a call that
> atomically places a page of data in memory and then tells the vCPUs to
> continue.  I guess a variant that took an extra blob of MTE data would
> do.

Yes, the current idea is to extend UFFDIO_COPY with a flag so that we
get the tag data along with the page.

> Note that other VMMs built on kvm work in different ways; the other
> common way is to write into the backing file (i.e. the /dev/shm
> whatever atomically somehow) and then do the userfault call to tell the
> vcpus to continue.  It looks like this is the way things will work in
> the split hugepage mechanism Google are currently adding.

Hmm... I had the impression that other VMMs had not cared about this
particular use case yet; if they need a slightly different mechanism,
it would complicate things a bit.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 13:24 ` Dr. David Alan Gilbert
  2022-07-11 13:39   ` Peter Maydell
  2022-07-11 13:55   ` Dr. David Alan Gilbert
@ 2022-07-11 15:08   ` Cornelia Huck
  2022-07-11 15:28     ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 12+ messages in thread
From: Cornelia Huck @ 2022-07-11 15:08 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Maydell, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

On Mon, Jul 11 2022, "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Cornelia Huck (cohuck@redhat.com) wrote:
>> For kvm, mte stays off by default; this is because migration is not yet
>> supported (postcopy will need an extension of the kernel interface, possibly
>> an extension of the userfaultfd interface), and turning on mte will add a
>> migration blocker.
>
> My assumption was that a normal migration would need something as well
> to retrieve and place the MTE flags; albeit not atomically.

There's KVM_ARM_MTE_COPY_TAGS, which should be sufficient to move tags
around for normal migration.

>
>> My biggest question going forward is actually concerning migration; I gather
>> that we should not bother adding something unless postcopy is working as well?
>
> I don't think that restriction is fair on you; just make sure
> postcopy_ram_supported_by_host gains an arch call and fails cleanly;
> that way if anyone tries to enable postcopy they'll find out with a
> clean fail.

Ok, if simply fencing off postcopy is fine, we can try to move forward
with what we have now. The original attempt at
https://lore.kernel.org/all/881871e8394fa18a656dfb105d42e6099335c721.1615972140.git.haibo.xu@linaro.org/
hooked itself directly into common code; maybe we should rather copy the
approach used for s390 storage keys (extra "device") instead?



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 15:08   ` Cornelia Huck
@ 2022-07-11 15:28     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-11 15:28 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Maydell, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

* Cornelia Huck (cohuck@redhat.com) wrote:
> On Mon, Jul 11 2022, "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> >> For kvm, mte stays off by default; this is because migration is not yet
> >> supported (postcopy will need an extension of the kernel interface, possibly
> >> an extension of the userfaultfd interface), and turning on mte will add a
> >> migration blocker.
> >
> > My assumption was that a normal migration would need something as well
> > to retrieve and place the MTE flags; albeit not atomically.
> 
> There's KVM_ARM_MTE_COPY_TAGS, which should be sufficient to move tags
> around for normal migration.
> 
> >
> >> My biggest question going forward is actually concerning migration; I gather
> >> that we should not bother adding something unless postcopy is working as well?
> >
> > I don't think that restriction is fair on you; just make sure
> > postcopy_ram_supported_by_host gains an arch call and fails cleanly;
> > that way if anyone tries to enable postcopy they'll find out with a
> > clean fail.
> 
> Ok, if simply fencing off postcopy is fine, we can try to move forward
> with what we have now. The original attempt at
> https://lore.kernel.org/all/881871e8394fa18a656dfb105d42e6099335c721.1615972140.git.haibo.xu@linaro.org/
> hooked itself directly into common code; maybe we should rather copy the
> approach used for s390 storage keys (extra "device") instead?

I don't understand how a separate device would keep the idea of page
changed flags coherent with the main RAM that the tags correspond to.

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm
  2022-07-11 14:56       ` Cornelia Huck
@ 2022-07-11 15:30         ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-07-11 15:30 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Maydell, Thomas Huth, Laurent Vivier, Eric Auger,
	Juan Quintela, qemu-arm, qemu-devel, kvm

* Cornelia Huck (cohuck@redhat.com) wrote:
> On Mon, Jul 11 2022, "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> >> On Mon, 11 Jul 2022 at 14:24, Dr. David Alan Gilbert
> >> <dgilbert@redhat.com> wrote:
> >> > But, ignoring postcopy for a minute, with KVM how do different types of
> >> > backing memory work - e.g. if I back a region of guest memory with
> >> > /dev/shm/something or a hugepage equivalent, where does the MTE memory
> >> > come from, and how do you set it?
> >> 
> >> Generally in an MTE system anything that's "plain old RAM" is expected
> >> to support tags. (The architecture manual calls this "conventional
> >> memory". This isn't quite the same as "anything that looks RAM-like",
> >> e.g. the graphics card framebuffer doesn't have to support tags!)
> >
> > I guess things like non-volatile disks mapped as DAX are fun edge cases.
> >
> >> One plausible implementation is that the firmware and memory controller
> >> are in cahoots and arrange that the appropriate fraction of the DRAM is
> >> reserved for holding tags (and inaccessible as normal RAM even by the OS);
> >> but where the tags are stored is entirely impdef and an implementation
> >> could choose to put the tags in their own entirely separate storage if
> >> it liked. The only way to access the tag storage is via the instructions
> >> for getting and setting tags.
> >
> > Hmm OK;   In postcopy, at the moment, the call qemu uses is a call that
> > atomically places a page of data in memory and then tells the vCPUs to
> > continue.  I guess a variant that took an extra blob of MTE data would
> > do.
> 
> Yes, the current idea is to extend UFFDIO_COPY with a flag so that we
> get the tag data along with the page.
> 
> > Note that other VMMs built on kvm work in different ways; the other
> > common way is to write into the backing file (i.e. the /dev/shm
> > whatever atomically somehow) and then do the userfault call to tell the
> > vcpus to continue.  It looks like this is the way things will work in
> > the split hugepage mechanism Google are currently adding.
> 
> Hmm... I had the impression that other VMMs had not cared about this
> particular use case yet; if they need a slightly different mechanism,
> it would complicate things a bit.

I think Google's internal VMM doesn't use UFFDIO_COPY - but I don't have
details to be sure of that.

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-07-11 15:37 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-07-07 16:16 [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Cornelia Huck
2022-07-07 16:16 ` [PATCH RFC v2 1/2] arm/kvm: add support for MTE Cornelia Huck
2022-07-07 16:16 ` [PATCH RFC v2 2/2] qtests/arm: add some mte tests Cornelia Huck
2022-07-09  2:59 ` [PATCH RFC v2 0/2] arm: enable MTE for QEMU + kvm Richard Henderson
2022-07-11 13:24 ` Dr. David Alan Gilbert
2022-07-11 13:39   ` Peter Maydell
2022-07-11 14:26     ` Dr. David Alan Gilbert
2022-07-11 14:56       ` Cornelia Huck
2022-07-11 15:30         ` Dr. David Alan Gilbert
2022-07-11 13:55   ` Dr. David Alan Gilbert
2022-07-11 15:08   ` Cornelia Huck
2022-07-11 15:28     ` Dr. David Alan Gilbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).