* [PATCH v7 0/1] arm: enable MTE for QEMU + kvm @ 2023-04-28 9:55 Cornelia Huck 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Cornelia Huck @ 2023-04-28 9:55 UTC (permalink / raw) To: Peter Maydell, Paolo Bonzini Cc: qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani, Cornelia Huck v7 takes a different approach to wiring up MTE, so I still include a cover letter where I can explain things better, even though it is now only a single patch :) Previous versions used a cpu property to control MTE enablement, while keeping the same semantics for the virt machine "mte" property as used with tcg. This version now uses the machine property for kvm as well; while it continues to control allocation of tag memory for tcg, it now also controls enablement of the kvm capability. Since the cpu prop is now gone, so is the testing via the arm cpu props test; I don't have a good idea for testing mte on kvm automatically, since it has a hard dependency on host support. (Should I tack some futher documentation somewhere? I'm not seeing an obvious place.) A kvm guest with mte enabled still cannot be migrated (we need some uffd interface enhancements before we can support postcopy), so it is still off per default. Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest whatever the host supports. Without migration support, this is not too much of a problem yet, but for compatibility handling, we'll need a way to keep QEMU from handing out mte3 for guests that might be migrated to a mte3-less host. We could tack this unto the mte property (specifying the version or max supported), or we could handle this via cpu properties if we go with handling compatibility via cpu models (sorting this out for kvm is probably going to be interesting in general.) In any case, I think we'll need a way to inform kvm of it. Tested with kvm selftests on FVP (as I still don't have any hardware with MTE...) and some make check(-tcg) incantations. Hopefully, I haven't (re)- introduced too many issues. Cornelia Huck (1): arm/kvm: add support for MTE hw/arm/virt.c | 69 +++++++++++++++++++++++++------------------- target/arm/cpu.c | 9 +++--- target/arm/cpu.h | 4 +++ target/arm/kvm.c | 35 ++++++++++++++++++++++ target/arm/kvm64.c | 5 ++++ target/arm/kvm_arm.h | 19 ++++++++++++ 6 files changed, 107 insertions(+), 34 deletions(-) -- 2.40.0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v7 1/1] arm/kvm: add support for MTE 2023-04-28 9:55 [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Cornelia Huck @ 2023-04-28 9:55 ` Cornelia Huck 2023-04-28 17:50 ` Juan Quintela ` (2 more replies) 2023-05-18 10:09 ` [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Peter Maydell 2023-05-22 12:04 ` Cornelia Huck 2 siblings, 3 replies; 15+ messages in thread From: Cornelia Huck @ 2023-04-28 9:55 UTC (permalink / raw) To: Peter Maydell, Paolo Bonzini Cc: qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani, Cornelia Huck Extend the 'mte' property for the virt machine to cover KVM as well. For KVM, we don't allocate tag memory, but instead enable the capability. If MTE has been enabled, we need to disable migration, as we do not yet have a way to migrate the tags as well. Therefore, MTE will stay off with KVM unless requested explicitly. Signed-off-by: Cornelia Huck <cohuck@redhat.com> --- hw/arm/virt.c | 69 +++++++++++++++++++++++++------------------- target/arm/cpu.c | 9 +++--- target/arm/cpu.h | 4 +++ target/arm/kvm.c | 35 ++++++++++++++++++++++ target/arm/kvm64.c | 5 ++++ target/arm/kvm_arm.h | 19 ++++++++++++ 6 files changed, 107 insertions(+), 34 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index a89d699f0b76..544a6c5bec8f 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2146,7 +2146,7 @@ static void machvirt_init(MachineState *machine) exit(1); } - if (vms->mte && (kvm_enabled() || hvf_enabled())) { + if (vms->mte && hvf_enabled()) { error_report("mach-virt: %s does not support providing " "MTE to the guest CPU", current_accel_name()); @@ -2216,39 +2216,48 @@ static void machvirt_init(MachineState *machine) } if (vms->mte) { - /* Create the memory region only once, but link to all cpus. */ - if (!tag_sysmem) { - /* - * The property exists only if MemTag is supported. - * If it is, we must allocate the ram to back that up. - */ - if (!object_property_find(cpuobj, "tag-memory")) { - error_report("MTE requested, but not supported " - "by the guest CPU"); - exit(1); + if (tcg_enabled()) { + /* Create the memory region only once, but link to all cpus. */ + if (!tag_sysmem) { + /* + * The property exists only if MemTag is supported. + * If it is, we must allocate the ram to back that up. + */ + if (!object_property_find(cpuobj, "tag-memory")) { + error_report("MTE requested, but not supported " + "by the guest CPU"); + exit(1); + } + + tag_sysmem = g_new(MemoryRegion, 1); + memory_region_init(tag_sysmem, OBJECT(machine), + "tag-memory", UINT64_MAX / 32); + + if (vms->secure) { + secure_tag_sysmem = g_new(MemoryRegion, 1); + memory_region_init(secure_tag_sysmem, OBJECT(machine), + "secure-tag-memory", + UINT64_MAX / 32); + + /* As with ram, secure-tag takes precedence over tag. */ + memory_region_add_subregion_overlap(secure_tag_sysmem, + 0, tag_sysmem, -1); + } } - tag_sysmem = g_new(MemoryRegion, 1); - memory_region_init(tag_sysmem, OBJECT(machine), - "tag-memory", UINT64_MAX / 32); - + object_property_set_link(cpuobj, "tag-memory", + OBJECT(tag_sysmem), &error_abort); if (vms->secure) { - secure_tag_sysmem = g_new(MemoryRegion, 1); - memory_region_init(secure_tag_sysmem, OBJECT(machine), - "secure-tag-memory", UINT64_MAX / 32); - - /* As with ram, secure-tag takes precedence over tag. */ - memory_region_add_subregion_overlap(secure_tag_sysmem, 0, - tag_sysmem, -1); + object_property_set_link(cpuobj, "secure-tag-memory", + OBJECT(secure_tag_sysmem), + &error_abort); } - } - - object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem), - &error_abort); - if (vms->secure) { - object_property_set_link(cpuobj, "secure-tag-memory", - OBJECT(secure_tag_sysmem), - &error_abort); + } else if (kvm_enabled()) { + if (!kvm_arm_mte_supported()) { + error_report("MTE requested, but not supported by KVM"); + exit(1); + } + kvm_arm_enable_mte(cpuobj, &error_abort); } } diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 5182ed0c9113..f6a88e52ac21 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1480,6 +1480,7 @@ void arm_cpu_post_init(Object *obj) qdev_prop_allow_set_link_before_realize, OBJ_PROP_LINK_STRONG); } + cpu->has_mte = true; } #endif } @@ -1616,7 +1617,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp) } if (cpu->tag_memory) { error_setg(errp, - "Cannot enable %s when guest CPUs has MTE enabled", + "Cannot enable %s when guest CPUs has tag memory enabled", current_accel_name()); return; } @@ -1996,10 +1997,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp) } #ifndef CONFIG_USER_ONLY - if (cpu->tag_memory == NULL && cpu_isar_feature(aa64_mte, cpu)) { + if (!cpu->has_mte && cpu_isar_feature(aa64_mte, cpu)) { /* - * Disable the MTE feature bits if we do not have tag-memory - * provided by the machine. + * Disable the MTE feature bits if we do not have the feature + * setup by the machine. */ cpu->isar.id_aa64pfr1 = FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0); diff --git a/target/arm/cpu.h b/target/arm/cpu.h index d469a2637b32..c3463e39bcdf 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -935,6 +935,9 @@ struct ArchCPU { */ uint32_t psci_conduit; + /* CPU has Memory Tag Extension */ + bool has_mte; + /* For v8M, initial value of the Secure VTOR */ uint32_t init_svtor; /* For v8M, initial value of the Non-secure VTOR */ @@ -1053,6 +1056,7 @@ struct ArchCPU { bool prop_pauth; bool prop_pauth_impdef; bool prop_lpa2; + OnOffAuto prop_mte; /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */ uint32_t dcz_blocksize; diff --git a/target/arm/kvm.c b/target/arm/kvm.c index 84da49332c4b..9553488ecd13 100644 --- a/target/arm/kvm.c +++ b/target/arm/kvm.c @@ -31,6 +31,7 @@ #include "hw/boards.h" #include "hw/irq.h" #include "qemu/log.h" +#include "migration/blocker.h" const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO @@ -1064,3 +1065,37 @@ bool kvm_arch_cpu_check_are_resettable(void) void kvm_arch_accel_class_init(ObjectClass *oc) { } + +void kvm_arm_enable_mte(Object *cpuobj, Error **errp) +{ + static bool tried_to_enable; + static bool succeeded_to_enable; + Error *mte_migration_blocker = NULL; + int ret; + + if (!tried_to_enable) { + /* + * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make + * sense), and we only want a single migration blocker as well. + */ + tried_to_enable = true; + + ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0); + if (ret) { + error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE"); + return; + } + + /* TODO: add proper migration support with MTE enabled */ + error_setg(&mte_migration_blocker, + "Live migration disabled due to MTE enabled"); + if (migrate_add_blocker(mte_migration_blocker, errp)) { + error_free(mte_migration_blocker); + return; + } + succeeded_to_enable = true; + } + if (succeeded_to_enable) { + object_property_set_bool(cpuobj, "has_mte", true, NULL); + } +} diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index 810db33ccbd6..1893f3879363 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -756,6 +756,11 @@ bool kvm_arm_steal_time_supported(void) return kvm_check_extension(kvm_state, KVM_CAP_STEAL_TIME); } +bool kvm_arm_mte_supported(void) +{ + return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE); +} + QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1); uint32_t kvm_arm_sve_get_vls(CPUState *cs) diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h index 330fbe5c7220..2083547bf604 100644 --- a/target/arm/kvm_arm.h +++ b/target/arm/kvm_arm.h @@ -313,6 +313,13 @@ bool kvm_arm_pmu_supported(void); */ bool kvm_arm_sve_supported(void); +/** + * kvm_arm_mte_supported: + * + * Returns: true if KVM can enable MTE, and false otherwise. + */ +bool kvm_arm_mte_supported(void); + /** * kvm_arm_get_max_vm_ipa_size: * @ms: Machine state handle @@ -377,6 +384,8 @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa); int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level); +void kvm_arm_enable_mte(Object *cpuobj, Error **errp); + #else /* @@ -403,6 +412,11 @@ static inline bool kvm_arm_steal_time_supported(void) return false; } +static inline bool kvm_arm_mte_supported(void) +{ + return false; +} + /* * These functions should never actually be called without KVM support. */ @@ -451,6 +465,11 @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs) g_assert_not_reached(); } +static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp) +{ + g_assert_not_reached(); +} + #endif static inline const char *gic_class_name(void) -- 2.40.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck @ 2023-04-28 17:50 ` Juan Quintela 2023-05-01 9:37 ` Richard Henderson 2023-05-16 13:48 ` Peter Maydell 2023-05-16 20:31 ` Richard Henderson 2 siblings, 1 reply; 15+ messages in thread From: Juan Quintela @ 2023-04-28 17:50 UTC (permalink / raw) To: Cornelia Huck Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani Cornelia Huck <cohuck@redhat.com> wrote: > Extend the 'mte' property for the virt machine to cover KVM as > well. For KVM, we don't allocate tag memory, but instead enable the > capability. > > If MTE has been enabled, we need to disable migration, And I was wondering why I was cc'd in a patch that talks about arm, cpus and architectures O:-) > as we do not > yet have a way to migrate the tags as well. Therefore, MTE will stay > off with KVM unless requested explicitly. > > Signed-off-by: Cornelia Huck <cohuck@redhat.com> > --- > hw/arm/virt.c | 69 +++++++++++++++++++++++++------------------- > target/arm/cpu.c | 9 +++--- > target/arm/cpu.h | 4 +++ > target/arm/kvm.c | 35 ++++++++++++++++++++++ > target/arm/kvm64.c | 5 ++++ > target/arm/kvm_arm.h | 19 ++++++++++++ > 6 files changed, 107 insertions(+), 34 deletions(-) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > index a89d699f0b76..544a6c5bec8f 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -2146,7 +2146,7 @@ static void machvirt_init(MachineState *machine) > exit(1); > } > > - if (vms->mte && (kvm_enabled() || hvf_enabled())) { > + if (vms->mte && hvf_enabled()) { > error_report("mach-virt: %s does not support providing " > "MTE to the guest CPU", > current_accel_name()); > @@ -2216,39 +2216,48 @@ static void machvirt_init(MachineState *machine) > } > > if (vms->mte) { > - /* Create the memory region only once, but link to all cpus. */ > - if (!tag_sysmem) { > - /* > - * The property exists only if MemTag is supported. > - * If it is, we must allocate the ram to back that up. > - */ > - if (!object_property_find(cpuobj, "tag-memory")) { > - error_report("MTE requested, but not supported " > - "by the guest CPU"); > - exit(1); > + if (tcg_enabled()) { > + /* Create the memory region only once, but link to all cpus. */ > + if (!tag_sysmem) { > + /* > + * The property exists only if MemTag is supported. > + * If it is, we must allocate the ram to back that up. > + */ > + if (!object_property_find(cpuobj, "tag-memory")) { > + error_report("MTE requested, but not supported " > + "by the guest CPU"); > + exit(1); > + } > + > + tag_sysmem = g_new(MemoryRegion, 1); > + memory_region_init(tag_sysmem, OBJECT(machine), > + "tag-memory", UINT64_MAX / 32); > + > + if (vms->secure) { > + secure_tag_sysmem = g_new(MemoryRegion, 1); > + memory_region_init(secure_tag_sysmem, OBJECT(machine), > + "secure-tag-memory", > + UINT64_MAX / 32); > + > + /* As with ram, secure-tag takes precedence over tag. */ > + memory_region_add_subregion_overlap(secure_tag_sysmem, > + 0, tag_sysmem, -1); > + } > } Pardon my ignorance here, but to try to help with migration. How is this mte tag stored? - 1 array of 8bits per page of memory - 1 array of 64bits per page of memory - whatever Lets asume that it is 1 byte per page. For the explanation it don't matter, only matters that it is an array of things that are one for each page. What I arrived for migration the 1st time that I looked at this problem is that you can "abuse" multifd and call it a day. In multifd propper you just send in each page: - 1 array of page addresses - 1 array of pages that correspond to the previous addresses So my suggestion is just to send another array: - 1 array of page addresses - 1 array of page tags that correspond to the previous one - 1 array of pages that correspond to the previous addresses You put compatiblity marks here and there checking that you are using mte (and the same version) in both sides and you call that a day. Notice that this requires the series (still not upstream but already on the list) that move the zero page detection to the multifd thread, because I am assuming that zero pages also have tags (yes, it was not a very impressive guess). What do you think? Does this work for you? What I would need for kvm/tcg would be some way of doing: - get_the_mte_tag_of_page(page_id) - set_the_mte_tag_of_page(page_id) Now you need to tell me if I should do this for each page, or use some kind of scatter-gather function that allows me to receive the mte tags from an array of pages. You could pass this information when we are searching for dirty pages, but it is going to be complicated doing that (basically we only pass the dirty page id, nothing else). Doing this in normal precopy can also be done, but it would be an exercise in masochism. Another question, if you are using MTE, all pages have MTE, right? Or there are other exceptions? Sorry for my ignorance on this matter. Later, Juan. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-04-28 17:50 ` Juan Quintela @ 2023-05-01 9:37 ` Richard Henderson 2023-05-02 9:03 ` Cornelia Huck 0 siblings, 1 reply; 15+ messages in thread From: Richard Henderson @ 2023-05-01 9:37 UTC (permalink / raw) To: quintela, Cornelia Huck Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On 4/28/23 18:50, Juan Quintela wrote: > Pardon my ignorance here, but to try to help with migration. How is > this mte tag stored? > - 1 array of 8bits per page of memory > - 1 array of 64bits per page of memory > - whatever > > Lets asume that it is 1 byte per page. For the explanation it don't > matter, only matters that it is an array of things that are one for each > page. Not that it matters, as you say, but for concreteness, 1 4-bit tag per 16 bytes, packed, so 128 bytes per 4k page. > So my suggestion is just to send another array: > > - 1 array of page addresses > - 1 array of page tags that correspond to the previous one > - 1 array of pages that correspond to the previous addresses > > You put compatiblity marks here and there checking that you are using > mte (and the same version) in both sides and you call that a day. Sounds reasonable. > Notice that this requires the series (still not upstream but already on > the list) that move the zero page detection to the multifd thread, > because I am assuming that zero pages also have tags (yes, it was not a > very impressive guess). Correct. "Proper" zero detection would include checking the tags as well. Zero tags are what you get from the kernel on a new allocation. > Now you need to tell me if I should do this for each page, or use some > kind of scatter-gather function that allows me to receive the mte tags > from an array of pages. That is going to depend on if KVM exposes an interface with which to bulk-set tags (STGM, "store tag multiple", is only available to kernel mode for some reason), a-la arch/arm64/mm/copypage.c (which copies the page data then the page tags separately). For the moment, KVM believes that memcpy from userspace is sufficient, which means we'll want a custom memcpy using STGP to store 16 bytes along with its tag. > You could pass this information when we are searching for dirty pages, > but it is going to be complicated doing that (basically we only pass the > dirty page id, nothing else). A page can be dirtied by changing nothing but a tag. So we cannot of course send tags "early", they must come with the data. I'm not 100% sure I understood your question here. > Another question, if you are using MTE, all pages have MTE, right? > Or there are other exceptions? No such systems are built yet, so we won't know what corner cases the host system will have to cope with, but I believe as written so far all pages must have tags when MTE is enabled by KVM. r~ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-05-01 9:37 ` Richard Henderson @ 2023-05-02 9:03 ` Cornelia Huck 2023-05-02 10:17 ` Juan Quintela 2023-05-02 10:58 ` Richard Henderson 0 siblings, 2 replies; 15+ messages in thread From: Cornelia Huck @ 2023-05-02 9:03 UTC (permalink / raw) To: Richard Henderson, quintela Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On Mon, May 01 2023, Richard Henderson <richard.henderson@linaro.org> wrote: > On 4/28/23 18:50, Juan Quintela wrote: >> Pardon my ignorance here, but to try to help with migration. How is >> this mte tag stored? >> - 1 array of 8bits per page of memory >> - 1 array of 64bits per page of memory >> - whatever >> >> Lets asume that it is 1 byte per page. For the explanation it don't >> matter, only matters that it is an array of things that are one for each >> page. > > Not that it matters, as you say, but for concreteness, 1 4-bit tag per 16 bytes, packed, > so 128 bytes per 4k page. > >> So my suggestion is just to send another array: >> >> - 1 array of page addresses >> - 1 array of page tags that correspond to the previous one >> - 1 array of pages that correspond to the previous addresses >> >> You put compatiblity marks here and there checking that you are using >> mte (and the same version) in both sides and you call that a day. > > Sounds reasonable. Yes, something like that sounds reasonable as an interface. > >> Notice that this requires the series (still not upstream but already on >> the list) that move the zero page detection to the multifd thread, >> because I am assuming that zero pages also have tags (yes, it was not a >> very impressive guess). > > Correct. "Proper" zero detection would include checking the tags as well. > Zero tags are what you get from the kernel on a new allocation. > >> Now you need to tell me if I should do this for each page, or use some >> kind of scatter-gather function that allows me to receive the mte tags >> from an array of pages. > > That is going to depend on if KVM exposes an interface with which to bulk-set tags (STGM, > "store tag multiple", is only available to kernel mode for some reason), a-la > arch/arm64/mm/copypage.c (which copies the page data then the page tags separately). > > For the moment, KVM believes that memcpy from userspace is sufficient, which means we'll > want a custom memcpy using STGP to store 16 bytes along with its tag. > >> You could pass this information when we are searching for dirty pages, >> but it is going to be complicated doing that (basically we only pass the >> dirty page id, nothing else). > > A page can be dirtied by changing nothing but a tag. > So we cannot of course send tags "early", they must come with the data. > I'm not 100% sure I understood your question here. Last time MTE migration came up, we thought that we would need to go with an uffd extension (page + extra data) to handle the postcopy case properly (i.e. use some kind of array for precopy, and that interface extension for postcopy.) TBH, I'm not sure if multifd makes any difference here. > >> Another question, if you are using MTE, all pages have MTE, right? >> Or there are other exceptions? > > No such systems are built yet, so we won't know what corner cases the host system will > have to cope with, but I believe as written so far all pages must have tags when MTE is > enabled by KVM. Has anyone been able to access a real system with MTE? (All the systems where I had hoped that MTE would be available didn't have MTE in the end so far, so I'd be interested to hear if anybody else already got to play with one.) Honestly, I don't want to even try to test migration if I only have access to MTE on the FVP... ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-05-02 9:03 ` Cornelia Huck @ 2023-05-02 10:17 ` Juan Quintela 2023-05-02 10:58 ` Richard Henderson 1 sibling, 0 replies; 15+ messages in thread From: Juan Quintela @ 2023-05-02 10:17 UTC (permalink / raw) To: Cornelia Huck Cc: Richard Henderson, Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani Cornelia Huck <cohuck@redhat.com> wrote: > On Mon, May 01 2023, Richard Henderson <richard.henderson@linaro.org> wrote: > >> On 4/28/23 18:50, Juan Quintela wrote: >>> Pardon my ignorance here, but to try to help with migration. How is >>> this mte tag stored? >>> - 1 array of 8bits per page of memory >>> - 1 array of 64bits per page of memory >>> - whatever >>> >>> Lets asume that it is 1 byte per page. For the explanation it don't >>> matter, only matters that it is an array of things that are one for each >>> page. >> >> Not that it matters, as you say, but for concreteness, 1 4-bit tag per 16 bytes, packed, >> so 128 bytes per 4k page. >> >>> So my suggestion is just to send another array: >>> >>> - 1 array of page addresses >>> - 1 array of page tags that correspond to the previous one >>> - 1 array of pages that correspond to the previous addresses >>> >>> You put compatiblity marks here and there checking that you are using >>> mte (and the same version) in both sides and you call that a day. >> >> Sounds reasonable. > > Yes, something like that sounds reasonable as an interface. Ok. >>> Notice that this requires the series (still not upstream but already on >>> the list) that move the zero page detection to the multifd thread, >>> because I am assuming that zero pages also have tags (yes, it was not a >>> very impressive guess). >> >> Correct. "Proper" zero detection would include checking the tags as well. >> Zero tags are what you get from the kernel on a new allocation. That was a different thing. On precopy, we handle zero page one way and non_zero page other way. On upstream multifd, we detect and send zero page on the migration thread, and everything else on the migration threads. With my patches (on list) it send both zero and non-zero pages through multifd. My proposal will be that we send the 128bytes/page for every page, included zero pages. Here I mean zero page a page that is full of zeros, independently of the tag. >> A page can be dirtied by changing nothing but a tag. I hope/expect that the dirty bitmap reflects that. >> So we cannot of course send tags "early", they must come with the data. >> I'm not 100% sure I understood your question here. > > Last time MTE migration came up, we thought that we would need to go > with an uffd extension (page + extra data) to handle the postcopy case > properly (i.e. use some kind of array for precopy, and that interface > extension for postcopy.) TBH, I'm not sure if multifd makes any > difference here. uffd is a completely different beast, and I agree with you. I was meaning here the precopy/multifd case. >>> Another question, if you are using MTE, all pages have MTE, right? >>> Or there are other exceptions? >> >> No such systems are built yet, so we won't know what corner cases the host system will >> have to cope with, but I believe as written so far all pages must have tags when MTE is >> enabled by KVM. > > Has anyone been able to access a real system with MTE? (All the systems > where I had hoped that MTE would be available didn't have MTE in the end > so far, so I'd be interested to hear if anybody else already got to play > with one.) Honestly, I don't want to even try to test migration if I only > have access to MTE on the FVP... So here we are. I have seen the future. And it is very blurred. O:-) Later, Juan. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-05-02 9:03 ` Cornelia Huck 2023-05-02 10:17 ` Juan Quintela @ 2023-05-02 10:58 ` Richard Henderson 2023-05-04 15:01 ` Cornelia Huck 1 sibling, 1 reply; 15+ messages in thread From: Richard Henderson @ 2023-05-02 10:58 UTC (permalink / raw) To: Cornelia Huck, quintela Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On 5/2/23 10:03, Cornelia Huck wrote: > Has anyone been able to access a real system with MTE? (All the systems > where I had hoped that MTE would be available didn't have MTE in the end > so far, so I'd be interested to hear if anybody else already got to play > with one.) Honestly, I don't want to even try to test migration if I only > have access to MTE on the FVP... Well there's always MTE on QEMU with TCG. :-) But I agree that while it's better than FVP, it's still slow, and difficult to test anything at scale. I have no objection to getting non-migratable MTE on KVM in before attempting to solve migration. r~ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-05-02 10:58 ` Richard Henderson @ 2023-05-04 15:01 ` Cornelia Huck 2023-05-05 15:14 ` Richard Henderson 0 siblings, 1 reply; 15+ messages in thread From: Cornelia Huck @ 2023-05-04 15:01 UTC (permalink / raw) To: Richard Henderson, quintela Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On Tue, May 02 2023, Richard Henderson <richard.henderson@linaro.org> wrote: > On 5/2/23 10:03, Cornelia Huck wrote: >> Has anyone been able to access a real system with MTE? (All the systems >> where I had hoped that MTE would be available didn't have MTE in the end >> so far, so I'd be interested to hear if anybody else already got to play >> with one.) Honestly, I don't want to even try to test migration if I only >> have access to MTE on the FVP... > > Well there's always MTE on QEMU with TCG. :-) Which actually worked very nicely to verify my test setup :) > > But I agree that while it's better than FVP, it's still slow, and difficult to test > anything at scale. I have no objection to getting non-migratable MTE on KVM in before > attempting to solve migration. I'm wondering whether we should block migration with MTE enabled in general... OTOH, people probably don't commonly try to migrate with tcg, unless they are testing something? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-05-04 15:01 ` Cornelia Huck @ 2023-05-05 15:14 ` Richard Henderson 0 siblings, 0 replies; 15+ messages in thread From: Richard Henderson @ 2023-05-05 15:14 UTC (permalink / raw) To: Cornelia Huck, quintela Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On 5/4/23 16:01, Cornelia Huck wrote: > I'm wondering whether we should block migration with MTE enabled in > general... OTOH, people probably don't commonly try to migrate with tcg, > unless they are testing something? Yes, savevm/loadvm is an extremely useful tcg debugging tool. r~ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck 2023-04-28 17:50 ` Juan Quintela @ 2023-05-16 13:48 ` Peter Maydell 2023-05-16 20:31 ` Richard Henderson 2 siblings, 0 replies; 15+ messages in thread From: Peter Maydell @ 2023-05-16 13:48 UTC (permalink / raw) To: Cornelia Huck Cc: Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani On Fri, 28 Apr 2023 at 10:55, Cornelia Huck <cohuck@redhat.com> wrote: > > Extend the 'mte' property for the virt machine to cover KVM as > well. For KVM, we don't allocate tag memory, but instead enable the > capability. > > If MTE has been enabled, we need to disable migration, as we do not > yet have a way to migrate the tags as well. Therefore, MTE will stay > off with KVM unless requested explicitly. > > Signed-off-by: Cornelia Huck <cohuck@redhat.com> So this looks OK to me -- we don't handle migration but we do install a blocker so we can deal with that later. Richard, is there anything else here that means it needs more work? I think I'm OK giving it Reviewed-by: Peter Maydell <peter.maydell@linaro.org> and will pick it up for target-arm.next later this week unless anybody raises any issues. thanks -- PMM ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/1] arm/kvm: add support for MTE 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck 2023-04-28 17:50 ` Juan Quintela 2023-05-16 13:48 ` Peter Maydell @ 2023-05-16 20:31 ` Richard Henderson 2 siblings, 0 replies; 15+ messages in thread From: Richard Henderson @ 2023-05-16 20:31 UTC (permalink / raw) To: Cornelia Huck, Peter Maydell, Paolo Bonzini Cc: qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Andrea Bolognani On 4/28/23 02:55, Cornelia Huck wrote: > Extend the 'mte' property for the virt machine to cover KVM as > well. For KVM, we don't allocate tag memory, but instead enable the > capability. > > If MTE has been enabled, we need to disable migration, as we do not > yet have a way to migrate the tags as well. Therefore, MTE will stay > off with KVM unless requested explicitly. > > Signed-off-by: Cornelia Huck<cohuck@redhat.com> > --- > hw/arm/virt.c | 69 +++++++++++++++++++++++++------------------- > target/arm/cpu.c | 9 +++--- > target/arm/cpu.h | 4 +++ > target/arm/kvm.c | 35 ++++++++++++++++++++++ > target/arm/kvm64.c | 5 ++++ > target/arm/kvm_arm.h | 19 ++++++++++++ > 6 files changed, 107 insertions(+), 34 deletions(-) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 0/1] arm: enable MTE for QEMU + kvm 2023-04-28 9:55 [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Cornelia Huck 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck @ 2023-05-18 10:09 ` Peter Maydell 2023-05-22 12:04 ` Cornelia Huck 2 siblings, 0 replies; 15+ messages in thread From: Peter Maydell @ 2023-05-18 10:09 UTC (permalink / raw) To: Cornelia Huck Cc: Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani On Fri, 28 Apr 2023 at 10:55, Cornelia Huck <cohuck@redhat.com> wrote: > > v7 takes a different approach to wiring up MTE, so I still include a cover > letter where I can explain things better, even though it is now only a > single patch :) Applied to target-arm.next, thanks. -- PMM ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 0/1] arm: enable MTE for QEMU + kvm 2023-04-28 9:55 [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Cornelia Huck 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck 2023-05-18 10:09 ` [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Peter Maydell @ 2023-05-22 12:04 ` Cornelia Huck 2023-05-29 10:15 ` Andrea Bolognani 2 siblings, 1 reply; 15+ messages in thread From: Cornelia Huck @ 2023-05-22 12:04 UTC (permalink / raw) To: Peter Maydell, Paolo Bonzini Cc: qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson, Andrea Bolognani On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote: > Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest > whatever the host supports. Without migration support, this is not too much > of a problem yet, but for compatibility handling, we'll need a way to keep > QEMU from handing out mte3 for guests that might be migrated to a mte3-less > host. We could tack this unto the mte property (specifying the version or > max supported), or we could handle this via cpu properties if we go with > handling compatibility via cpu models (sorting this out for kvm is probably > going to be interesting in general.) In any case, I think we'll need a way > to inform kvm of it. Before I start to figure out the initialization breakage, I think it might be worth pointing to this open issue again. As Andrea mentioned in https://listman.redhat.com/archives/libvir-list/2023-May/239926.html, libvirt wants to provide a stable guest ABI, not only in the context of migration compatibility (which we can handwave away via the migration blocker.) The part I'm mostly missing right now is how to tell KVM to not present mte3 to a guest while running on a mte3 capable host (i.e. the KVM interface for that; it's more a case of "we don't have it right now", though.) I'd expect it to be on the cpu level, rather than on the vm level, but it's not there yet; we also probably want something that's not fighting whatever tcg (or other accels) end up doing. I see several options here: - Continue to ignore mte3 and its implications for now. The big risk is that someone might end up implementing support for MTE in libvirt again, with the same stable guest ABI issues as for this version. - Add a "version" qualifier to the mte machine prop (probably with semantics similar to the gic stuff), with the default working with tcg as it does right now (i.e. defaulting to mte3). KVM would only support "no mte" or "same as host" (with no stable guest ABI guarantees) for now. I'm not sure how hairy this might get if we end up with a per-cpu configuration of mte (and other features) with kvm. - Add cpu properties for mte and mte3. I think we've been there before :) It would likely match any KVM infrastructure well, but we gain interactions with the machine property. Also, there's a lot in the whole CPU model area that need proper figuring out first... if we go that route, we won't be able to add MTE support with KVM anytime soon, I fear. The second option might be the most promising, except for potential future headaches; but a lot depends on where we want to be going with cpu models for KVM in general. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 0/1] arm: enable MTE for QEMU + kvm 2023-05-22 12:04 ` Cornelia Huck @ 2023-05-29 10:15 ` Andrea Bolognani 2023-05-29 10:27 ` Peter Maydell 0 siblings, 1 reply; 15+ messages in thread From: Andrea Bolognani @ 2023-05-29 10:15 UTC (permalink / raw) To: Cornelia Huck Cc: Peter Maydell, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson On Mon, May 22, 2023 at 02:04:28PM +0200, Cornelia Huck wrote: > On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote: > > Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest > > whatever the host supports. Without migration support, this is not too much > > of a problem yet, but for compatibility handling, we'll need a way to keep > > QEMU from handing out mte3 for guests that might be migrated to a mte3-less > > host. We could tack this unto the mte property (specifying the version or > > max supported), or we could handle this via cpu properties if we go with > > handling compatibility via cpu models (sorting this out for kvm is probably > > going to be interesting in general.) In any case, I think we'll need a way > > to inform kvm of it. > > Before I start to figure out the initialization breakage, I think it > might be worth pointing to this open issue again. As Andrea mentioned in > https://listman.redhat.com/archives/libvir-list/2023-May/239926.html, > libvirt wants to provide a stable guest ABI, not only in the context of > migration compatibility (which we can handwave away via the migration > blocker.) Yeah, in order to guarantee a stable guest ABI it's critical that libvirt can ask for a *specific* version of MTE (MTE or MTE3) and either get exactly that version, or an error on QEMU's side. > The part I'm mostly missing right now is how to tell KVM to not present > mte3 to a guest while running on a mte3 capable host (i.e. the KVM > interface for that; it's more a case of "we don't have it right now", > though.) I'd expect it to be on the cpu level, rather than on the vm > level, but it's not there yet; we also probably want something that's > not fighting whatever tcg (or other accels) end up doing. > > I see several options here: > - Continue to ignore mte3 and its implications for now. The big risk is > that someone might end up implementing support for MTE in libvirt again, > with the same stable guest ABI issues as for this version. > - Add a "version" qualifier to the mte machine prop (probably with > semantics similar to the gic stuff), with the default working with tcg > as it does right now (i.e. defaulting to mte3). KVM would only support > "no mte" or "same as host" (with no stable guest ABI guarantees) for > now. I'm not sure how hairy this might get if we end up with a per-cpu > configuration of mte (and other features) with kvm. > - Add cpu properties for mte and mte3. I think we've been there before > :) It would likely match any KVM infrastructure well, but we gain > interactions with the machine property. Also, there's a lot in the > whole CPU model area that need proper figuring out first... if we go > that route, we won't be able to add MTE support with KVM anytime soon, > I fear. > > The second option might be the most promising, except for potential > future headaches; but a lot depends on where we want to be going with > cpu models for KVM in general. What are the arguments for/against making MTE a machine type option or CPU feature flag? IIUC on real hardware you get "mte" or "mte3" listed in /proc/cpuinfo, so a CPU feature would seem a pretty natural fit to me, but I seem to recall that Peter was pushing for keeping it a machine property instead. Working off the assumption that Peter knows what he's doing :) can we do something like this? * introduce a new machine type property mte-version, which accepts either a specific version (2 for MTE and 3 for MTE3), an abstract setting (max and host) or a way to disable MTE entirely (none); * turn the existing mte machine type option into an alias with the mapping mte=off -> mte-version=none mte=on -> mte-version=max for TCG mte=on -> mte-version=host for KVM and deprecate it; * optionally introduce a new QMP command query-mte-capabilities that can be used by libvirt to figure out ahead of time which MTE versions are available for use on the current hardware. Yes, this is basically a shameless rip-off of how GIC is handled :) I'm pretty satisfied with how that works and see no reason to reinvent the wheel. Note that it's perfectly fine if the lack of KVM-level APIs results in mte-version=2 being rejected on MTE3-capable hardware for now! What's important is that you don't get a different MTE version than what you asked for. I assume that the existing KVM API for enabling MTE have good enough granularity to make this work? If not, that's going to be a problem :) -- Andrea Bolognani / Red Hat / Virtualization ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 0/1] arm: enable MTE for QEMU + kvm 2023-05-29 10:15 ` Andrea Bolognani @ 2023-05-29 10:27 ` Peter Maydell 0 siblings, 0 replies; 15+ messages in thread From: Peter Maydell @ 2023-05-29 10:27 UTC (permalink / raw) To: Andrea Bolognani Cc: Cornelia Huck, Paolo Bonzini, qemu-arm, qemu-devel, kvm, Eric Auger, Juan Quintela, Gavin Shan, Philippe Mathieu-Daudé, Richard Henderson On Mon, 29 May 2023 at 11:15, Andrea Bolognani <abologna@redhat.com> wrote: > > On Mon, May 22, 2023 at 02:04:28PM +0200, Cornelia Huck wrote: > > On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote: > > > Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest > > > whatever the host supports. Without migration support, this is not too much > > > of a problem yet, but for compatibility handling, we'll need a way to keep > > > QEMU from handing out mte3 for guests that might be migrated to a mte3-less > > > host. We could tack this unto the mte property (specifying the version or > > > max supported), or we could handle this via cpu properties if we go with > > > handling compatibility via cpu models (sorting this out for kvm is probably > > > going to be interesting in general.) In any case, I think we'll need a way > > > to inform kvm of it. > > > > Before I start to figure out the initialization breakage, I think it > > might be worth pointing to this open issue again. As Andrea mentioned in > > https://listman.redhat.com/archives/libvir-list/2023-May/239926.html, > > libvirt wants to provide a stable guest ABI, not only in the context of > > migration compatibility (which we can handwave away via the migration > > blocker.) > > Yeah, in order to guarantee a stable guest ABI it's critical that > libvirt can ask for a *specific* version of MTE (MTE or MTE3) and > either get exactly that version, or an error on QEMU's side. > > > The part I'm mostly missing right now is how to tell KVM to not present > > mte3 to a guest while running on a mte3 capable host (i.e. the KVM > > interface for that; it's more a case of "we don't have it right now", > > though.) I'd expect it to be on the cpu level, rather than on the vm > > level, but it's not there yet; we also probably want something that's > > not fighting whatever tcg (or other accels) end up doing. > > > > I see several options here: > > - Continue to ignore mte3 and its implications for now. The big risk is > > that someone might end up implementing support for MTE in libvirt again, > > with the same stable guest ABI issues as for this version. > > - Add a "version" qualifier to the mte machine prop (probably with > > semantics similar to the gic stuff), with the default working with tcg > > as it does right now (i.e. defaulting to mte3). KVM would only support > > "no mte" or "same as host" (with no stable guest ABI guarantees) for > > now. I'm not sure how hairy this might get if we end up with a per-cpu > > configuration of mte (and other features) with kvm. > > - Add cpu properties for mte and mte3. I think we've been there before > > :) It would likely match any KVM infrastructure well, but we gain > > interactions with the machine property. Also, there's a lot in the > > whole CPU model area that need proper figuring out first... if we go > > that route, we won't be able to add MTE support with KVM anytime soon, > > I fear. > > > > The second option might be the most promising, except for potential > > future headaches; but a lot depends on where we want to be going with > > cpu models for KVM in general. > > What are the arguments for/against making MTE a machine type option > or CPU feature flag? IIUC on real hardware you get "mte" or "mte3" > listed in /proc/cpuinfo, so a CPU feature would seem a pretty natural > fit to me, but I seem to recall that Peter was pushing for keeping it > a machine property instead. The arguments for a machine property are: * MTE needs not just CPU support but also system level support (on real hardware there needs to be actual tag ram somewhere out there in the system; for TCG we need to create the tag RAM in the virt board code) * it's what we're already doing for TCG, so it keeps the UI consistent between TCG and KVM * it's what we already do for things like the virtualization extensions and TrustZone emulation (which also generally need to be supported not just by the CPU but also by the board) thanks -- PMM ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-05-29 10:28 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-04-28 9:55 [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Cornelia Huck 2023-04-28 9:55 ` [PATCH v7 1/1] arm/kvm: add support for MTE Cornelia Huck 2023-04-28 17:50 ` Juan Quintela 2023-05-01 9:37 ` Richard Henderson 2023-05-02 9:03 ` Cornelia Huck 2023-05-02 10:17 ` Juan Quintela 2023-05-02 10:58 ` Richard Henderson 2023-05-04 15:01 ` Cornelia Huck 2023-05-05 15:14 ` Richard Henderson 2023-05-16 13:48 ` Peter Maydell 2023-05-16 20:31 ` Richard Henderson 2023-05-18 10:09 ` [PATCH v7 0/1] arm: enable MTE for QEMU + kvm Peter Maydell 2023-05-22 12:04 ` Cornelia Huck 2023-05-29 10:15 ` Andrea Bolognani 2023-05-29 10:27 ` Peter Maydell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).