[PATCH v4 0/5] x86/HVM: load state checking


* [PATCH v4 0/5] x86/HVM: load state checking
@ 2023-12-18 14:37 Jan Beulich
  2023-12-18 14:39 ` [PATCH v4 1/5] x86/HVM: split restore state checking from state loading Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:37 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

With the request to convert bounding to actual refusal, and then
doing so in new hooks, the two previously separate patches now need
to be in a series, with infrastructure work done first. Clearly the
checking in other load handlers could be (and likely wants to be) moved
to separate check handlers as well - one example of doing so is newly
added in v4; the rest will want doing down the road.

1: HVM: split restore state checking from state loading
2: HVM: adjust save/restore hook registration for optional check handler
3: vPIT: check values loaded from state save record
4: vPIC: check values loaded from state save record
5: vIRQ: split PCI link load state checking from actual loading

Jan


* [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
@ 2023-12-18 14:39 ` Jan Beulich
  2023-12-19 14:36   ` Roger Pau Monné
  2023-12-18 14:40 ` [PATCH v4 2/5] x86/HVM: adjust save/restore hook registration for optional check handler Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:39 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

..., at least as reasonably feasible without making a check hook
mandatory (in particular strict vs relaxed/zero-extend length checking
can't be done early this way).

Note that only one of the two uses of "real" hvm_load() is accompanied
by a "checking" one. The other directly consumes hvm_save() output,
which ought to be well-formed. This means that while input data related
checks don't need repeating in the "load" function when already done by
the "check" one (albeit assertions to this effect may be desirable),
domain state related checks (e.g. has_xyz(d)) will be required in both
places.
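
The two-pass arrangement described above can be sketched outside of Xen
roughly as follows. This is only an illustration under stated
assumptions: struct desc, struct ctx, walk() and restore() are
hypothetical stand-ins and not the hypervisor's types or functions. The
first pass validates record framing only and rewinds the cursor; the
second pass is then expected not to fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header: typecode 0 marks the end of the stream. */
struct desc {
    uint16_t typecode;
    uint16_t instance;
    uint32_t length;
};

/* Hypothetical stand-in for a restore context: buffer plus cursor. */
struct ctx {
    const uint8_t *data;
    size_t size;
    size_t cur;
};

/*
 * Walk all records.  With real == false only the framing is validated and
 * the cursor is rewound on success; with real == true each record is
 * counted as "loaded".
 */
static int walk(struct ctx *c, bool real, unsigned int *loaded)
{
    for ( ; ; )
    {
        struct desc d;

        if ( c->size - c->cur < sizeof(d) )
            return -1; /* ran out of data before the end marker */

        /* Copy the header out to avoid unaligned accesses. */
        memcpy(&d, &c->data[c->cur], sizeof(d));

        if ( d.typecode == 0 )
        {
            if ( !real )
                c->cur = 0; /* rewind for the subsequent "real" pass */
            return 0;
        }

        if ( d.length > c->size - c->cur - sizeof(d) )
            return -1; /* payload would extend past the buffer */

        if ( real )
            ++*loaded;

        c->cur += sizeof(d) + d.length;
    }
}

/* Check first; only if that succeeds perform the state-changing pass. */
static int restore(struct ctx *c, unsigned int *loaded)
{
    int rc = walk(c, false, loaded);

    if ( rc )
        return rc;

    rc = walk(c, true, loaded);
    assert(!rc); /* the check pass already vetted the input */

    return rc;
}
```

A truncated buffer is thus rejected by the first pass, before anything
has been counted as loaded.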

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Now that this re-arranges hvm_load() anyway, wouldn't it be better to
down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
arrange for)?

Do we really need all the copying involved in use of _hvm_read_entry()
(backing hvm_load_entry())? Zero-extending loads are likely easier to
handle that way, but for strict loads all we gain is a reduced risk of
unaligned accesses (compared to simply pointing into h->data[]).
---
v4: Fold hvm_check() into hvm_load().
v2: New.

--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -379,8 +379,12 @@ long arch_do_domctl(
         if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) != 0 )
             goto sethvmcontext_out;
 
+        ret = hvm_load(d, false, &c);
+        if ( ret )
+            goto sethvmcontext_out;
+
         domain_pause(d);
-        ret = hvm_load(d, &c);
+        ret = hvm_load(d, true, &c);
         domain_unpause(d);
 
     sethvmcontext_out:
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5397,7 +5397,7 @@ int hvm_copy_context_and_params(struct d
     }
 
     c.cur = 0;
-    rc = hvm_load(dst, &c);
+    rc = hvm_load(dst, true, &c);
 
  out:
     vfree(c.data);
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -30,7 +30,8 @@ static void arch_hvm_save(struct domain
     d->arch.hvm.sync_tsc = rdtsc();
 }
 
-static int arch_hvm_load(struct domain *d, const struct hvm_save_header *hdr)
+static int arch_hvm_check(const struct domain *d,
+                          const struct hvm_save_header *hdr)
 {
     uint32_t eax, ebx, ecx, edx;
 
@@ -55,6 +56,11 @@ static int arch_hvm_load(struct domain *
                "(%#"PRIx32") and restored on another (%#"PRIx32").\n",
                d->domain_id, hdr->cpuid, eax);
 
+    return 0;
+}
+
+static void arch_hvm_load(struct domain *d, const struct hvm_save_header *hdr)
+{
     /* Restore guest's preferred TSC frequency. */
     if ( hdr->gtsc_khz )
         d->arch.tsc_khz = hdr->gtsc_khz;
@@ -66,13 +72,12 @@ static int arch_hvm_load(struct domain *
 
     /* VGA state is not saved/restored, so we nobble the cache. */
     d->arch.hvm.stdvga.cache = STDVGA_CACHE_DISABLED;
-
-    return 0;
 }
 
 /* List of handlers for various HVM save and restore types */
 static struct {
     hvm_save_handler save;
+    hvm_check_handler check;
     hvm_load_handler load;
     const char *name;
     size_t size;
@@ -88,6 +93,7 @@ void __init hvm_register_savevm(uint16_t
 {
     ASSERT(typecode <= HVM_SAVE_CODE_MAX);
     ASSERT(hvm_sr_handlers[typecode].save == NULL);
+    ASSERT(hvm_sr_handlers[typecode].check == NULL);
     ASSERT(hvm_sr_handlers[typecode].load == NULL);
     hvm_sr_handlers[typecode].save = save_state;
     hvm_sr_handlers[typecode].load = load_state;
@@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
     return 0;
 }
 
-int hvm_load(struct domain *d, hvm_domain_context_t *h)
+int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)
 {
     const struct hvm_save_header *hdr;
     struct hvm_save_descriptor *desc;
-    hvm_load_handler handler;
-    struct vcpu *v;
     int rc;
 
     if ( d->is_dying )
@@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
     if ( !hdr )
         return -ENODATA;
 
-    rc = arch_hvm_load(d, hdr);
-    if ( rc )
-        return rc;
+    rc = arch_hvm_check(d, hdr);
+    if ( real )
+    {
+        struct vcpu *v;
+
+        ASSERT(!rc);
+        arch_hvm_load(d, hdr);
 
-    /* Down all the vcpus: we only re-enable the ones that had state saved. */
-    for_each_vcpu(d, v)
-        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
-            vcpu_sleep_nosync(v);
+        /*
+         * Down all the vcpus: we only re-enable the ones that had state
+         * saved.
+         */
+        for_each_vcpu(d, v)
+            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
+                vcpu_sleep_nosync(v);
+    }
+    else if ( rc )
+        return rc;
 
     for ( ; ; )
     {
+        const char *name;
+        hvm_load_handler load;
+
         if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
         {
             /* Run out of data */
             printk(XENLOG_G_ERR
                    "HVM%d restore: save did not end with a null entry\n",
                    d->domain_id);
+            ASSERT(!real);
             return -ENODATA;
         }
 
         /* Read the typecode of the next entry  and check for the end-marker */
         desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
-        if ( desc->typecode == 0 )
+        if ( desc->typecode == HVM_SAVE_CODE(END) )
+        {
+            /* Reset cursor for hvm_load(, true, ). */
+            if ( !real )
+                h->cur = 0;
             return 0;
+        }
 
         /* Find the handler for this entry */
-        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
-             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
+        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
+             !(name = hvm_sr_handlers[desc->typecode].name) ||
+             !(load = hvm_sr_handlers[desc->typecode].load) )
         {
             printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode %u\n",
                    d->domain_id, desc->typecode);
+            ASSERT(!real);
             return -EINVAL;
         }
 
-        /* Load the entry */
-        printk(XENLOG_G_INFO "HVM%d restore: %s %"PRIu16"\n", d->domain_id,
-               hvm_sr_handlers[desc->typecode].name, desc->instance);
-        rc = handler(d, h);
+        if ( real )
+        {
+            /* Load the entry */
+            printk(XENLOG_G_INFO "HVM restore %pd: %s %"PRIu16"\n", d,
+                   name, desc->instance);
+            rc = load(d, h);
+        }
+        else
+        {
+            /* Check the entry. */
+            hvm_check_handler check = hvm_sr_handlers[desc->typecode].check;
+
+            if ( !check )
+            {
+                if ( desc->length > h->size - h->cur - sizeof(*desc) )
+                    return -ENODATA;
+                h->cur += sizeof(*desc) + desc->length;
+                rc = 0;
+            }
+            else
+                rc = check(d, h);
+        }
+
         if ( rc )
         {
-            printk(XENLOG_G_ERR "HVM%d restore: failed to load entry %u/%u rc %d\n",
-                   d->domain_id, desc->typecode, desc->instance, rc);
+            printk(XENLOG_G_ERR "HVM restore %pd: failed to %s %s:%u rc %d\n",
+                   d, real ? "load" : "check", name, desc->instance, rc);
             return rc;
         }
+
         process_pending_softirqs();
     }
 
--- a/xen/arch/x86/include/asm/hvm/save.h
+++ b/xen/arch/x86/include/asm/hvm/save.h
@@ -103,6 +103,8 @@ static inline unsigned int hvm_load_inst
  * restoring.  Both return non-zero on error. */
 typedef int (*hvm_save_handler) (struct vcpu *v,
                                  hvm_domain_context_t *h);
+typedef int (*hvm_check_handler)(const struct domain *d,
+                                 hvm_domain_context_t *h);
 typedef int (*hvm_load_handler) (struct domain *d,
                                  hvm_domain_context_t *h);
 
@@ -140,6 +142,6 @@ size_t hvm_save_size(struct domain *d);
 int hvm_save(struct domain *d, hvm_domain_context_t *h);
 int hvm_save_one(struct domain *d, unsigned int typecode, unsigned int instance,
                  XEN_GUEST_HANDLE_64(uint8) handle, uint64_t *bufsz);
-int hvm_load(struct domain *d, hvm_domain_context_t *h);
+int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h);
 
 #endif /* __XEN_HVM_SAVE_H__ */




* [PATCH v4 2/5] x86/HVM: adjust save/restore hook registration for optional check handler
  2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
  2023-12-18 14:39 ` [PATCH v4 1/5] x86/HVM: split restore state checking from state loading Jan Beulich
@ 2023-12-18 14:40 ` Jan Beulich
  2023-12-18 14:40 ` [PATCH v4 3/5] x86/vPIT: check values loaded from state save record Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:40 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Register NULL uniformly as a first step.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: New.

--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -374,7 +374,7 @@ static int cf_check vmce_load_vcpu_ctxt(
     return err ?: vmce_restore_vcpu(v, &ctxt);
 }
 
-HVM_REGISTER_SAVE_RESTORE(VMCE_VCPU, vmce_save_vcpu_ctxt,
+HVM_REGISTER_SAVE_RESTORE(VMCE_VCPU, vmce_save_vcpu_ctxt, NULL,
                           vmce_load_vcpu_ctxt, 1, HVMSR_PER_VCPU);
 #endif
 
--- a/xen/arch/x86/emul-i8254.c
+++ b/xen/arch/x86/emul-i8254.c
@@ -458,7 +458,7 @@ static int cf_check pit_load(struct doma
     return rc;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PIT, pit_save, pit_load, 1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(PIT, pit_save, NULL, pit_load, 1, HVMSR_PER_DOM);
 #endif
 
 /* The intercept action for PIT DM retval: 0--not handled; 1--handled. */
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -692,7 +692,7 @@ static int cf_check hpet_load(struct dom
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(HPET, hpet_save, hpet_load, 1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(HPET, hpet_save, NULL, hpet_load, 1, HVMSR_PER_DOM);
 
 static void hpet_set(HPETState *h)
 {
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -793,7 +793,7 @@ static int cf_check hvm_load_tsc_adjust(
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(TSC_ADJUST, hvm_save_tsc_adjust,
+HVM_REGISTER_SAVE_RESTORE(TSC_ADJUST, hvm_save_tsc_adjust, NULL,
                           hvm_load_tsc_adjust, 1, HVMSR_PER_VCPU);
 
 static int cf_check hvm_save_cpu_ctxt(struct vcpu *v, hvm_domain_context_t *h)
@@ -1189,7 +1189,7 @@ static int cf_check hvm_load_cpu_ctxt(st
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(CPU, hvm_save_cpu_ctxt, hvm_load_cpu_ctxt, 1,
+HVM_REGISTER_SAVE_RESTORE(CPU, hvm_save_cpu_ctxt, NULL, hvm_load_cpu_ctxt, 1,
                           HVMSR_PER_VCPU);
 
 #define HVM_CPU_XSAVE_SIZE(xcr0) (offsetof(struct hvm_hw_cpu_xsave, \
@@ -1538,6 +1538,7 @@ static int __init cf_check hvm_register_
     hvm_register_savevm(CPU_XSAVE_CODE,
                         "CPU_XSAVE",
                         hvm_save_cpu_xsave_states,
+                        NULL,
                         hvm_load_cpu_xsave_states,
                         HVM_CPU_XSAVE_SIZE(xfeature_mask) +
                             sizeof(struct hvm_save_descriptor),
@@ -1546,6 +1547,7 @@ static int __init cf_check hvm_register_
     hvm_register_savevm(CPU_MSR_CODE,
                         "CPU_MSR",
                         hvm_save_cpu_msrs,
+                        NULL,
                         hvm_load_cpu_msrs,
                         HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)) +
                             sizeof(struct hvm_save_descriptor),
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -784,9 +784,9 @@ static int cf_check irq_load_link(struct
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PCI_IRQ, irq_save_pci, irq_load_pci,
+HVM_REGISTER_SAVE_RESTORE(PCI_IRQ, irq_save_pci, NULL, irq_load_pci,
                           1, HVMSR_PER_DOM);
-HVM_REGISTER_SAVE_RESTORE(ISA_IRQ, irq_save_isa, irq_load_isa,
+HVM_REGISTER_SAVE_RESTORE(ISA_IRQ, irq_save_isa, NULL, irq_load_isa,
                           1, HVMSR_PER_DOM);
-HVM_REGISTER_SAVE_RESTORE(PCI_LINK, irq_save_link, irq_load_link,
+HVM_REGISTER_SAVE_RESTORE(PCI_LINK, irq_save_link, NULL, irq_load_link,
                           1, HVMSR_PER_DOM);
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -773,7 +773,7 @@ static int cf_check hvm_load_mtrr_msr(st
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, 1,
+HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, NULL, hvm_load_mtrr_msr, 1,
                           HVMSR_PER_VCPU);
 
 void memory_type_changed(struct domain *d)
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -300,7 +300,7 @@ static int cf_check acpi_load(struct dom
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PMTIMER, acpi_save, acpi_load,
+HVM_REGISTER_SAVE_RESTORE(PMTIMER, acpi_save, NULL, acpi_load,
                           1, HVMSR_PER_DOM);
 
 int pmtimer_change_ioport(struct domain *d, uint64_t version)
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -797,7 +797,7 @@ static int cf_check rtc_load(struct doma
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(RTC, rtc_save, rtc_load, 1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(RTC, rtc_save, NULL, rtc_load, 1, HVMSR_PER_DOM);
 
 void rtc_reset(struct domain *d)
 {
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -88,6 +88,7 @@ static struct {
 void __init hvm_register_savevm(uint16_t typecode,
                                 const char *name,
                                 hvm_save_handler save_state,
+                                hvm_check_handler check_state,
                                 hvm_load_handler load_state,
                                 size_t size, int kind)
 {
@@ -96,6 +97,7 @@ void __init hvm_register_savevm(uint16_t
     ASSERT(hvm_sr_handlers[typecode].check == NULL);
     ASSERT(hvm_sr_handlers[typecode].load == NULL);
     hvm_sr_handlers[typecode].save = save_state;
+    hvm_sr_handlers[typecode].check = check_state;
     hvm_sr_handlers[typecode].load = load_state;
     hvm_sr_handlers[typecode].name = name;
     hvm_sr_handlers[typecode].size = size;
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -631,7 +631,8 @@ static int cf_check ioapic_load(struct d
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(IOAPIC, ioapic_save, ioapic_load, 1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(IOAPIC, ioapic_save, NULL, ioapic_load, 1,
+                          HVMSR_PER_DOM);
 
 void vioapic_reset(struct domain *d)
 {
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -1145,7 +1145,7 @@ static int cf_check viridian_load_domain
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(VIRIDIAN_DOMAIN, viridian_save_domain_ctxt,
+HVM_REGISTER_SAVE_RESTORE(VIRIDIAN_DOMAIN, viridian_save_domain_ctxt, NULL,
                           viridian_load_domain_ctxt, 1, HVMSR_PER_DOM);
 
 static int cf_check viridian_save_vcpu_ctxt(
@@ -1188,7 +1188,7 @@ static int cf_check viridian_load_vcpu_c
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(VIRIDIAN_VCPU, viridian_save_vcpu_ctxt,
+HVM_REGISTER_SAVE_RESTORE(VIRIDIAN_VCPU, viridian_save_vcpu_ctxt, NULL,
                           viridian_load_vcpu_ctxt, 1, HVMSR_PER_VCPU);
 
 static int __init cf_check parse_viridian_version(const char *arg)
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1617,9 +1617,9 @@ static int cf_check lapic_load_regs(stru
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(LAPIC, lapic_save_hidden,
+HVM_REGISTER_SAVE_RESTORE(LAPIC, lapic_save_hidden, NULL,
                           lapic_load_hidden, 1, HVMSR_PER_VCPU);
-HVM_REGISTER_SAVE_RESTORE(LAPIC_REGS, lapic_save_regs,
+HVM_REGISTER_SAVE_RESTORE(LAPIC_REGS, lapic_save_regs, NULL,
                           lapic_load_regs, 1, HVMSR_PER_VCPU);
 
 int vlapic_init(struct vcpu *v)
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -449,7 +449,7 @@ static int cf_check vpic_load(struct dom
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PIC, vpic_save, vpic_load, 2, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(PIC, vpic_save, NULL, vpic_load, 2, HVMSR_PER_DOM);
 
 void vpic_reset(struct domain *d)
 {
--- a/xen/arch/x86/include/asm/hvm/save.h
+++ b/xen/arch/x86/include/asm/hvm/save.h
@@ -113,6 +113,7 @@ typedef int (*hvm_load_handler) (struct
 void hvm_register_savevm(uint16_t typecode,
                          const char *name, 
                          hvm_save_handler save_state,
+                         hvm_check_handler check_state,
                          hvm_load_handler load_state,
                          size_t size, int kind);
 
@@ -122,12 +123,13 @@ void hvm_register_savevm(uint16_t typeco
 
 /* Syntactic sugar around that function: specify the max number of
  * saves, and this calculates the size of buffer needed */
-#define HVM_REGISTER_SAVE_RESTORE(_x, _save, _load, _num, _k)             \
+#define HVM_REGISTER_SAVE_RESTORE(_x, _save, check, _load, _num, _k)      \
 static int __init cf_check __hvm_register_##_x##_save_and_restore(void)   \
 {                                                                         \
     hvm_register_savevm(HVM_SAVE_CODE(_x),                                \
                         #_x,                                              \
                         &_save,                                           \
+                        check,                                            \
                         &_load,                                           \
                         (_num) * (HVM_SAVE_LENGTH(_x)                     \
                                   + sizeof (struct hvm_save_descriptor)), \




* [PATCH v4 3/5] x86/vPIT: check values loaded from state save record
  2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
  2023-12-18 14:39 ` [PATCH v4 1/5] x86/HVM: split restore state checking from state loading Jan Beulich
  2023-12-18 14:40 ` [PATCH v4 2/5] x86/HVM: adjust save/restore hook registration for optional check handler Jan Beulich
@ 2023-12-18 14:40 ` Jan Beulich
  2023-12-18 14:40 ` [PATCH v4 4/5] x86/vPIC: " Jan Beulich
  2023-12-18 14:41 ` [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading Jan Beulich
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:40 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

In particular pit_latch_status() and speaker_ioport_read() perform
calculations which assume in-bounds values. Several of the state save
record fields can hold wider ranges, though. Refuse to load values which
cannot result from normal operation, except mode, the init state of
which (see also below) cannot otherwise be reached.

Note that ->gate can legitimately be zero only for channel 2;
enforce that as well.
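
As a rough standalone illustration of the kind of per-channel refusal
described above (not the patch's code): struct chan is a hypothetical,
trimmed-down stand-in for a saved channel, and check_channels() is a
made-up name. The bounds and error codes mirror those used in the patch
below.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RW_STATE_NUM 5       /* one past the highest RW_STATE_* value */
#define MAX_COUNT    0x10000 /* internal form of an architectural count of 0 */

/* Hypothetical, trimmed-down stand-in for one saved PIT channel. */
struct chan {
    uint32_t count;
    uint8_t read_state;
    uint8_t write_state;
    uint8_t gate;
};

/* Refuse values which normal operation could not have produced. */
static int check_channels(const struct chan *ch, size_t n)
{
    size_t i;

    for ( i = 0; i < n; ++i )
    {
        if ( ch[i].count > MAX_COUNT ||
             ch[i].read_state >= RW_STATE_NUM ||
             ch[i].write_state >= RW_STATE_NUM ||
             ch[i].gate > 1 )
            return -EDOM;

        /* Only channel 2 (the speaker channel) may have its gate low. */
        if ( i != 2 && !ch[i].gate )
            return -EINVAL;
    }

    return 0;
}
```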

Adjust pit_reset()'s writing of ->mode as well, to not unduly affect
the value pit_latch_status() may calculate. The chosen mode of 7 is
still one which cannot be established by writing the control word. Note
that with or without this adjustment effectively all switch() statements
using mode as the control expression aren't quite right when the PIT is
still in that init state; there is an apparent assumption that before
these can sensibly be invoked, the guest would init the PIT (i.e. in
particular set the mode).
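
For reference, the arithmetic applied to each channel's mode in the
pit_load() hunk of this patch can be isolated as below. fold_mode() is
a hypothetical helper name used only for this sketch; it does nothing
beyond masking to three bits and folding 6 and 7 onto 2 and 3.

```c
#include <stdint.h>

/* Mask to three bits, then fold 6 and 7 onto 2 and 3 respectively. */
static uint8_t fold_mode(uint8_t mode)
{
    mode &= 7;
    if ( mode > 5 )
        mode -= 4;

    return mode;
}
```

Under this arithmetic values 0 through 5 pass through unchanged, while
both 7 and a legacy 0xff come out as 3.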

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
For mode we could refuse to load values in the [0x08,0xfe] range; I'm
not certain that's going to be overly helpful.

For count I was considering clipping the saved value to 16 bits (i.e.
converting the internally used 0x10000 back to the architectural
0x0000), but pit_save() doesn't easily lend itself to such a "fixup".
If desired, that is perhaps better done as a separate change anyway.
---
v3: Slightly adjust two comments. Re-base over rename in earlier patch.
v2: Introduce separate checking function; switch to refusing to load
    bogus values. Re-base.

--- a/xen/arch/x86/emul-i8254.c
+++ b/xen/arch/x86/emul-i8254.c
@@ -47,6 +47,7 @@
 #define RW_STATE_MSB 2
 #define RW_STATE_WORD0 3
 #define RW_STATE_WORD1 4
+#define RW_STATE_NUM 5
 
 #define get_guest_time(v) \
    (is_hvm_vcpu(v) ? hvm_get_guest_time(v) : (u64)get_s_time())
@@ -427,6 +428,47 @@ static int cf_check pit_save(struct vcpu
     return rc;
 }
 
+static int cf_check pit_check(const struct domain *d, hvm_domain_context_t *h)
+{
+    const struct hvm_hw_pit *hw;
+    unsigned int i;
+
+    if ( !has_vpit(d) )
+        return -ENODEV;
+
+    hw = hvm_get_entry(PIT, h);
+    if ( !hw )
+        return -ENODATA;
+
+    /*
+     * Check to-be-loaded values are within valid range, for them to represent
+     * actually reachable state.  Uses of some of the values elsewhere assume
+     * this is the case.  Note that the channels' mode fields aren't checked;
+     * Xen prior to 4.19 might save them as 0xff.
+     */
+    if ( hw->speaker_data_on > 1 || hw->pad0 )
+        return -EDOM;
+
+    for ( i = 0; i < ARRAY_SIZE(hw->channels); ++i )
+    {
+        const struct hvm_hw_pit_channel *ch = &hw->channels[i];
+
+        if ( ch->count > 0x10000 ||
+             ch->count_latched >= RW_STATE_NUM ||
+             ch->read_state >= RW_STATE_NUM ||
+             ch->write_state >= RW_STATE_NUM ||
+             ch->rw_mode > RW_STATE_WORD0 ||
+             ch->gate > 1 ||
+             ch->bcd > 1 )
+            return -EDOM;
+
+        if ( i != 2 && !ch->gate )
+            return -EINVAL;
+    }
+
+    return 0;
+}
+
 static int cf_check pit_load(struct domain *d, hvm_domain_context_t *h)
 {
     PITState *pit = domain_vpit(d);
@@ -443,6 +485,14 @@ static int cf_check pit_load(struct doma
         goto out;
     }
     
+    for ( i = 0; i < ARRAY_SIZE(pit->hw.channels); ++i )
+    {
+        struct hvm_hw_pit_channel *ch = &pit->hw.channels[i];
+
+        if ( (ch->mode &= 7) > 5 )
+            ch->mode -= 4;
+    }
+
     /*
      * Recreate platform timers from hardware state.  There will be some 
      * time jitter here, but the wall-clock will have jumped massively, so 
@@ -458,7 +508,7 @@ static int cf_check pit_load(struct doma
     return rc;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PIT, pit_save, NULL, pit_load, 1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(PIT, pit_save, pit_check, pit_load, 1, HVMSR_PER_DOM);
 #endif
 
 /* The intercept action for PIT DM retval: 0--not handled; 1--handled. */
@@ -575,7 +625,7 @@ void pit_reset(struct domain *d)
     for ( i = 0; i < 3; i++ )
     {
         s = &pit->hw.channels[i];
-        s->mode = 0xff; /* the init mode */
+        s->mode = 7; /* unreachable sentinel */
         s->gate = (i != 2);
         pit_load_count(pit, i, 0);
     }




* [PATCH v4 4/5] x86/vPIC: check values loaded from state save record
  2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
                   ` (2 preceding siblings ...)
  2023-12-18 14:40 ` [PATCH v4 3/5] x86/vPIT: check values loaded from state save record Jan Beulich
@ 2023-12-18 14:40 ` Jan Beulich
  2023-12-18 14:41 ` [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading Jan Beulich
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:40 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Loading is_master from the state save record can lead to out-of-bounds
accesses via at least the two container_of() uses by vpic_domain() and
__vpic_lock(). Make sure the value is consistent with the instance being
loaded.

For ->int_output (which for whatever reason isn't a 1-bit bitfield),
besides bounds checking also take ->init_state into account.

For ELCR follow what vpic_intercept_elcr_io()'s write path and
vpic_reset() do, i.e. don't insist on the internal view of the value to
be saved.

Move the instance range check as well, leaving just an assertion in the
load handler.
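
The consistency rules above (leaving ELCR aside) can be sketched in
isolation as follows. This is an illustration only: struct pic_rec,
NR_VPICS and check_pic() are hypothetical names, with instance 0 assumed
to be the master and instance 1 the slave, as the is_master comparison
in the patch below implies.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VPICS 2 /* assumed: instance 0 is the master, 1 the slave */

/* Hypothetical, trimmed-down stand-in for a saved PIC record. */
struct pic_rec {
    bool is_master;
    uint8_t int_output;
    uint8_t init_state;
};

static int check_pic(unsigned int inst, const struct pic_rec *s)
{
    if ( inst >= NR_VPICS )
        return -ENOENT;

    /* int_output is wider than one bit, hence needs bounding. */
    if ( s->int_output > 1 )
        return -EDOM;

    /*
     * is_master has to match the instance being loaded, and the output
     * cannot be asserted while initialization is still in progress.
     */
    if ( s->is_master != !inst ||
         (s->int_output && s->init_state) )
        return -EINVAL;

    return 0;
}
```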

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v3: vpic_domain() fix and vpic_elcr_mask() adjustment split out. Re-base
    over rename in earlier patch.
v2: Introduce separate checking function; switch to refusing to load
    bogus values. Re-base.

--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -429,6 +429,38 @@ static int cf_check vpic_save(struct vcp
     return 0;
 }
 
+static int cf_check vpic_check(const struct domain *d, hvm_domain_context_t *h)
+{
+    unsigned int inst = hvm_load_instance(h);
+    const struct hvm_hw_vpic *s;
+
+    if ( !has_vpic(d) )
+        return -ENODEV;
+
+    /* Which PIC is this? */
+    if ( inst >= ARRAY_SIZE(d->arch.hvm.vpic) )
+        return -ENOENT;
+
+    s = hvm_get_entry(PIC, h);
+    if ( !s )
+        return -ENODATA;
+
+    /*
+     * Check to-be-loaded values are within valid range, for them to represent
+     * actually reachable state.  Uses of some of the values elsewhere assume
+     * this is the case.
+     */
+    if ( s->int_output > 1 )
+        return -EDOM;
+
+    if ( s->is_master != !inst ||
+         (s->int_output && s->init_state) ||
+         (s->elcr & ~vpic_elcr_mask(s, 1)) )
+        return -EINVAL;
+
+    return 0;
+}
+
 static int cf_check vpic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vpic *s;
@@ -438,18 +470,21 @@ static int cf_check vpic_load(struct dom
         return -ENODEV;
 
     /* Which PIC is this? */
-    if ( inst > 1 )
-        return -ENOENT;
+    ASSERT(inst < ARRAY_SIZE(d->arch.hvm.vpic));
     s = &d->arch.hvm.vpic[inst];
 
     /* Load the state */
     if ( hvm_load_entry(PIC, h, s) != 0 )
         return -EINVAL;
 
+    if ( s->is_master )
+        s->elcr |= 1 << 2;
+
     return 0;
 }
 
-HVM_REGISTER_SAVE_RESTORE(PIC, vpic_save, NULL, vpic_load, 2, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(PIC, vpic_save, vpic_check, vpic_load, 2,
+                          HVMSR_PER_DOM);
 
 void vpic_reset(struct domain *d)
 {




* [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading
  2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
                   ` (3 preceding siblings ...)
  2023-12-18 14:40 ` [PATCH v4 4/5] x86/vPIC: " Jan Beulich
@ 2023-12-18 14:41 ` Jan Beulich
  2023-12-19 14:50   ` Roger Pau Monné
  4 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-12-18 14:41 UTC
  To: xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Move the checking into a check hook, and add checking of the padding
fields as well.
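
A self-contained sketch of such a check hook's core, for illustration
only: struct link_rec is a hypothetical stand-in assumed to have four
route bytes followed by four padding bytes, and check_links() is a
made-up name. The limit of 15 is the one used by the existing sanity
check being moved.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the saved PCI link record. */
struct link_rec {
    uint8_t route[4];
    uint8_t pad0[4];
};

static int check_links(const struct link_rec *r)
{
    size_t i;

    /* Padding has to be zero, so it can be given a meaning later on. */
    for ( i = 0; i < ARRAY_SIZE(r->pad0); i++ )
        if ( r->pad0[i] )
            return -EINVAL;

    /* Each link routes to an ISA IRQ in the range 0...15. */
    for ( i = 0; i < ARRAY_SIZE(r->route); i++ )
        if ( r->route[i] > 15 )
            return -EINVAL;

    return 0;
}
```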

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: New.

--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -749,6 +749,30 @@ static int cf_check irq_load_isa(struct
     return 0;
 }
 
+static int cf_check irq_check_link(const struct domain *d,
+                                   hvm_domain_context_t *h)
+{
+    const struct hvm_hw_pci_link *pci_link = hvm_get_entry(PCI_LINK, h);
+    unsigned int link;
+
+    if ( !pci_link )
+        return -ENODATA;
+
+    for ( link = 0; link < ARRAY_SIZE(pci_link->pad0); link++ )
+        if ( pci_link->pad0[link] )
+            return -EINVAL;
+
+    for ( link = 0; link < ARRAY_SIZE(pci_link->route); link++ )
+        if ( pci_link->route[link] > 15 )
+        {
+            printk(XENLOG_G_ERR
+                   "HVM restore: PCI-ISA link %u out of range (%u)\n",
+                   link, pci_link->route[link]);
+            return -EINVAL;
+        }
+
+    return 0;
+}
 
 static int cf_check irq_load_link(struct domain *d, hvm_domain_context_t *h)
 {
@@ -759,16 +783,6 @@ static int cf_check irq_load_link(struct
     if ( hvm_load_entry(PCI_LINK, h, &hvm_irq->pci_link) != 0 )
         return -EINVAL;
 
-    /* Sanity check */
-    for ( link = 0; link < 4; link++ )
-        if ( hvm_irq->pci_link.route[link] > 15 )
-        {
-            printk(XENLOG_G_ERR
-                   "HVM restore: PCI-ISA link %u out of range (%u)\n",
-                   link, hvm_irq->pci_link.route[link]);
-            return -EINVAL;
-        }
-
     /* Adjust the GSI assert counts for the link outputs.
      * This relies on the PCI and ISA IRQ state being loaded first */
     for ( link = 0; link < 4; link++ )
@@ -788,5 +802,5 @@ HVM_REGISTER_SAVE_RESTORE(PCI_IRQ, irq_s
                           1, HVMSR_PER_DOM);
 HVM_REGISTER_SAVE_RESTORE(ISA_IRQ, irq_save_isa, NULL, irq_load_isa,
                           1, HVMSR_PER_DOM);
-HVM_REGISTER_SAVE_RESTORE(PCI_LINK, irq_save_link, NULL, irq_load_link,
-                          1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(PCI_LINK, irq_save_link, irq_check_link,
+                          irq_load_link, 1, HVMSR_PER_DOM);




* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2023-12-18 14:39 ` [PATCH v4 1/5] x86/HVM: split restore state checking from state loading Jan Beulich
@ 2023-12-19 14:36   ` Roger Pau Monné
  2023-12-19 15:24     ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2023-12-19 14:36 UTC
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
> ..., at least as reasonably feasible without making a check hook
> mandatory (in particular strict vs relaxed/zero-extend length checking
> can't be done early this way).
> 
> Note that only one of the two uses of "real" hvm_load() is accompanied
> by a "checking" one. The other directly consumes hvm_save() output,
> which ought to be well-formed. This means that while input data related
> checks don't need repeating in the "load" function when already done by
> the "check" one (albeit assertions to this effect may be desirable),
> domain state related checks (e.g. has_xyz(d)) will be required in both
> places.
> 
> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Now that this re-arranges hvm_load() anyway, wouldn't it be better to
> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
> arrange for)?

Seems OK to me.

> Do we really need all the copying involved in use of _hvm_read_entry()
> (backing hvm_load_entry())? Zero-extending loads are likely easier to
> handle that way, but for strict loads all we gain is a reduced risk of
> unaligned accesses (compared to simply pointing into h->data[]).

I do feel it's safer to copy the data so the checks are done against
what's loaded.  Albeit hvm_load() is already using hvm_get_entry().

> ---
> v4: Fold hvm_check() into hvm_load().
> v2: New.
> 
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -379,8 +379,12 @@ long arch_do_domctl(
>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) != 0 )
>              goto sethvmcontext_out;
>  
> +        ret = hvm_load(d, false, &c);
> +        if ( ret )
> +            goto sethvmcontext_out;
> +
>          domain_pause(d);
> -        ret = hvm_load(d, &c);
> +        ret = hvm_load(d, true, &c);

Now that the check has been done ahead, do we want to somehow assert
that this cannot fail?  AIUI that's the expectation.

>          domain_unpause(d);
>  
>      sethvmcontext_out:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5397,7 +5397,7 @@ int hvm_copy_context_and_params(struct d
>      }
>  
>      c.cur = 0;
> -    rc = hvm_load(dst, &c);
> +    rc = hvm_load(dst, true, &c);
>  
>   out:
>      vfree(c.data);
> --- a/xen/arch/x86/hvm/save.c
> +++ b/xen/arch/x86/hvm/save.c
> @@ -30,7 +30,8 @@ static void arch_hvm_save(struct domain
>      d->arch.hvm.sync_tsc = rdtsc();
>  }
>  
> -static int arch_hvm_load(struct domain *d, const struct hvm_save_header *hdr)
> +static int arch_hvm_check(const struct domain *d,
> +                          const struct hvm_save_header *hdr)
>  {
>      uint32_t eax, ebx, ecx, edx;
>  
> @@ -55,6 +56,11 @@ static int arch_hvm_load(struct domain *
>                 "(%#"PRIx32") and restored on another (%#"PRIx32").\n",
>                 d->domain_id, hdr->cpuid, eax);
>  
> +    return 0;
> +}
> +
> +static void arch_hvm_load(struct domain *d, const struct hvm_save_header *hdr)
> +{
>      /* Restore guest's preferred TSC frequency. */
>      if ( hdr->gtsc_khz )
>          d->arch.tsc_khz = hdr->gtsc_khz;
> @@ -66,13 +72,12 @@ static int arch_hvm_load(struct domain *
>  
>      /* VGA state is not saved/restored, so we nobble the cache. */
>      d->arch.hvm.stdvga.cache = STDVGA_CACHE_DISABLED;
> -
> -    return 0;
>  }
>  
>  /* List of handlers for various HVM save and restore types */
>  static struct {
>      hvm_save_handler save;
> +    hvm_check_handler check;
>      hvm_load_handler load;
>      const char *name;
>      size_t size;
> @@ -88,6 +93,7 @@ void __init hvm_register_savevm(uint16_t
>  {
>      ASSERT(typecode <= HVM_SAVE_CODE_MAX);
>      ASSERT(hvm_sr_handlers[typecode].save == NULL);
> +    ASSERT(hvm_sr_handlers[typecode].check == NULL);
>      ASSERT(hvm_sr_handlers[typecode].load == NULL);
>      hvm_sr_handlers[typecode].save = save_state;
>      hvm_sr_handlers[typecode].load = load_state;
> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
>      return 0;
>  }
>  
> -int hvm_load(struct domain *d, hvm_domain_context_t *h)
> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)

Maybe the 'real' parameter should instead be an enum:

enum hvm_load_action {
    CHECK,
    LOAD,
};
int hvm_load(struct domain *d, enum hvm_load_action action,
             hvm_domain_context_t *h);

Otherwise a comment might be warranted about how 'real' affects the
logic in the function.

>  {
>      const struct hvm_save_header *hdr;
>      struct hvm_save_descriptor *desc;
> -    hvm_load_handler handler;
> -    struct vcpu *v;
>      int rc;
>  
>      if ( d->is_dying )
> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
>      if ( !hdr )
>          return -ENODATA;
>  
> -    rc = arch_hvm_load(d, hdr);
> -    if ( rc )
> -        return rc;
> +    rc = arch_hvm_check(d, hdr);

Shouldn't this _check function only be called when real == false?

> +    if ( real )
> +    {
> +        struct vcpu *v;
> +
> +        ASSERT(!rc);
> +        arch_hvm_load(d, hdr);
>  
> -    /* Down all the vcpus: we only re-enable the ones that had state saved. */
> -    for_each_vcpu(d, v)
> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> -            vcpu_sleep_nosync(v);
> +        /*
> +         * Down all the vcpus: we only re-enable the ones that had state
> +         * saved.
> +         */
> +        for_each_vcpu(d, v)
> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> +                vcpu_sleep_nosync(v);
> +    }
> +    else if ( rc )
> +        return rc;
>  
>      for ( ; ; )
>      {
> +        const char *name;
> +        hvm_load_handler load;
> +
>          if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
>          {
>              /* Run out of data */
>              printk(XENLOG_G_ERR
>                     "HVM%d restore: save did not end with a null entry\n",
>                     d->domain_id);
> +            ASSERT(!real);
>              return -ENODATA;
>          }
>  
>          /* Read the typecode of the next entry  and check for the end-marker */
>          desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
> -        if ( desc->typecode == 0 )
> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
> +        {
> +            /* Reset cursor for hvm_load(, true, ). */
> +            if ( !real )
> +                h->cur = 0;
>              return 0;
> +        }
>  
>          /* Find the handler for this entry */
> -        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
> -             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
> +             !(name = hvm_sr_handlers[desc->typecode].name) ||
> +             !(load = hvm_sr_handlers[desc->typecode].load) )
>          {
>              printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode %u\n",
>                     d->domain_id, desc->typecode);

The message is not very accurate here, it does fail when the typecode
is unknown, but also fails when such typecode has no name or load
function setup.

Thanks, Roger.



* Re: [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading
  2023-12-18 14:41 ` [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading Jan Beulich
@ 2023-12-19 14:50   ` Roger Pau Monné
  0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2023-12-19 14:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On Mon, Dec 18, 2023 at 03:41:14PM +0100, Jan Beulich wrote:
> Move the checking into a check hook, and add checking of the padding
> fields as well.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.



* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2023-12-19 14:36   ` Roger Pau Monné
@ 2023-12-19 15:24     ` Jan Beulich
  2024-01-09 10:26       ` Roger Pau Monné
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-12-19 15:24 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On 19.12.2023 15:36, Roger Pau Monné wrote:
> On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
>> ..., at least as reasonably feasible without making a check hook
>> mandatory (in particular strict vs relaxed/zero-extend length checking
>> can't be done early this way).
>>
>> Note that only one of the two uses of "real" hvm_load() is accompanied
>> with a "checking" one. The other directly consumes hvm_save() output,
>> which ought to be well-formed. This means that while input data related
>> checks don't need repeating in the "load" function when already done by
>> the "check" one (albeit assertions to this effect may be desirable),
>> domain state related checks (e.g. has_xyz(d)) will be required in both
>> places.
>>
>> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Now that this re-arranges hvm_load() anyway, wouldn't it be better to
>> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
>> arrange for)?
> 
> Seems OK to me.

As is, or with the suggested adjustment, or either way?

>> Do we really need all the copying involved in use of _hvm_read_entry()
>> (backing hvm_load_entry())? Zero-extending loads are likely easier to
>> handle that way, but for strict loads all we gain is a reduced risk of
>> unaligned accesses (compared to simply pointing into h->data[]).
> 
> I do feel it's safer to copy the data so the checks are done against
> what's loaded.  Albeit hvm_load() is already using hvm_get_entry().

The comment is about individual handlers, not hvm_load() itself. In some
cases - when copying directly into guest state structures - the copying
makes sense. In other cases, where there is separate copying anyway,
things could be done with less duplication of data (see hpet_load(),
which was already converted; along those lines hvm_load() itself also
was already switched away from the original copying).

Checking in any event can't be done against what _is_ loaded (as during
checking we want to avoid altering guest state); it'll always be only
against what is going to be loaded. The difference would be whether we
check data in the incoming buffer or in a copy of that data in a local
variable on the stack. But that applies to checking done in the load
hooks only anyway; the cases with split off check handlers should never
need to do any copying.
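
For illustration only (not taken from the patch): a split-off check handler
can validate a record in place, pointing straight into the incoming buffer.
The sketch below assumes a hypothetical record layout (`struct pci_link_rec`
with four route bytes and four padding bytes) and a hypothetical
`check_link()` helper; the "route must not exceed 15" rule and the padding
check follow what patch 5/5 describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved PCI link record; layout is assumed. */
struct pci_link_rec {
    uint8_t route[4];
    uint8_t pad0[4];
};

/*
 * Check-only pass: inspects the record where it sits in the buffer.
 * Nothing is copied and no guest state is touched.
 * Returns 0 if the record is acceptable, -EINVAL otherwise.
 */
static int check_link(const void *buf, size_t len)
{
    const struct pci_link_rec *rec = buf;
    unsigned int i;

    assert(buf != NULL);

    if ( len < sizeof(*rec) )
        return -EINVAL;

    /* Each link must route to an ISA IRQ in the range 0..15. */
    for ( i = 0; i < 4; i++ )
        if ( rec->route[i] > 15 )
            return -EINVAL;

    /* Padding is expected to be zero. */
    for ( i = 0; i < sizeof(rec->pad0); i++ )
        if ( rec->pad0[i] )
            return -EINVAL;

    return 0;
}
```

Because the record consists only of bytes, reading it through a pointer into
the buffer raises no alignment concern in this particular sketch; records
with wider fields would need more care.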

>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -379,8 +379,12 @@ long arch_do_domctl(
>>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) != 0 )
>>              goto sethvmcontext_out;
>>  
>> +        ret = hvm_load(d, false, &c);
>> +        if ( ret )
>> +            goto sethvmcontext_out;
>> +
>>          domain_pause(d);
>> -        ret = hvm_load(d, &c);
>> +        ret = hvm_load(d, true, &c);
> 
> Now that the check has been done ahead, do we want to somehow assert
> that this cannot fail?  AIUI that's the expectation.

We certainly can't until all checking was moved out of the load handlers.
And even then I think there are still cases where load might produce an
error. (In fact I would have refused a little more strongly to folding
the prior hvm_check() into hvm_load() if indeed a separate hvm_load()
could have ended up returning void in the long run.)

>> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
>>      return 0;
>>  }
>>  
>> -int hvm_load(struct domain *d, hvm_domain_context_t *h)
>> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)
> 
> Maybe the 'real' parameter should instead be an enum:
> 
> enum hvm_load_action {
>     CHECK,
>     LOAD,
> };
> int hvm_load(struct domain *d, enum hvm_load_action action,
>              hvm_domain_context_t *h);

Hmm, yes, it could. I'm not a fan of enums for boolean-like things,
though.

> Otherwise a comment might be warranted about how 'real' affects the
> logic in the function.

I can certainly add a comment immediately ahead of the function:

/*
 * @real = false requests checking of the incoming state, while @real = true
 * requests actual loading, which will then assume that checking was already
 * done or is unnecessary.
 */
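
As a rough standalone model of that contract (a sketch under assumptions,
not the Xen code): `struct ctx`, `struct state`, `load()` and
`set_context()` below are hypothetical names, and the "record" is reduced to
a stream of bytes terminated by a zero end marker.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context, loosely modelled on hvm_domain_context_t. */
struct ctx {
    const uint8_t *data;
    size_t size;
    size_t cur;
};

/* Hypothetical destination state: a single byte-sized field. */
struct state {
    uint8_t value;
};

/*
 * real == false: validate only, leave *s untouched, rewind the cursor.
 * real == true:  apply the data, assuming validation already happened.
 */
static int load(struct state *s, bool real, struct ctx *c)
{
    while ( c->cur < c->size )
    {
        uint8_t v = c->data[c->cur++];

        if ( v == 0 )           /* end marker */
        {
            if ( !real )
                c->cur = 0;     /* reset for the subsequent real pass */
            return 0;
        }

        if ( v > 15 )           /* input data check */
        {
            assert(!real);      /* should have been caught when checking */
            return -EINVAL;
        }

        if ( real )
            s->value = v;
    }

    return -ENODATA;            /* ran out of data without an end marker */
}

/* Caller pattern: check first, load only if the check passed. */
static int set_context(struct state *s, struct ctx *c)
{
    int rc = load(s, false, c);

    if ( rc )
        return rc;

    return load(s, true, c);
}
```

With a boolean, the two call sites read `load(s, false, c)` and
`load(s, true, c)`; an enum as suggested above would only change how those
two arguments are spelled.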

>> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
>>      if ( !hdr )
>>          return -ENODATA;
>>  
>> -    rc = arch_hvm_load(d, hdr);
>> -    if ( rc )
>> -        return rc;
>> +    rc = arch_hvm_check(d, hdr);
> 
> Shouldn't this _check function only be called when real == false?

Possibly. In v4 I directly transformed what I had in v3:

    ASSERT(!arch_hvm_check(d, hdr));

I.e. it is now the call above plus ...

>> +    if ( real )
>> +    {
>> +        struct vcpu *v;
>> +
>> +        ASSERT(!rc);

... this assertion. Really the little brother of the call site assertion
you're asking for (see above).

>> +        arch_hvm_load(d, hdr);
>>  
>> -    /* Down all the vcpus: we only re-enable the ones that had state saved. */
>> -    for_each_vcpu(d, v)
>> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> -            vcpu_sleep_nosync(v);
>> +        /*
>> +         * Down all the vcpus: we only re-enable the ones that had state
>> +         * saved.
>> +         */
>> +        for_each_vcpu(d, v)
>> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> +                vcpu_sleep_nosync(v);
>> +    }
>> +    else if ( rc )
>> +        return rc;
>>  
>>      for ( ; ; )
>>      {
>> +        const char *name;
>> +        hvm_load_handler load;
>> +
>>          if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
>>          {
>>              /* Run out of data */
>>              printk(XENLOG_G_ERR
>>                     "HVM%d restore: save did not end with a null entry\n",
>>                     d->domain_id);
>> +            ASSERT(!real);
>>              return -ENODATA;
>>          }
>>  
>>          /* Read the typecode of the next entry  and check for the end-marker */
>>          desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
>> -        if ( desc->typecode == 0 )
>> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
>> +        {
>> +            /* Reset cursor for hvm_load(, true, ). */
>> +            if ( !real )
>> +                h->cur = 0;
>>              return 0;
>> +        }
>>  
>>          /* Find the handler for this entry */
>> -        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
>> -             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
>> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
>> +             !(name = hvm_sr_handlers[desc->typecode].name) ||
>> +             !(load = hvm_sr_handlers[desc->typecode].load) )
>>          {
>>              printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode %u\n",
>>                     d->domain_id, desc->typecode);
> 
> The message is not very accurate here, it does fail when the typecode
> is unknown, but also fails when such typecode has no name or load
> function setup.

Yes and no, and it's not changing in this patch. Are you suggesting I should
change it despite being unrelated? If so, there not being a name (which is
the new check I'm adding) still suggests the code is unknown. There not being
a load handler really indicates a bug in Xen (yet no reason to e.g. BUG() in
that case, the failed loading will hopefully be noticeable enough).
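
Purely as an illustration of the cases being discussed (not a proposal for
this patch): if one did want to tell them apart, a lookup over a handler
table could report them separately. Everything below (`handlers`,
`lookup()`, `enum lookup_result`, `load_dummy()`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*load_fn)(const void *data, size_t len);

static int load_dummy(const void *data, size_t len)
{
    (void)data;
    (void)len;
    return 0;
}

/* Hypothetical handler table, indexed by typecode. */
static const struct {
    load_fn load;
    const char *name;
} handlers[4] = {
    [1] = { load_dummy, "DUMMY" },
    [2] = { NULL, "NOLOAD" },   /* named, but no load hook: a bug */
};

enum lookup_result { LOOKUP_OK, LOOKUP_UNKNOWN, LOOKUP_NO_LOAD };

static enum lookup_result lookup(uint16_t typecode, load_fn *load)
{
    assert(load != NULL);

    /* Out of range or never registered (no name): unknown typecode. */
    if ( typecode >= ARRAY_SIZE(handlers) || !handlers[typecode].name )
        return LOOKUP_UNKNOWN;

    /* Registered but lacking a load hook: an internal inconsistency. */
    if ( !(*load = handlers[typecode].load) )
        return LOOKUP_NO_LOAD;

    return LOOKUP_OK;
}
```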

Jan



* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2023-12-19 15:24     ` Jan Beulich
@ 2024-01-09 10:26       ` Roger Pau Monné
  2024-01-09 10:58         ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2024-01-09 10:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On Tue, Dec 19, 2023 at 04:24:02PM +0100, Jan Beulich wrote:
> On 19.12.2023 15:36, Roger Pau Monné wrote:
> > On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
> >> ..., at least as reasonably feasible without making a check hook
> >> mandatory (in particular strict vs relaxed/zero-extend length checking
> >> can't be done early this way).
> >>
> >> Note that only one of the two uses of "real" hvm_load() is accompanied
> >> with a "checking" one. The other directly consumes hvm_save() output,
> >> which ought to be well-formed. This means that while input data related
> >> checks don't need repeating in the "load" function when already done by
> >> the "check" one (albeit assertions to this effect may be desirable),
> >> domain state related checks (e.g. has_xyz(d)) will be required in both
> >> places.
> >>
> >> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> Now that this re-arranges hvm_load() anyway, wouldn't it be better to
> >> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
> >> arrange for)?
> > 
> > Seems OK to me.
> 
> As is, or with the suggested adjustment, or either way?

I'm fine either way if you don't want to do it as part of this
patch.

> >> --- a/xen/arch/x86/domctl.c
> >> +++ b/xen/arch/x86/domctl.c
> >> @@ -379,8 +379,12 @@ long arch_do_domctl(
> >>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) != 0 )
> >>              goto sethvmcontext_out;
> >>  
> >> +        ret = hvm_load(d, false, &c);
> >> +        if ( ret )
> >> +            goto sethvmcontext_out;
> >> +
> >>          domain_pause(d);
> >> -        ret = hvm_load(d, &c);
> >> +        ret = hvm_load(d, true, &c);
> > 
> > Now that the check has been done ahead, do we want to somehow assert
> > that this cannot fail?  AIUI that's the expectation.
> 
> We certainly can't until all checking was moved out of the load handlers.
> And even then I think there are still cases where load might produce an
> error. (In fact I would have refused a little more strongly to folding
> the prior hvm_check() into hvm_load() if indeed a separate hvm_load()
> could have ended up returning void in the long run.)

I see, _load could fail even if all the data provided was correct, for
example because the hypervisor is OoM?

> >> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
> >>      return 0;
> >>  }
> >>  
> >> -int hvm_load(struct domain *d, hvm_domain_context_t *h)
> >> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)
> > 
> > Maybe the 'real' parameter should instead be an enum:
> > 
> > enum hvm_load_action {
> >     CHECK,
> >     LOAD,
> > };
> > int hvm_load(struct domain *d, enum hvm_load_action action,
> >              hvm_domain_context_t *h);
> 
> Hmm, yes, it could. I'm not a fan of enums for boolean-like things,
> though.
> 
> > Otherwise a comment might be warranted about how 'real' affects the
> > logic in the function.
> 
> I can certainly add a comment immediately ahead of the function:
> 
> /*
>  * @real = false requests checking of the incoming state, while @real = true
>  * requests actual loading, which will then assume that checking was already
>  * done or is unnecessary.
>  */

Seems good to me.  I do think the usage of an action enum is clearer,
but I'm fine with the comment and the usage of a boolean.

> >> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
> >>      if ( !hdr )
> >>          return -ENODATA;
> >>  
> >> -    rc = arch_hvm_load(d, hdr);
> >> -    if ( rc )
> >> -        return rc;
> >> +    rc = arch_hvm_check(d, hdr);
> > 
> > Shouldn't this _check function only be called when real == false?
> 
> Possibly. In v4 I directly transformed what I had in v3:
> 
>     ASSERT(!arch_hvm_check(d, hdr));
> 
> I.e. it is now the call above plus ...
> 
> >> +    if ( real )
> >> +    {
> >> +        struct vcpu *v;
> >> +
> >> +        ASSERT(!rc);
> 
> ... this assertion. Really the little brother of the call site assertion
> you're asking for (see above).
> 
> >> +        arch_hvm_load(d, hdr);
> >>  
> >> -    /* Down all the vcpus: we only re-enable the ones that had state saved. */
> >> -    for_each_vcpu(d, v)
> >> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> >> -            vcpu_sleep_nosync(v);
> >> +        /*
> >> +         * Down all the vcpus: we only re-enable the ones that had state
> >> +         * saved.
> >> +         */
> >> +        for_each_vcpu(d, v)
> >> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> >> +                vcpu_sleep_nosync(v);
> >> +    }
> >> +    else if ( rc )
> >> +        return rc;

The issue I see with this is that when built with debug=n the call to
arch_hvm_check() with real == true is useless, as the result is never
evaluated - IOW: would be clearer to just avoid the call altogether.

> >>      for ( ; ; )
> >>      {
> >> +        const char *name;
> >> +        hvm_load_handler load;
> >> +
> >>          if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
> >>          {
> >>              /* Run out of data */
> >>              printk(XENLOG_G_ERR
> >>                     "HVM%d restore: save did not end with a null entry\n",
> >>                     d->domain_id);
> >> +            ASSERT(!real);
> >>              return -ENODATA;
> >>          }
> >>  
> >>          /* Read the typecode of the next entry  and check for the end-marker */
> >>          desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
> >> -        if ( desc->typecode == 0 )
> >> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
> >> +        {
> >> +            /* Reset cursor for hvm_load(, true, ). */
> >> +            if ( !real )
> >> +                h->cur = 0;
> >>              return 0;
> >> +        }
> >>  
> >>          /* Find the handler for this entry */
> >> -        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
> >> -             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
> >> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
> >> +             !(name = hvm_sr_handlers[desc->typecode].name) ||
> >> +             !(load = hvm_sr_handlers[desc->typecode].load) )
> >>          {
> >>              printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode %u\n",
> >>                     d->domain_id, desc->typecode);
> > 
> > The message is not very accurate here, it does fail when the typecode
> > is unknown, but also fails when such typecode has no name or load
> > function setup.
> 
> Yes and no, and it's not changing in this patch. Are you suggesting I should
> change it despite being unrelated? If so, there not being a name (which is
> the new check I'm adding) still suggests the code is unknown. There not being
> a load handler really indicates a bug in Xen (yet no reason to e.g. BUG() in
> that case, the failed loading will hopefully be noticeable enough).

Right, so not a lot of room for improvement anyway.  Let's leave as-is
then.

Thanks, Roger.



* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2024-01-09 10:26       ` Roger Pau Monné
@ 2024-01-09 10:58         ` Jan Beulich
  2024-01-09 12:33           ` Roger Pau Monné
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2024-01-09 10:58 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On 09.01.2024 11:26, Roger Pau Monné wrote:
> On Tue, Dec 19, 2023 at 04:24:02PM +0100, Jan Beulich wrote:
>> On 19.12.2023 15:36, Roger Pau Monné wrote:
>>> On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/domctl.c
>>>> +++ b/xen/arch/x86/domctl.c
>>>> @@ -379,8 +379,12 @@ long arch_do_domctl(
>>>>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) != 0 )
>>>>              goto sethvmcontext_out;
>>>>  
>>>> +        ret = hvm_load(d, false, &c);
>>>> +        if ( ret )
>>>> +            goto sethvmcontext_out;
>>>> +
>>>>          domain_pause(d);
>>>> -        ret = hvm_load(d, &c);
>>>> +        ret = hvm_load(d, true, &c);
>>>
>>> Now that the check has been done ahead, do we want to somehow assert
>>> that this cannot fail?  AIUI that's the expectation.
>>
>> We certainly can't until all checking was moved out of the load handlers.
>> And even then I think there are still cases where load might produce an
>> error. (In fact I would have refused a little more strongly to folding
>> the prior hvm_check() into hvm_load() if indeed a separate hvm_load()
>> could have ended up returning void in the long run.)
> 
> I see, _load could fail even if all the data provided was correct, for
> example because the hypervisor is OoM?

That's the primary hypothetical cause for such a failure, yes.

>>>> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
>>>>      if ( !hdr )
>>>>          return -ENODATA;
>>>>  
>>>> -    rc = arch_hvm_load(d, hdr);
>>>> -    if ( rc )
>>>> -        return rc;
>>>> +    rc = arch_hvm_check(d, hdr);
>>>
>>> Shouldn't this _check function only be called when real == false?
>>
>> Possibly. In v4 I directly transformed what I had in v3:
>>
>>     ASSERT(!arch_hvm_check(d, hdr));
>>
>> I.e. it is now the call above plus ...
>>
>>>> +    if ( real )
>>>> +    {
>>>> +        struct vcpu *v;
>>>> +
>>>> +        ASSERT(!rc);
>>
>> ... this assertion. Really the little brother of the call site assertion
>> you're asking for (see above).
>>
>>>> +        arch_hvm_load(d, hdr);
>>>>  
>>>> -    /* Down all the vcpus: we only re-enable the ones that had state saved. */
>>>> -    for_each_vcpu(d, v)
>>>> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>>>> -            vcpu_sleep_nosync(v);
>>>> +        /*
>>>> +         * Down all the vcpus: we only re-enable the ones that had state
>>>> +         * saved.
>>>> +         */
>>>> +        for_each_vcpu(d, v)
>>>> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>>>> +                vcpu_sleep_nosync(v);
>>>> +    }
>>>> +    else if ( rc )
>>>> +        return rc;
> 
> The issue I see with this is that when built with debug=n the call to
> arch_hvm_check() with real == true is useless, as the result is never
> evaluated - IOW: would be clearer to just avoid the call altogether.

Which, besides being imo slightly worse for then having two call sites,
puts me in a difficult position: It may not have been here, but on
another patch (but I think it was an earlier version of this one)
where Andrew commented on

    ASSERT(func());

as generally being a disliked pattern, for having a "side effect" in
the expression of an assertion. Plus the call isn't pointless even in
release builds, because of the log messages issued: Them appearing
twice in close succession might be a good hint of something fishy
going on.
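
To make the difference concrete, here is a small standalone sketch (an
illustration, not Xen code). It assumes a hypothetical `MY_ASSERT()` that,
like ASSERT() in a release build, expands to nothing unless `DEBUG_BUILD` is
defined; `check()`, `load_v3_style()` and `load_v4_style()` are likewise
made-up names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical assertion macro: active only when DEBUG_BUILD is defined. */
#ifdef DEBUG_BUILD
# define MY_ASSERT(x) assert(x)
#else
# define MY_ASSERT(x) ((void)0)
#endif

static unsigned int check_calls;

/* Has side effects: counts invocations and logs on failure. */
static int check(int input)
{
    check_calls++;

    if ( input < 0 )
    {
        fprintf(stderr, "check: bad input %d\n", input);
        return -1;
    }

    return 0;
}

/* Call hidden inside the assertion: it vanishes along with the assertion. */
static void load_v3_style(int input)
{
    MY_ASSERT(!check(input));
}

/* Call kept outside: it always runs (and logs); only the test is dropped. */
static void load_v4_style(int input)
{
    int rc = check(input);

    MY_ASSERT(!rc);
    (void)rc;   /* avoid an unused-variable warning in release builds */
}
```

Built without `DEBUG_BUILD`, `load_v3_style()` never calls `check()`, while
`load_v4_style()` still does, so any message it logs remains visible.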

Jan



* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2024-01-09 10:58         ` Jan Beulich
@ 2024-01-09 12:33           ` Roger Pau Monné
  2024-01-09 12:38             ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2024-01-09 12:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On Tue, Jan 09, 2024 at 11:58:48AM +0100, Jan Beulich wrote:
> On 09.01.2024 11:26, Roger Pau Monné wrote:
> > On Tue, Dec 19, 2023 at 04:24:02PM +0100, Jan Beulich wrote:
> >> On 19.12.2023 15:36, Roger Pau Monné wrote:
> >>> On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
> >>>> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
> >>>>      if ( !hdr )
> >>>>          return -ENODATA;
> >>>>  
> >>>> -    rc = arch_hvm_load(d, hdr);
> >>>> -    if ( rc )
> >>>> -        return rc;
> >>>> +    rc = arch_hvm_check(d, hdr);
> >>>
> >>> Shouldn't this _check function only be called when real == false?
> >>
> >> Possibly. In v4 I directly transformed what I had in v3:
> >>
> >>     ASSERT(!arch_hvm_check(d, hdr));
> >>
> >> I.e. it is now the call above plus ...
> >>
> >>>> +    if ( real )
> >>>> +    {
> >>>> +        struct vcpu *v;
> >>>> +
> >>>> +        ASSERT(!rc);
> >>
> >> ... this assertion. Really the little brother of the call site assertion
> >> you're asking for (see above).
> >>
> >>>> +        arch_hvm_load(d, hdr);
> >>>>  
> >>>> -    /* Down all the vcpus: we only re-enable the ones that had state saved. */
> >>>> -    for_each_vcpu(d, v)
> >>>> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> >>>> -            vcpu_sleep_nosync(v);
> >>>> +        /*
> >>>> +         * Down all the vcpus: we only re-enable the ones that had state
> >>>> +         * saved.
> >>>> +         */
> >>>> +        for_each_vcpu(d, v)
> >>>> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
> >>>> +                vcpu_sleep_nosync(v);
> >>>> +    }
> >>>> +    else if ( rc )
> >>>> +        return rc;
> > 
> > The issue I see with this is that when built with debug=n the call to
> > arch_hvm_check() with real == true is useless, as the result is never
> > evaluated - IOW: would be clearer to just avoid the call altogether.
> 
> Which, besides being imo slightly worse for then having two call sites,
> puts me in a difficult position: It may not have been here, but on
> another patch (but I think it was an earlier version of this one)
> where Andrew commented on
> 
>     ASSERT(func());
> 
> as generally being a disliked pattern, for having a "side effect" in
> the expression of an assertion.

I was going to suggest to add the pure attribute to the function, but
it does have side effects as it prints to the console.

> Plus the call isn't pointless even in
> release builds, because of the log messages issued: Them appearing
> twice in close succession might be a good hint of something fishy
> going on.

Why do you mention the messages appearing twice?  Won't the first
check call return error and thus should prevent the caller from
attempting to load the state?

Thanks, Roger.



* Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading
  2024-01-09 12:33           ` Roger Pau Monné
@ 2024-01-09 12:38             ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2024-01-09 12:38 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Wei Liu

On 09.01.2024 13:33, Roger Pau Monné wrote:
> On Tue, Jan 09, 2024 at 11:58:48AM +0100, Jan Beulich wrote:
>> Plus the call isn't pointless even in
>> release builds, because of the log messages issued: Them appearing
>> twice in close succession might be a good hint of something fishy
>> going on.
> 
> Why do you mention the messages appearing twice?  Won't the first
> check call return error and thus should prevent the caller from
> attempting to load the state?

Well, what exactly is going to be the reason for a failure on the
"real" invocation when the "dry-run" one supposedly failed is
unknown. But you're right, messages occurring twice indeed may not
be very likely. It's more like these messages appearing and then
loading still continuing which might make obvious that something
went wrong.

Jan



end of thread

Thread overview: 13+ messages
2023-12-18 14:37 [PATCH v4 0/5] x86/HVM: load state checking Jan Beulich
2023-12-18 14:39 ` [PATCH v4 1/5] x86/HVM: split restore state checking from state loading Jan Beulich
2023-12-19 14:36   ` Roger Pau Monné
2023-12-19 15:24     ` Jan Beulich
2024-01-09 10:26       ` Roger Pau Monné
2024-01-09 10:58         ` Jan Beulich
2024-01-09 12:33           ` Roger Pau Monné
2024-01-09 12:38             ` Jan Beulich
2023-12-18 14:40 ` [PATCH v4 2/5] x86/HVM: adjust save/restore hook registration for optional check handler Jan Beulich
2023-12-18 14:40 ` [PATCH v4 3/5] x86/vPIT: check values loaded from state save record Jan Beulich
2023-12-18 14:40 ` [PATCH v4 4/5] x86/vPIC: " Jan Beulich
2023-12-18 14:41 ` [PATCH v4 5/5] x86/vIRQ: split PCI link load state checking from actual loading Jan Beulich
2023-12-19 14:50   ` Roger Pau Monné
