qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle
@ 2025-03-21  3:12 Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 1/6] target/loongarch: Fix error handling of KVM feature checks Bibo Mao
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster; +Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini

This patch set solves errors reported by coccinelle tool with commands:
  spatch --sp-file scripts/coccinelle/*.cocci --dir target/loongarch/
  spatch --sp-file scripts/coccinelle/*.cocci --dir hw/loongarch/

The main problem is that qemu should fail to run when feature is forced
to enabled however KVM does not support it, rather than report error and
continue to run.

Also there is fixup for cpu plug and unplug. If there is error when cpu
is plug/unplug at runtime,  system should restore to previous state and
continue to run.

---
  v5 ... v6:
    1. If there is nested error report when restore from error in function
       virt_cpu_plug(), set output Error object with &error_abort rather
       than NULL, since it is almost impossible now.
    2. If there is nested error report when restore from error in function
       virt_cpu_unplug(), set output Error object with &error_abort rather
       than NULL, since it is almost impossible now.

  v4 ... v5:
    1. Split patch2 in v4 into three small patches, two are fixup for error
       handing when cpu plug/unplug fails so that system can continue to
       run, one is to remove error_propagate() and refresh title.
    2. Refresh changelog in last patch and remove fixes information
       since it is impossible to happen.

  v3 ... v4:
    1. Add missed this cleanup with error and remove some local error
       object.
    2. Replace local error object with error_abort object in
       virt_cpu_irq_init(), since its return value is not checked.

  v2 ... v3:
    1. Add missing modification replacing error_propagate() + error_setg()
      with error_setg().
    2. Some enhancement about error handling, handling error
       symmetrically in many places

  v1 ... v2:
    1. Add fixes tag and change title with fix prefix in patch 1.
    2. Replace error_propagate() with error_setg(), and return directly
       for any error.
---
Bibo Mao (6):
  target/loongarch: Fix error handling of KVM feature checks
  hw/loongarch/virt: Fix error handling in cpu plug
  hw/loongarch/virt: Fix error handling in cpu unplug
  hw/loongarch/virt: Eliminate error_propagate()
  target/loongarch: Remove unnecessary temporary variable assignment
  target/loongarch: Clean up virt_cpu_irq_init() error handling

 hw/loongarch/virt.c               | 63 ++++++++++++++++++-------------
 target/loongarch/kvm/kvm.c        |  8 +++-
 target/loongarch/tcg/tlb_helper.c |  5 +--
 3 files changed, 45 insertions(+), 31 deletions(-)


base-commit: 1dae461a913f9da88df05de6e2020d3134356f2e
-- 
2.39.3



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v6 1/6] target/loongarch: Fix error handling of KVM feature checks
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug Bibo Mao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster
  Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini,
	Philippe Mathieu-Daudé

For some paravirt KVM features, if user forces to enable it however
KVM does not support, qemu should fail to run and exit immediately,
rather than continue to run. Here set error message and return directly
in function kvm_arch_init_vcpu().

Fixes: 6edd2a9bec90 (target/loongarch/kvm: Implement LoongArch PMU extension)
Fixes: 936c3f4d7916 (target/loongarch: Use auto method with LSX feature)
Fixes: 5e360dabedb1 (target/loongarch: Use auto method with LASX feature)
Fixes: 620d9bd0022e (target/loongarch: Add paravirt ipi feature detection)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 target/loongarch/kvm/kvm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/kvm/kvm.c b/target/loongarch/kvm/kvm.c
index 28735c80be..7f63e7c8fe 100644
--- a/target/loongarch/kvm/kvm.c
+++ b/target/loongarch/kvm/kvm.c
@@ -1081,7 +1081,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
     int ret;
     Error *local_err = NULL;
 
-    ret = 0;
     qemu_add_vm_change_state_handler(kvm_loongarch_vm_stage_change, cs);
 
     if (!kvm_get_one_reg(cs, KVM_REG_LOONGARCH_DEBUG_INST, &val)) {
@@ -1091,29 +1090,34 @@ int kvm_arch_init_vcpu(CPUState *cs)
     ret = kvm_cpu_check_lsx(cs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
+        return ret;
     }
 
     ret = kvm_cpu_check_lasx(cs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
+        return ret;
     }
 
     ret = kvm_cpu_check_lbt(cs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
+        return ret;
     }
 
     ret = kvm_cpu_check_pmu(cs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
+        return ret;
     }
 
     ret = kvm_cpu_check_pv_features(cs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
+        return ret;
     }
 
-    return ret;
+    return 0;
 }
 
 static bool loongarch_get_lbt(Object *obj, Error **errp)
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 1/6] target/loongarch: Fix error handling of KVM feature checks Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  2025-04-04 12:06   ` Igor Mammedov
  2025-03-21  3:12 ` [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug Bibo Mao
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster; +Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini

In function virt_cpu_plug(), it will send cpu plug message to interrupt
controller extioi and ipi irqchip. If there is problem in this function,
system should continue to run and keep state the same before cpu is
added.

Object cpuslot::cpu is set at last only when there is no any error.
If there is, send cpu unplug message to extioi and ipi irqchip, and then
return immediately.

Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 hw/loongarch/virt.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index a5840ff968..8563967c8b 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -981,8 +981,6 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
     LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev);
     Error *err = NULL;
 
-    cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
-    cpu_slot->cpu = CPU(dev);
     if (lvms->ipi) {
         hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &err);
         if (err) {
@@ -995,6 +993,10 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
         hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
         if (err) {
             error_propagate(errp, err);
+            if (lvms->ipi) {
+                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev,
+                                       &error_abort);
+            }
             return;
         }
     }
@@ -1003,9 +1005,21 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
         hotplug_handler_plug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
         if (err) {
             error_propagate(errp, err);
+            if (lvms->ipi) {
+                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev,
+                                       &error_abort);
+            }
+
+            if (lvms->extioi) {
+                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev,
+                                       &error_abort);
+            }
+            return;
         }
     }
 
+    cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
+    cpu_slot->cpu = CPU(dev);
     return;
 }
 
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 1/6] target/loongarch: Fix error handling of KVM feature checks Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  2025-03-21  6:47   ` Markus Armbruster
  2025-04-04 12:08   ` Igor Mammedov
  2025-03-21  3:12 ` [PATCH v6 4/6] hw/loongarch/virt: Eliminate error_propagate() Bibo Mao
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster; +Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini

In function virt_cpu_unplug(), it will send cpu unplug message to
interrupt controller extioi and ipi irqchip. If there is problem in
this function, system should continue to run and keep state the same
before cpu is removed.

If error happends in cpu unplug stage, send cpu plug message to extioi
and ipi irqchip to restore to previous stage, and then return immediately.

Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 hw/loongarch/virt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 8563967c8b..503362a69e 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
     hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
     if (err) {
         error_propagate(errp, err);
+        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+                             &error_abort);
         return;
     }
 
@@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
     hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
     if (err) {
         error_propagate(errp, err);
+        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+                             &error_abort);
+        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
+                             &error_abort);
         return;
     }
 
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 4/6] hw/loongarch/virt: Eliminate error_propagate()
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
                   ` (2 preceding siblings ...)
  2025-03-21  3:12 ` [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 5/6] target/loongarch: Remove unnecessary temporary variable assignment Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 6/6] target/loongarch: Clean up virt_cpu_irq_init() error handling Bibo Mao
  5 siblings, 0 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster; +Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini

When there is an error, it is put into a local variable and then
propagated to somewhere else. Instead the error can be set right
away, error propagation can be removed.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
---
 hw/loongarch/virt.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 503362a69e..562e44e112 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -859,30 +859,29 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev,
     LoongArchCPU *cpu = LOONGARCH_CPU(dev);
     CPUState *cs = CPU(dev);
     CPUArchId *cpu_slot;
-    Error *err = NULL;
     LoongArchCPUTopo topo;
     int arch_id;
 
     if (lvms->acpi_ged) {
         if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
-            error_setg(&err,
+            error_setg(errp,
                        "Invalid thread-id %u specified, must be in range 1:%u",
                        cpu->thread_id, ms->smp.threads - 1);
-            goto out;
+            return;
         }
 
         if ((cpu->core_id < 0) || (cpu->core_id >= ms->smp.cores)) {
-            error_setg(&err,
+            error_setg(errp,
                        "Invalid core-id %u specified, must be in range 1:%u",
                        cpu->core_id, ms->smp.cores - 1);
-            goto out;
+            return;
         }
 
         if ((cpu->socket_id < 0) || (cpu->socket_id >= ms->smp.sockets)) {
-            error_setg(&err,
+            error_setg(errp,
                        "Invalid socket-id %u specified, must be in range 1:%u",
                        cpu->socket_id, ms->smp.sockets - 1);
-            goto out;
+            return;
         }
 
         topo.socket_id = cpu->socket_id;
@@ -891,11 +890,11 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev,
         arch_id =  virt_get_arch_id_from_topo(ms, &topo);
         cpu_slot = virt_find_cpu_slot(ms, arch_id);
         if (CPU(cpu_slot->cpu)) {
-            error_setg(&err,
+            error_setg(errp,
                        "cpu(id%d=%d:%d:%d) with arch-id %" PRIu64 " exists",
                        cs->cpu_index, cpu->socket_id, cpu->core_id,
                        cpu->thread_id, cpu_slot->arch_id);
-            goto out;
+            return;
         }
     } else {
         /* For cold-add cpu, find empty cpu slot */
@@ -911,33 +910,24 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev,
     cpu->env.address_space_iocsr = &lvms->as_iocsr;
     cpu->phy_id = cpu_slot->arch_id;
     cs->cpu_index = cpu_slot - ms->possible_cpus->cpus;
-    numa_cpu_pre_plug(cpu_slot, dev, &err);
-out:
-    if (err) {
-        error_propagate(errp, err);
-    }
+    numa_cpu_pre_plug(cpu_slot, dev, errp);
 }
 
 static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
                                     DeviceState *dev, Error **errp)
 {
     LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev);
-    Error *err = NULL;
     LoongArchCPU *cpu = LOONGARCH_CPU(dev);
     CPUState *cs = CPU(dev);
 
     if (cs->cpu_index == 0) {
-        error_setg(&err, "hot-unplug of boot cpu(id%d=%d:%d:%d) not supported",
+        error_setg(errp, "hot-unplug of boot cpu(id%d=%d:%d:%d) not supported",
                    cs->cpu_index, cpu->socket_id,
                    cpu->core_id, cpu->thread_id);
-        error_propagate(errp, err);
         return;
     }
 
-    hotplug_handler_unplug_request(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
-    if (err) {
-        error_propagate(errp, err);
-    }
+    hotplug_handler_unplug_request(HOTPLUG_HANDLER(lvms->acpi_ged), dev, errp);
 }
 
 static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 5/6] target/loongarch: Remove unnecessary temporary variable assignment
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
                   ` (3 preceding siblings ...)
  2025-03-21  3:12 ` [PATCH v6 4/6] hw/loongarch/virt: Eliminate error_propagate() Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  2025-03-21  3:12 ` [PATCH v6 6/6] target/loongarch: Clean up virt_cpu_irq_init() error handling Bibo Mao
  5 siblings, 0 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster
  Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini,
	Philippe Mathieu-Daudé

Temporary variable ret is assigned at last line and return, it can
be removed and return directly.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 target/loongarch/tcg/tlb_helper.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/loongarch/tcg/tlb_helper.c b/target/loongarch/tcg/tlb_helper.c
index 646dbf59de..182881a237 100644
--- a/target/loongarch/tcg/tlb_helper.c
+++ b/target/loongarch/tcg/tlb_helper.c
@@ -543,7 +543,7 @@ target_ulong helper_lddir(CPULoongArchState *env, target_ulong base,
                           target_ulong level, uint32_t mem_idx)
 {
     CPUState *cs = env_cpu(env);
-    target_ulong badvaddr, index, phys, ret;
+    target_ulong badvaddr, index, phys;
     uint64_t dir_base, dir_width;
 
     if (unlikely((level == 0) || (level > 4))) {
@@ -571,8 +571,7 @@ target_ulong helper_lddir(CPULoongArchState *env, target_ulong base,
     get_dir_base_width(env, &dir_base, &dir_width, level);
     index = (badvaddr >> dir_base) & ((1 << dir_width) - 1);
     phys = base | index << 3;
-    ret = ldq_phys(cs->as, phys) & TARGET_PHYS_MASK;
-    return ret;
+    return ldq_phys(cs->as, phys) & TARGET_PHYS_MASK;
 }
 
 void helper_ldpte(CPULoongArchState *env, target_ulong base, target_ulong odd,
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 6/6] target/loongarch: Clean up virt_cpu_irq_init() error handling
  2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
                   ` (4 preceding siblings ...)
  2025-03-21  3:12 ` [PATCH v6 5/6] target/loongarch: Remove unnecessary temporary variable assignment Bibo Mao
@ 2025-03-21  3:12 ` Bibo Mao
  5 siblings, 0 replies; 18+ messages in thread
From: Bibo Mao @ 2025-03-21  3:12 UTC (permalink / raw)
  To: Song Gao, Markus Armbruster; +Cc: Jiaxun Yang, qemu-devel, Paolo Bonzini

The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.

virt_cpu_irq_init() is wrong that way: it passes &err to
hotplug_handler_plug() twice.  If both calls failed, this could trip
error_setv()'s assertion.  Moreover, if just one fails, the Error
object leaks. Fortunately, these calls can't actually fail.

Messed up in commit 50ebc3fc47f7 (hw/intc/loongarch_ipi: Notify ipi
object when cpu is plugged) and commit 087a23a87c57
(hw/intc/loongarch_extioi: Use cpu plug notification).

Clean this up by passing &error_abort instead.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
 hw/loongarch/virt.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 562e44e112..d7ede30af3 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -327,7 +327,6 @@ static void virt_cpu_irq_init(LoongArchVirtMachineState *lvms)
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     const CPUArchIdList *possible_cpus;
     CPUState *cs;
-    Error *err = NULL;
 
     /* cpu nodes */
     possible_cpus = mc->possible_cpu_arch_ids(ms);
@@ -337,8 +336,10 @@ static void virt_cpu_irq_init(LoongArchVirtMachineState *lvms)
             continue;
         }
 
-        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), DEVICE(cs), &err);
-        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), DEVICE(cs), &err);
+        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), DEVICE(cs),
+                             &error_abort);
+        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), DEVICE(cs),
+                             &error_abort);
     }
 }
 
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  3:12 ` [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug Bibo Mao
@ 2025-03-21  6:47   ` Markus Armbruster
  2025-03-21  7:00     ` bibo mao
  2025-03-21  7:09     ` bibo mao
  2025-04-04 12:08   ` Igor Mammedov
  1 sibling, 2 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-03-21  6:47 UTC (permalink / raw)
  To: Bibo Mao; +Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini

Bibo Mao <maobibo@loongson.cn> writes:

> In function virt_cpu_unplug(), it will send cpu unplug message to
> interrupt controller extioi and ipi irqchip. If there is problem in
> this function, system should continue to run and keep state the same
> before cpu is removed.
>
> If error happends in cpu unplug stage, send cpu plug message to extioi
> and ipi irqchip to restore to previous stage, and then return immediately.
>
> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  hw/loongarch/virt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
> index 8563967c8b..503362a69e 100644
> --- a/hw/loongarch/virt.c
> +++ b/hw/loongarch/virt.c
> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>      hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>      if (err) {
>          error_propagate(errp, err);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                             &error_abort);
>          return;
>      }
>  
> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>      hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>      if (err) {
>          error_propagate(errp, err);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                             &error_abort);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
> +                             &error_abort);
>          return;
>      }

virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
This must not fail.

virt_cpu_plug() does it the other way round (see previous patch).

So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
check for it to fail in virt_cpu_plug().

Can it really fail in virt_cpu_plug()?

If yes, why can't it fail in virt_cpu_unplug()?

Same questions for hotplug_handler_unplug().



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  6:47   ` Markus Armbruster
@ 2025-03-21  7:00     ` bibo mao
  2025-03-21  7:21       ` Markus Armbruster
  2025-03-21  7:09     ` bibo mao
  1 sibling, 1 reply; 18+ messages in thread
From: bibo mao @ 2025-03-21  7:00 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov

+Igor


On 2025/3/21 下午2:47, Markus Armbruster wrote:
> Bibo Mao <maobibo@loongson.cn> writes:
> 
>> In function virt_cpu_unplug(), it will send cpu unplug message to
>> interrupt controller extioi and ipi irqchip. If there is problem in
>> this function, system should continue to run and keep state the same
>> before cpu is removed.
>>
>> If error happends in cpu unplug stage, send cpu plug message to extioi
>> and ipi irqchip to restore to previous stage, and then return immediately.
>>
>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   hw/loongarch/virt.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>> index 8563967c8b..503362a69e 100644
>> --- a/hw/loongarch/virt.c
>> +++ b/hw/loongarch/virt.c
>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>       if (err) {
>>           error_propagate(errp, err);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>> +                             &error_abort);
>>           return;
>>       }
>>   
>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>       if (err) {
>>           error_propagate(errp, err);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>> +                             &error_abort);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>> +                             &error_abort);
>>           return;
>>       }
> 
> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
> This must not fail.
> 
> virt_cpu_plug() does it the other way round (see previous patch).
> 
> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
> check for it to fail in virt_cpu_plug().
> 
> Can it really fail in virt_cpu_plug()?
> 
> If yes, why can't it fail in virt_cpu_unplug()?
you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that 
is cpuplug callback for acpi_ged and ipi. it will not fail.

If *virt_cpu_pre_plug()* pass, it will succeed.

Regards
Bibo Mao

> 
> Same questions for hotplug_handler_unplug().
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  6:47   ` Markus Armbruster
  2025-03-21  7:00     ` bibo mao
@ 2025-03-21  7:09     ` bibo mao
  1 sibling, 0 replies; 18+ messages in thread
From: bibo mao @ 2025-03-21  7:09 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov



On 2025/3/21 下午2:47, Markus Armbruster wrote:
> Bibo Mao <maobibo@loongson.cn> writes:
> 
>> In function virt_cpu_unplug(), it will send cpu unplug message to
>> interrupt controller extioi and ipi irqchip. If there is problem in
>> this function, system should continue to run and keep state the same
>> before cpu is removed.
>>
>> If error happends in cpu unplug stage, send cpu plug message to extioi
>> and ipi irqchip to restore to previous stage, and then return immediately.
>>
>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   hw/loongarch/virt.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>> index 8563967c8b..503362a69e 100644
>> --- a/hw/loongarch/virt.c
>> +++ b/hw/loongarch/virt.c
>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>       if (err) {
>>           error_propagate(errp, err);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>> +                             &error_abort);
>>           return;
>>       }
>>   
>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>       if (err) {
>>           error_propagate(errp, err);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>> +                             &error_abort);
>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>> +                             &error_abort);
>>           return;
>>       }
> 
> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
> This must not fail.
> 
> virt_cpu_plug() does it the other way round (see previous patch).
> 
> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
> check for it to fail in virt_cpu_plug().
> 
> Can it really fail in virt_cpu_plug()?
> 
> If yes, why can't it fail in virt_cpu_unplug()?
I do not know what is you meaning.
In last email I said it was impossible. un-notify is for future use. And 
you reply such as:

*You assure us this can't happen today.  Because of that, broken error
recovery is not an actual problem.

However, if things change some day so it can happen, broken error
recovery becomes an actual problem.

so, broken error recovery just "for future use" is actually just for
silent future breakage.

But is it broken?  This is what I'm trying to find out with my "what
happens if" question.

If it is broken, then passing &error_abort would likely be less bad:
crash instead of silent breakage.  Also makes it completely obvious in
the code that these errors are not handled, whereas broken error
handling looks like it is until you actually think about it.*

Sorry for my bad English, so what is your option about here?

Regards
Bibo Mao
> 
> Same questions for hotplug_handler_unplug().
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  7:00     ` bibo mao
@ 2025-03-21  7:21       ` Markus Armbruster
  2025-03-21  7:35         ` bibo mao
  0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-03-21  7:21 UTC (permalink / raw)
  To: bibo mao; +Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov

bibo mao <maobibo@loongson.cn> writes:

> +Igor
>
>
> On 2025/3/21 下午2:47, Markus Armbruster wrote:
>> Bibo Mao <maobibo@loongson.cn> writes:
>> 
>>> In function virt_cpu_unplug(), it will send cpu unplug message to
>>> interrupt controller extioi and ipi irqchip. If there is problem in
>>> this function, system should continue to run and keep state the same
>>> before cpu is removed.
>>>
>>> If error happends in cpu unplug stage, send cpu plug message to extioi
>>> and ipi irqchip to restore to previous stage, and then return immediately.
>>>
>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>> ---
>>>   hw/loongarch/virt.c | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>>> index 8563967c8b..503362a69e 100644
>>> --- a/hw/loongarch/virt.c
>>> +++ b/hw/loongarch/virt.c
>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>>       if (err) {
>>>           error_propagate(errp, err);
>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>> +                             &error_abort);
>>>           return;
>>>       }
>>>   
>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>       hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>>       if (err) {
>>>           error_propagate(errp, err);
>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>> +                             &error_abort);
>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>>> +                             &error_abort);
>>>           return;
>>>       }
>> 
>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
>> This must not fail.
>> 
>> virt_cpu_plug() does it the other way round (see previous patch).
>> 
>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
>> check for it to fail in virt_cpu_plug().
>> 
>> Can it really fail in virt_cpu_plug()?
>> 
>> If yes, why can't it fail in virt_cpu_unplug()?
> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that 
> is cpuplug callback for acpi_ged and ipi. it will not fail.
>
> If *virt_cpu_pre_plug()* pass, it will succeed.
>
> Regards
> Bibo Mao
>
>> 
>> Same questions for hotplug_handler_unplug().

Let me restate my argument.

We call hotplug_handler_plug() on the happy path, and on error recovery
paths.  Four cases:

1. Can fail on the happy path

   Error recovery is required.

1.1 Can fail on the error recovery path

    Error recovery is required, but broken.

1.2 Can't fail on the error recovery path

    Error recovery is required and works, but why it works is not
    obvious.  Deserves a comment explaining why hotplug_handler_plug()
    can't fail here even though it can fail on the happy path next door.

2. Can't fail on the happy path

   Error recovery is unreachable.

2.1 Can fail on the error recovery path

    Error recovery is unreachable and broken.  Possibly a time bomb, and
    possibly misleading readers.

2.2 Can't fail on the error recovery path

    Error recovery is unreachable and would work, but why it would work
    is again a not obvious.

Which of the four cases is it?



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  7:21       ` Markus Armbruster
@ 2025-03-21  7:35         ` bibo mao
  2025-03-21  8:08           ` Markus Armbruster
  2025-04-04 11:59           ` Igor Mammedov
  0 siblings, 2 replies; 18+ messages in thread
From: bibo mao @ 2025-03-21  7:35 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov



On 2025/3/21 下午3:21, Markus Armbruster wrote:
> bibo mao <maobibo@loongson.cn> writes:
> 
>> +Igor
>>
>>
>> On 2025/3/21 下午2:47, Markus Armbruster wrote:
>>> Bibo Mao <maobibo@loongson.cn> writes:
>>>
>>>> In function virt_cpu_unplug(), it will send cpu unplug message to
>>>> interrupt controller extioi and ipi irqchip. If there is problem in
>>>> this function, system should continue to run and keep state the same
>>>> before cpu is removed.
>>>>
>>>> If error happends in cpu unplug stage, send cpu plug message to extioi
>>>> and ipi irqchip to restore to previous stage, and then return immediately.
>>>>
>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>>> ---
>>>>    hw/loongarch/virt.c | 6 ++++++
>>>>    1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>>>> index 8563967c8b..503362a69e 100644
>>>> --- a/hw/loongarch/virt.c
>>>> +++ b/hw/loongarch/virt.c
>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>>>        if (err) {
>>>>            error_propagate(errp, err);
>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>> +                             &error_abort);
>>>>            return;
>>>>        }
>>>>    
>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>>>        if (err) {
>>>>            error_propagate(errp, err);
>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>> +                             &error_abort);
>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>>>> +                             &error_abort);
>>>>            return;
>>>>        }
>>>
>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
>>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
>>> This must not fail.
>>>
>>> virt_cpu_plug() does it the other way round (see previous patch).
>>>
>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
>>> check for it to fail in virt_cpu_plug().
>>>
>>> Can it really fail in virt_cpu_plug()?
>>>
>>> If yes, why can't it fail in virt_cpu_unplug()?
>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
>> is cpuplug callback for acpi_ged and ipi. it will not fail.
>>
>> If *virt_cpu_pre_plug()* pass, it will succeed.
>>
>> Regards
>> Bibo Mao
>>
>>>
>>> Same questions for hotplug_handler_unplug().
> 
> Let me restate my argument.
> 
> We call hotplug_handler_plug() on the happy path, and on error recovery
> paths.  Four cases:
> 
> 1. Can fail on the happy path
> 
>     Error recovery is required.
> 
> 1.1 Can fail on the error recovery path
> 
>      Error recovery is required, but broken.
> 
> 1.2 Can't fail on the error recovery path
> 
>      Error recovery is required and works, but why it works is not
>      obvious.  Deserves a comment explaining why hotplug_handler_plug()
>      can't fail here even though it can fail on the happy path next door.
> 
> 2. Can't fail on the happy path
> 
>     Error recovery is unreachable.
> 
> 2.1 Can fail on the error recovery path
> 
>      Error recovery is unreachable and broken.  Possibly a time bomb, and
>      possibly misleading readers.
> 
> 2.2 Can't fail on the error recovery path
> 
>      Error recovery is unreachable and would work, but why it would work
>      is again a not obvious.
> 
> Which of the four cases is it?
By my understanding, it is "2. Can't fail on the happy path",  and Error 
recovery is unreachable.

I have said that it is impossible and recovery is only for future use.

do you mean recovery should be removed? And directly &error_abort is 
used in virt_cpu_plug() such as:
static void virt_cpu_plug(HotplugHandler *hotplug_dev,
                           DeviceState *dev, Error **errp)
{
   if (lvms->ipi) {
     hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);

Regards
Bibo Mao



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  7:35         ` bibo mao
@ 2025-03-21  8:08           ` Markus Armbruster
  2025-03-21  8:21             ` bibo mao
  2025-04-04 11:59           ` Igor Mammedov
  1 sibling, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-03-21  8:08 UTC (permalink / raw)
  To: bibo mao; +Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov

bibo mao <maobibo@loongson.cn> writes:

> On 2025/3/21 下午3:21, Markus Armbruster wrote:
>> bibo mao <maobibo@loongson.cn> writes:
>> 
>>> +Igor
>>>
>>>
>>> On 2025/3/21 下午2:47, Markus Armbruster wrote:
>>>> Bibo Mao <maobibo@loongson.cn> writes:
>>>>
>>>>> In function virt_cpu_unplug(), it will send cpu unplug message to
>>>>> interrupt controller extioi and ipi irqchip. If there is problem in
>>>>> this function, system should continue to run and keep state the same
>>>>> before cpu is removed.
>>>>>
>>>>> If error happends in cpu unplug stage, send cpu plug message to extioi
>>>>> and ipi irqchip to restore to previous stage, and then return immediately.
>>>>>
>>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>>>> ---
>>>>>    hw/loongarch/virt.c | 6 ++++++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>>>>> index 8563967c8b..503362a69e 100644
>>>>> --- a/hw/loongarch/virt.c
>>>>> +++ b/hw/loongarch/virt.c
>>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>>>>        if (err) {
>>>>>            error_propagate(errp, err);
>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>> +                             &error_abort);
>>>>>            return;
>>>>>        }
>>>>>    
>>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>>>>        if (err) {
>>>>>            error_propagate(errp, err);
>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>> +                             &error_abort);
>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>>>>> +                             &error_abort);
>>>>>            return;
>>>>>        }
>>>>
>>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
>>>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
>>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
>>>> This must not fail.
>>>>
>>>> virt_cpu_plug() does it the other way round (see previous patch).
>>>>
>>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
>>>> check for it to fail in virt_cpu_plug().
>>>>
>>>> Can it really fail in virt_cpu_plug()?
>>>>
>>>> If yes, why can't it fail in virt_cpu_unplug()?
>>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
>>> is cpuplug callback for acpi_ged and ipi. it will not fail.
>>>
>>> If *virt_cpu_pre_plug()* pass, it will succeed.
>>>
>>> Regards
>>> Bibo Mao
>>>
>>>>
>>>> Same questions for hotplug_handler_unplug().
>> 
>> Let me restate my argument.
>> 
>> We call hotplug_handler_plug() on the happy path, and on error recovery
>> paths.  Four cases:
>> 
>> 1. Can fail on the happy path
>> 
>>     Error recovery is required.
>> 
>> 1.1 Can fail on the error recovery path
>> 
>>      Error recovery is required, but broken.
>> 
>> 1.2 Can't fail on the error recovery path
>> 
>>      Error recovery is required and works, but why it works is not
>>      obvious.  Deserves a comment explaining why hotplug_handler_plug()
>>      can't fail here even though it can fail on the happy path next door.
>> 
>> 2. Can't fail on the happy path
>> 
>>     Error recovery is unreachable.
>> 
>> 2.1 Can fail on the error recovery path
>> 
>>      Error recovery is unreachable and broken.  Possibly a time bomb, and
>>      possibly misleading readers.
>> 
>> 2.2 Can't fail on the error recovery path
>> 
>>      Error recovery is unreachable and would work, but why it would work
>>      is again a not obvious.
>> 
>> Which of the four cases is it?
> By my understanding, it is "2. Can't fail on the happy path",  and Error 
> recovery is unreachable.

Got it.

> I have said that it is impossible and recovery is only for future use.
>
> do you mean recovery should be removed? And directly &error_abort is 
> used in virt_cpu_plug() such as:
> static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>                            DeviceState *dev, Error **errp)
> {
>    if (lvms->ipi) {
>      hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);

Yes, I prefer this.  Here's why.

Error recovery that is unreachable now but might become reachable at
some future time is untestable now.  Mind, this does not necessarily
make it a bad idea by itself.  But there's more.

Anything that makes this error recovery reachable either breaks it or
makes correctness locally unobvious.  Why?  To make it reachable, plug /
unplug must be able to fail on the happy path.  But then they either can
fail on the error recovery path as well (which breaks error recovery),
or they can't fail there for reasons that are not locally obvious.

This sets a trap for readers.  An attentive reader will see the problem
(like I did), but to see why the code is not broken right now will take
digging (like we did together).  And after such digging, we're left with
a queasy feeling about robustness of the code (like we are now).

Passing &error_abort on the happy path avoids all this.  Instead it
clearly tells the reader that this is not expected to fail.

If failure becomes possible at some future time, this should crash in
testing.  If we neglect to test the new failure (and we really
shouldn't), we crash on error in production right away instead of
risking botched error recovery messing up the program's state.
Both are bad outcomes, but which one's less bad I find impossible to
predict.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  8:08           ` Markus Armbruster
@ 2025-03-21  8:21             ` bibo mao
  0 siblings, 0 replies; 18+ messages in thread
From: bibo mao @ 2025-03-21  8:21 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Song Gao, Jiaxun Yang, qemu-devel, Paolo Bonzini, Igor Mammedov



On 2025/3/21 下午4:08, Markus Armbruster wrote:
> bibo mao <maobibo@loongson.cn> writes:
> 
>> On 2025/3/21 下午3:21, Markus Armbruster wrote:
>>> bibo mao <maobibo@loongson.cn> writes:
>>>
>>>> +Igor
>>>>
>>>>
>>>> On 2025/3/21 下午2:47, Markus Armbruster wrote:
>>>>> Bibo Mao <maobibo@loongson.cn> writes:
>>>>>
>>>>>> In function virt_cpu_unplug(), it will send cpu unplug message to
>>>>>> interrupt controller extioi and ipi irqchip. If there is problem in
>>>>>> this function, system should continue to run and keep state the same
>>>>>> before cpu is removed.
>>>>>>
>>>>>> If error happends in cpu unplug stage, send cpu plug message to extioi
>>>>>> and ipi irqchip to restore to previous stage, and then return immediately.
>>>>>>
>>>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>>>>> ---
>>>>>>     hw/loongarch/virt.c | 6 ++++++
>>>>>>     1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>>>>>> index 8563967c8b..503362a69e 100644
>>>>>> --- a/hw/loongarch/virt.c
>>>>>> +++ b/hw/loongarch/virt.c
>>>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>>         hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>>>>>         if (err) {
>>>>>>             error_propagate(errp, err);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>>> +                             &error_abort);
>>>>>>             return;
>>>>>>         }
>>>>>>     
>>>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>>         hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>>>>>         if (err) {
>>>>>>             error_propagate(errp, err);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>>> +                             &error_abort);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>>>>>> +                             &error_abort);
>>>>>>             return;
>>>>>>         }
>>>>>
>>>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
>>>>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
>>>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
>>>>> This must not fail.
>>>>>
>>>>> virt_cpu_plug() does it the other way round (see previous patch).
>>>>>
>>>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
>>>>> check for it to fail in virt_cpu_plug().
>>>>>
>>>>> Can it really fail in virt_cpu_plug()?
>>>>>
>>>>> If yes, why can't it fail in virt_cpu_unplug()?
>>>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
>>>> is cpuplug callback for acpi_ged and ipi. it will not fail.
>>>>
>>>> If *virt_cpu_pre_plug()* pass, it will succeed.
>>>>
>>>> Regards
>>>> Bibo Mao
>>>>
>>>>>
>>>>> Same questions for hotplug_handler_unplug().
>>>
>>> Let me restate my argument.
>>>
>>> We call hotplug_handler_plug() on the happy path, and on error recovery
>>> paths.  Four cases:
>>>
>>> 1. Can fail on the happy path
>>>
>>>      Error recovery is required.
>>>
>>> 1.1 Can fail on the error recovery path
>>>
>>>       Error recovery is required, but broken.
>>>
>>> 1.2 Can't fail on the error recovery path
>>>
>>>       Error recovery is required and works, but why it works is not
>>>       obvious.  Deserves a comment explaining why hotplug_handler_plug()
>>>       can't fail here even though it can fail on the happy path next door.
>>>
>>> 2. Can't fail on the happy path
>>>
>>>      Error recovery is unreachable.
>>>
>>> 2.1 Can fail on the error recovery path
>>>
>>>       Error recovery is unreachable and broken.  Possibly a time bomb, and
>>>       possibly misleading readers.
>>>
>>> 2.2 Can't fail on the error recovery path
>>>
>>>       Error recovery is unreachable and would work, but why it would work
>>>       is again a not obvious.
>>>
>>> Which of the four cases is it?
>> By my understanding, it is "2. Can't fail on the happy path",  and Error
>> recovery is unreachable.
> 
> Got it.
> 
>> I have said that it is impossible and recovery is only for future use.
>>
>> do you mean recovery should be removed? And directly &error_abort is
>> used in virt_cpu_plug() such as:
>> static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>>                             DeviceState *dev, Error **errp)
>> {
>>     if (lvms->ipi) {
>>       hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);
> 
> Yes, I prefer this.  Here's why.
This is ok, however I do not think there is obvious advantage. V6 is ok 
also, there is recovery path only that recovery has problem, system will 
crash, through it is impossible now.

Maybe someone jumps out and say that there is error handling in cpu 
hotplug, however no such in pc_memory_plug(), why does not LoongArch in 
such way. If so, there will endless discussion.

void x86_cpu_plug(HotplugHandler *hotplug_dev,
                   DeviceState *dev, Error **errp)
{
     CPUArchId *found_cpu;
     Error *local_err = NULL;
     X86CPU *cpu = X86_CPU(dev);
     X86MachineState *x86ms = X86_MACHINE(hotplug_dev);

     if (x86ms->acpi_dev) {
         hotplug_handler_plug(x86ms->acpi_dev, dev, &local_err);

static void pc_memory_plug(HotplugHandler *hotplug_dev,
                            DeviceState *dev, Error **errp)
{
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
     X86MachineState *x86ms = X86_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
     bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);

     pc_dimm_plug(PC_DIMM(dev), MACHINE(pcms));

     if (is_nvdimm) {
         nvdimm_plug(ms->nvdimms_state);
     }

     hotplug_handler_plug(x86ms->acpi_dev, dev, &error_abort);
}

Regards
Bibo Mao
> 
> Error recovery that is unreachable now but might become reachable at
> some future time is untestable now.  Mind, this does not necessarily
> make it a bad idea by itself.  But there's more.
> 
> Anything that makes this error recovery reachable either breaks it or
> makes correctness locally unobvious.  Why?  To make it reachable, plug /
> unplug must be able to fail on the happy path.  But then they either can
> fail on the error recovery path as well (which breaks error recovery),
> or they can't fail there for reasons that are not locally obvious.
> 
> This sets a trap for readers.  An attentive reader will see the problem
> (like I did), but to see why the code is not broken right now will take
> digging (like we did together).  And after such digging, we're left with
> a queasy feeling about robustness of the code (like we are now).
> 
> Passing &error_abort on the happy path avoids all this.  Instead it
> clearly tells the reader that this is not expected to fail.
> 
> If failure becomes possible at some future time, this should crash in
> testing.  If we neglect to test the new failure (and we really
> shouldn't), we crash on error in production right away instead of
> risking botched error recovery messing up the program's state.
> Both are bad outcomes, but which one's less bad I find impossible to
> predict.
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  7:35         ` bibo mao
  2025-03-21  8:08           ` Markus Armbruster
@ 2025-04-04 11:59           ` Igor Mammedov
  2025-04-08  2:07             ` bibo mao
  1 sibling, 1 reply; 18+ messages in thread
From: Igor Mammedov @ 2025-04-04 11:59 UTC (permalink / raw)
  To: bibo mao
  Cc: Markus Armbruster, Song Gao, Jiaxun Yang, qemu-devel,
	Paolo Bonzini

On Fri, 21 Mar 2025 15:35:37 +0800
bibo mao <maobibo@loongson.cn> wrote:

> On 2025/3/21 下午3:21, Markus Armbruster wrote:
> > bibo mao <maobibo@loongson.cn> writes:
> >   
> >> +Igor
> >>
> >>
> >> On 2025/3/21 下午2:47, Markus Armbruster wrote:  
> >>> Bibo Mao <maobibo@loongson.cn> writes:
> >>>  
> >>>> In function virt_cpu_unplug(), it will send cpu unplug message to
> >>>> interrupt controller extioi and ipi irqchip. If there is problem in
> >>>> this function, system should continue to run and keep state the same
> >>>> before cpu is removed.
> >>>>
> >>>> If error happends in cpu unplug stage, send cpu plug message to extioi
> >>>> and ipi irqchip to restore to previous stage, and then return immediately.
> >>>>
> >>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
> >>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> >>>> ---
> >>>>    hw/loongarch/virt.c | 6 ++++++
> >>>>    1 file changed, 6 insertions(+)
> >>>>
> >>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
> >>>> index 8563967c8b..503362a69e 100644
> >>>> --- a/hw/loongarch/virt.c
> >>>> +++ b/hw/loongarch/virt.c
> >>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
> >>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
> >>>>        if (err) {
> >>>>            error_propagate(errp, err);
> >>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> >>>> +                             &error_abort);
> >>>>            return;
> >>>>        }
> >>>>    
> >>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
> >>>>        hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
> >>>>        if (err) {
> >>>>            error_propagate(errp, err);
> >>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> >>>> +                             &error_abort);
> >>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
> >>>> +                             &error_abort);
> >>>>            return;
> >>>>        }  
> >>>
> >>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
> >>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
> >>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
> >>> This must not fail.
> >>>
> >>> virt_cpu_plug() does it the other way round (see previous patch).
> >>>
> >>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
> >>> check for it to fail in virt_cpu_plug().
> >>>
> >>> Can it really fail in virt_cpu_plug()?
> >>>
> >>> If yes, why can't it fail in virt_cpu_unplug()?  
> >> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
> >> is cpuplug callback for acpi_ged and ipi. it will not fail.
> >>
> >> If *virt_cpu_pre_plug()* pass, it will succeed.
> >>
> >> Regards
> >> Bibo Mao
> >>  
> >>>
> >>> Same questions for hotplug_handler_unplug().  
> > 
> > Let me restate my argument.
> > 
> > We call hotplug_handler_plug() on the happy path, and on error recovery
> > paths.  Four cases:
> > 
> > 1. Can fail on the happy path
> > 
> >     Error recovery is required.
> > 
> > 1.1 Can fail on the error recovery path
> > 
> >      Error recovery is required, but broken.
> > 
> > 1.2 Can't fail on the error recovery path
> > 
> >      Error recovery is required and works, but why it works is not
> >      obvious.  Deserves a comment explaining why hotplug_handler_plug()
> >      can't fail here even though it can fail on the happy path next door.
> > 
> > 2. Can't fail on the happy path
> > 
> >     Error recovery is unreachable.
> > 
> > 2.1 Can fail on the error recovery path
> > 
> >      Error recovery is unreachable and broken.  Possibly a time bomb, and
> >      possibly misleading readers.
> > 
> > 2.2 Can't fail on the error recovery path
> > 
> >      Error recovery is unreachable and would work, but why it would work
> >      is again a not obvious.
> > 
> > Which of the four cases is it?  
> By my understanding, it is "2. Can't fail on the happy path",  and Error 
> recovery is unreachable.
> 
> I have said that it is impossible and recovery is only for future use.

_plug() handler can't fail hence error_abort.
the same likely goes for _unplug() though I haven't audited it's usage so I won't bet here.
In cpu/mem cases it shall not fail, but there were cases where device_del bypasses
_unplug_request and calls _unplug directly (and those should be re-checked)

handlers that can fail and require graceful recovery are _pre and _request variants.

wrt: _plug() handler, we shall drop errp argument across the tree so
it won't confuse anyone, smth like this:
   hotplug_handler_plug(otplugHandler *hotplug_dev, DeviceState *dev)
and then fixup callers to do the same.

Bibo,
can you take care of that?


Perhaps also check _unplug path and if it shall not fail either, clean it up as well.

> 
> do you mean recovery should be removed? And directly &error_abort is 
> used in virt_cpu_plug() such as:
> static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>                            DeviceState *dev, Error **errp)
> {
>    if (lvms->ipi) {
>      hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);
> 
> Regards
> Bibo Mao
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug
  2025-03-21  3:12 ` [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug Bibo Mao
@ 2025-04-04 12:06   ` Igor Mammedov
  0 siblings, 0 replies; 18+ messages in thread
From: Igor Mammedov @ 2025-04-04 12:06 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Song Gao, Markus Armbruster, Jiaxun Yang, qemu-devel,
	Paolo Bonzini

On Fri, 21 Mar 2025 11:12:55 +0800
Bibo Mao <maobibo@loongson.cn> wrote:

> In function virt_cpu_plug(), it will send cpu plug message to interrupt
> controller extioi and ipi irqchip. If there is problem in this function,
> system should continue to run and keep state the same before cpu is
> added.


_plug is not supposed to fail ever,
hence we are using error_abort.

so I'd drop those chunks 

> Object cpuslot::cpu is set at last only when there is no any error.
> If there is, send cpu unplug message to extioi and ipi irqchip, and then
> return immediately.

it doesn't matter when it's set since you are already in _plug() handler.
both way are correct.

(I'd set it as the last if calls to chained handlers do not depend on it)

> 
> Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface)
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  hw/loongarch/virt.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
> index a5840ff968..8563967c8b 100644
> --- a/hw/loongarch/virt.c
> +++ b/hw/loongarch/virt.c
> @@ -981,8 +981,6 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>      LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev);
>      Error *err = NULL;
>  
> -    cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
> -    cpu_slot->cpu = CPU(dev);
>      if (lvms->ipi) {
>          hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &err);
>          if (err) {
> @@ -995,6 +993,10 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>          hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>          if (err) {
>              error_propagate(errp, err);
> +            if (lvms->ipi) {
> +                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                                       &error_abort);
> +            }
>              return;
>          }
>      }
> @@ -1003,9 +1005,21 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>          hotplug_handler_plug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>          if (err) {
>              error_propagate(errp, err);
> +            if (lvms->ipi) {
> +                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                                       &error_abort);
> +            }
> +
> +            if (lvms->extioi) {
> +                hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev,
> +                                       &error_abort);
> +            }
> +            return;
>          }
>      }
>  
> +    cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
> +    cpu_slot->cpu = CPU(dev);
>      return;
>  }
>  



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-03-21  3:12 ` [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug Bibo Mao
  2025-03-21  6:47   ` Markus Armbruster
@ 2025-04-04 12:08   ` Igor Mammedov
  1 sibling, 0 replies; 18+ messages in thread
From: Igor Mammedov @ 2025-04-04 12:08 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Song Gao, Markus Armbruster, Jiaxun Yang, qemu-devel,
	Paolo Bonzini

On Fri, 21 Mar 2025 11:12:56 +0800
Bibo Mao <maobibo@loongson.cn> wrote:

> In function virt_cpu_unplug(), it will send cpu unplug message to
> interrupt controller extioi and ipi irqchip. If there is problem in
> this function, system should continue to run and keep state the same
> before cpu is removed.

see my comment to 2/6, the same applies here.
(unless there is a good reason why it can fail)
 
> 
> If error happends in cpu unplug stage, send cpu plug message to extioi
> and ipi irqchip to restore to previous stage, and then return immediately.
> 
> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  hw/loongarch/virt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
> index 8563967c8b..503362a69e 100644
> --- a/hw/loongarch/virt.c
> +++ b/hw/loongarch/virt.c
> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>      hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>      if (err) {
>          error_propagate(errp, err);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                             &error_abort);
>          return;
>      }
>  
> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>      hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>      if (err) {
>          error_propagate(errp, err);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
> +                             &error_abort);
> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
> +                             &error_abort);
>          return;
>      }
>  



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug
  2025-04-04 11:59           ` Igor Mammedov
@ 2025-04-08  2:07             ` bibo mao
  0 siblings, 0 replies; 18+ messages in thread
From: bibo mao @ 2025-04-08  2:07 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Markus Armbruster, Song Gao, Jiaxun Yang, qemu-devel,
	Paolo Bonzini



On 2025/4/4 下午7:59, Igor Mammedov wrote:
> On Fri, 21 Mar 2025 15:35:37 +0800
> bibo mao <maobibo@loongson.cn> wrote:
> 
>> On 2025/3/21 下午3:21, Markus Armbruster wrote:
>>> bibo mao <maobibo@loongson.cn> writes:
>>>    
>>>> +Igor
>>>>
>>>>
>>>> On 2025/3/21 下午2:47, Markus Armbruster wrote:
>>>>> Bibo Mao <maobibo@loongson.cn> writes:
>>>>>   
>>>>>> In function virt_cpu_unplug(), it will send cpu unplug message to
>>>>>> interrupt controller extioi and ipi irqchip. If there is problem in
>>>>>> this function, system should continue to run and keep state the same
>>>>>> before cpu is removed.
>>>>>>
>>>>>> If error happends in cpu unplug stage, send cpu plug message to extioi
>>>>>> and ipi irqchip to restore to previous stage, and then return immediately.
>>>>>>
>>>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
>>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>>>>> ---
>>>>>>     hw/loongarch/virt.c | 6 ++++++
>>>>>>     1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>>>>>> index 8563967c8b..503362a69e 100644
>>>>>> --- a/hw/loongarch/virt.c
>>>>>> +++ b/hw/loongarch/virt.c
>>>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>>         hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
>>>>>>         if (err) {
>>>>>>             error_propagate(errp, err);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>>> +                             &error_abort);
>>>>>>             return;
>>>>>>         }
>>>>>>     
>>>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
>>>>>>         hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
>>>>>>         if (err) {
>>>>>>             error_propagate(errp, err);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
>>>>>> +                             &error_abort);
>>>>>> +        hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
>>>>>> +                             &error_abort);
>>>>>>             return;
>>>>>>         }
>>>>>
>>>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
>>>>> ipi, extioi, and acpi_get.  If any notification fails, virt_cpu_unplug()
>>>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
>>>>> This must not fail.
>>>>>
>>>>> virt_cpu_plug() does it the other way round (see previous patch).
>>>>>
>>>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
>>>>> check for it to fail in virt_cpu_plug().
>>>>>
>>>>> Can it really fail in virt_cpu_plug()?
>>>>>
>>>>> If yes, why can't it fail in virt_cpu_unplug()?
>>>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
>>>> is cpuplug callback for acpi_ged and ipi. it will not fail.
>>>>
>>>> If *virt_cpu_pre_plug()* pass, it will succeed.
>>>>
>>>> Regards
>>>> Bibo Mao
>>>>   
>>>>>
>>>>> Same questions for hotplug_handler_unplug().
>>>
>>> Let me restate my argument.
>>>
>>> We call hotplug_handler_plug() on the happy path, and on error recovery
>>> paths.  Four cases:
>>>
>>> 1. Can fail on the happy path
>>>
>>>      Error recovery is required.
>>>
>>> 1.1 Can fail on the error recovery path
>>>
>>>       Error recovery is required, but broken.
>>>
>>> 1.2 Can't fail on the error recovery path
>>>
>>>       Error recovery is required and works, but why it works is not
>>>       obvious.  Deserves a comment explaining why hotplug_handler_plug()
>>>       can't fail here even though it can fail on the happy path next door.
>>>
>>> 2. Can't fail on the happy path
>>>
>>>      Error recovery is unreachable.
>>>
>>> 2.1 Can fail on the error recovery path
>>>
>>>       Error recovery is unreachable and broken.  Possibly a time bomb, and
>>>       possibly misleading readers.
>>>
>>> 2.2 Can't fail on the error recovery path
>>>
>>>       Error recovery is unreachable and would work, but why it would work
>>>       is again a not obvious.
>>>
>>> Which of the four cases is it?
>> By my understanding, it is "2. Can't fail on the happy path",  and Error
>> recovery is unreachable.
>>
>> I have said that it is impossible and recovery is only for future use.
> 
> _plug() handler can't fail hence error_abort.
> the same likely goes for _unplug() though I haven't audited it's usage so I won't bet here.
> In cpu/mem cases it shall not fail, but there were cases where device_del bypasses
> _unplug_request and calls _unplug directly (and those should be re-checked)
> 
> handlers that can fail and require graceful recovery are _pre and _request variants.
> 
> wrt: _plug() handler, we shall drop errp argument across the tree so
> it won't confuse anyone, smth like this:
>     hotplug_handler_plug(otplugHandler *hotplug_dev, DeviceState *dev)
> and then fixup callers to do the same.
> 
> Bibo,
> can you take care of that?

Sure, I will try to understand the plug/unplug workflow about all 
devices, and remove parameter error in generic function 
hotplug_handler_plug() after qemu 10.0 is released, the same for 
_unplug() function.

Regards
Bibo Mao

> 
> 
> Perhaps also check _unplug path and if it shall not fail either, clean it up as well.
> 
>>
>> do you mean recovery should be removed? And directly &error_abort is
>> used in virt_cpu_plug() such as:
>> static void virt_cpu_plug(HotplugHandler *hotplug_dev,
>>                             DeviceState *dev, Error **errp)
>> {
>>     if (lvms->ipi) {
>>       hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);
>>
>> Regards
>> Bibo Mao
>>
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-04-08  2:09 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-21  3:12 [PATCH v6 0/6] target/loongarch: Fix some issues reported from coccinelle Bibo Mao
2025-03-21  3:12 ` [PATCH v6 1/6] target/loongarch: Fix error handling of KVM feature checks Bibo Mao
2025-03-21  3:12 ` [PATCH v6 2/6] hw/loongarch/virt: Fix error handling in cpu plug Bibo Mao
2025-04-04 12:06   ` Igor Mammedov
2025-03-21  3:12 ` [PATCH v6 3/6] hw/loongarch/virt: Fix error handling in cpu unplug Bibo Mao
2025-03-21  6:47   ` Markus Armbruster
2025-03-21  7:00     ` bibo mao
2025-03-21  7:21       ` Markus Armbruster
2025-03-21  7:35         ` bibo mao
2025-03-21  8:08           ` Markus Armbruster
2025-03-21  8:21             ` bibo mao
2025-04-04 11:59           ` Igor Mammedov
2025-04-08  2:07             ` bibo mao
2025-03-21  7:09     ` bibo mao
2025-04-04 12:08   ` Igor Mammedov
2025-03-21  3:12 ` [PATCH v6 4/6] hw/loongarch/virt: Eliminate error_propagate() Bibo Mao
2025-03-21  3:12 ` [PATCH v6 5/6] target/loongarch: Remove unnecessary temporary variable assignment Bibo Mao
2025-03-21  3:12 ` [PATCH v6 6/6] target/loongarch: Clean up virt_cpu_irq_init() error handling Bibo Mao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).