qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [0/2] Allow machine to control ordering of reset
@ 2012-08-02  2:10 David Gibson
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing David Gibson
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence David Gibson
  0 siblings, 2 replies; 35+ messages in thread
From: David Gibson @ 2012-08-02  2:10 UTC (permalink / raw)
  To: anthony; +Cc: qemu-devel, agraf

As discussed with Anthony on a call this morning, there are cases
where the machine really needs to explicitly control the ordering of
various steps of a system_reset.  This is currently possible only very
indirectly and to a limited extent.  These patches add support for
having the machine take explicit control of the reset ordering, and
use it to fix a number of reset bugs and near-bugs (i.e. works now,
but largely by accident) in the pseries code.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  2:10 [Qemu-devel] [0/2] Allow machine to control ordering of reset David Gibson
@ 2012-08-02  2:10 ` David Gibson
  2012-08-02  2:37   ` Anthony Liguori
  2012-08-02 15:00   ` Lluís Vilanova
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence David Gibson
  1 sibling, 2 replies; 35+ messages in thread
From: David Gibson @ 2012-08-02  2:10 UTC (permalink / raw)
  To: anthony; +Cc: David Gibson, qemu-devel, agraf

At present the qemu_system_reset() function always performs the same basic
actions on all machines.  This includes running all the reset handler
hooks, however the order in which they run is not controlled by the board
logic.

This is incorrect: any modern real hardware will generally have some sort
of reset controller, sometimes a full microcontroller or even service
processor which will control the order in which components on the board are
reset.

For one specific example, the machine level reset handlers may need to
re-establish certain things in memory to meet the emulated platform's
specified entry conditions, before reexecuting the guest software.
re-establish the correct specified entry conditions for the emulated
platform.  This must happen _after_ resetting peripheral devices, or they
could still be in the middle of DMA operations which would clobber the new
memory content.  However with the normal ordering of reset hooks, the
machine is not able to register a hook which will run after the normal
qdev reset registered by generic code.

This patch allows the machine to take control of the sequence of operations
during reset, by adding a new reset hook to the QEMUMachine.  This hook is
optional - without it we just use the same reset sequence as always.  That
logic is available in qemu_default_system_reset() to make it easy to
implement the case of the machine logic just needing to do some things
either before or after the normal reset.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/boards.h |    3 +++
 sysemu.h    |    1 +
 vl.c        |   10 +++++++++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/boards.h b/hw/boards.h
index 59c01d0..f2bfd3e 100644
--- a/hw/boards.h
+++ b/hw/boards.h
@@ -12,11 +12,14 @@ typedef void QEMUMachineInitFunc(ram_addr_t ram_size,
                                  const char *initrd_filename,
                                  const char *cpu_model);
 
+typedef void QEMUMachineResetFunc(bool report);
+
 typedef struct QEMUMachine {
     const char *name;
     const char *alias;
     const char *desc;
     QEMUMachineInitFunc *init;
+    QEMUMachineResetFunc *reset;
     int use_scsi;
     int max_cpus;
     unsigned int no_serial:1,
diff --git a/sysemu.h b/sysemu.h
index 6540c79..f74319b 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -62,6 +62,7 @@ int qemu_powerdown_requested(void);
 void qemu_system_killed(int signal, pid_t pid);
 void qemu_kill_report(void);
 extern qemu_irq qemu_system_powerdown;
+void qemu_default_system_reset(bool report);
 void qemu_system_reset(bool report);
 
 void qemu_add_exit_notifier(Notifier *notify);
diff --git a/vl.c b/vl.c
index 9fea320..ac47a7c 100644
--- a/vl.c
+++ b/vl.c
@@ -1396,7 +1396,7 @@ void qemu_unregister_reset(QEMUResetHandler *func, void *opaque)
     }
 }
 
-void qemu_system_reset(bool report)
+void qemu_default_system_reset(bool report)
 {
     QEMUResetEntry *re, *nre;
 
@@ -1410,6 +1410,14 @@ void qemu_system_reset(bool report)
     cpu_synchronize_all_post_reset();
 }
 
+void qemu_system_reset(bool report)
+{
+    if (current_machine->reset)
+        current_machine->reset(report);
+    else
+        qemu_default_system_reset(report);
+}
+
 void qemu_system_reset_request(void)
 {
     if (no_reboot) {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02  2:10 [Qemu-devel] [0/2] Allow machine to control ordering of reset David Gibson
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing David Gibson
@ 2012-08-02  2:10 ` David Gibson
  2012-08-02 15:44   ` Andreas Färber
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2012-08-02  2:10 UTC (permalink / raw)
  To: anthony; +Cc: David Gibson, qemu-devel, agraf

A number of things need to occur during reset of the PAPR paravirtualized
platform in a specific order.  For example, the hash table needs to be
cleared before the CPUs are reset, so that they initialize their register
state correctly, and the CPUs need to have their main reset called before
we set up the entry point state on the boot cpu.  We also need to have
the main qdev reset happen before the creation and installation of the
device tree for the new boot, because we need the state of the devices
settled to correctly construct the device tree.

Currently reset of pseries is broken in a number of ways, and in other
cases works largely by accident. This patch uses the new QEMUMachine reset
hook to correct these problems, by replacing the several existing spapr
reset hooks with one new machine hook which ensures that the various stages
happen in the correct order.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/spapr.c |   66 +++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 30 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 2453bae..1e60ec1 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -582,29 +582,22 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
     }
 }
 
-static void spapr_reset(void *opaque)
+static void spapr_reset_cpu(CPUPPCState *env)
 {
-    sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
-
-    /* Reset the hash table & recalc the RMA */
-    spapr_reset_htab(spapr);
-
-    /* Load the fdt */
-    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
-                       spapr->rtas_size);
-}
-
-static void spapr_cpu_reset(void *opaque)
-{
-    PowerPCCPU *cpu = opaque;
-    CPUPPCState *env = &cpu->env;
+    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);
 
     cpu_reset(CPU(cpu));
 
     env->external_htab = spapr->htab;
     env->htab_base = -1;
     env->htab_mask = HTAB_SIZE(spapr) - 1;
+    /* CPUs need to start halted at reset, the platform reset code
+     * will activate CPU0 then the rest are explicitly started by the
+     * guest using RTAS */
+    env->halted = 1;
 
+    /* Secondary CPUs get the CPU ID in r3 on entry */
+    env->gpr[3] = env->cpu_index;
     env->spr[SPR_SDR1] = (unsigned long)spapr->htab |
         (spapr->htab_shift - 18);
 
@@ -612,14 +605,35 @@ static void spapr_cpu_reset(void *opaque)
         kvmppc_update_sdr1(env);
     }
 
-    /* Set up the entry state */
-    if (env == first_cpu) {
-        env->gpr[3] = spapr->fdt_addr;
-        env->gpr[5] = 0;
-        env->halted = 0;
-        env->nip = spapr->entry_point;
+    tb_flush(env);
+}
+
+static void spapr_reset(bool report)
+{
+    CPUPPCState *env = first_cpu;
+
+    /* Reset the qdevs */
+    qemu_default_system_reset(report);
+
+    /* Reset the hash table & recalc the RMA */
+    spapr_reset_htab(spapr);
+
+    /* Reset the CPUs */
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        spapr_reset_cpu(env);
     }
 
+    /* Load the fdt */
+    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
+                       spapr->rtas_size);
+
+    /* Set up the entry state on CPU0 */
+    env = first_cpu;
+
+    env->gpr[3] = spapr->fdt_addr;
+    env->gpr[5] = 0;
+    env->halted = 0;
+    env->nip = spapr->entry_point;
     tb_flush(env);
 }
 
@@ -718,8 +732,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
     /* FIXME: we should change this default based on RAM size */
     spapr->htab_shift = 24;
 
-    qemu_register_reset(spapr_reset, spapr);
-
     /* init CPUs */
     if (cpu_model == NULL) {
         cpu_model = kvm_enabled() ? "host" : "POWER7";
@@ -734,11 +746,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 
         /* Set time-base frequency to 512 MHz */
         cpu_ppc_tb_init(env, TIMEBASE_FREQ);
-        qemu_register_reset(spapr_cpu_reset, cpu);
 
         env->hreset_vector = 0x60;
         env->hreset_excp_prefix = 0;
-        env->gpr[3] = env->cpu_index;
     }
 
     /* allocate RAM */
@@ -883,11 +893,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 
     spapr->entry_point = 0x100;
 
-    /* SLOF will startup the secondary CPUs using RTAS */
-    for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        env->halted = 1;
-    }
-
     /* Prepare the device tree */
     spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
                                             initrd_base, initrd_size,
@@ -900,6 +905,7 @@ static QEMUMachine spapr_machine = {
     .name = "pseries",
     .desc = "pSeries Logical Partition (PAPR compliant)",
     .init = ppc_spapr_init,
+    .reset = spapr_reset,
     .max_cpus = MAX_CPUS,
     .no_parallel = 1,
     .use_scsi = 1,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing David Gibson
@ 2012-08-02  2:37   ` Anthony Liguori
  2012-08-02  3:50     ` Benjamin Herrenschmidt
  2012-08-03  3:08     ` David Gibson
  2012-08-02 15:00   ` Lluís Vilanova
  1 sibling, 2 replies; 35+ messages in thread
From: Anthony Liguori @ 2012-08-02  2:37 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, agraf

David Gibson <david@gibson.dropbear.id.au> writes:

> At present the qemu_system_reset() function always performs the same basic
> actions on all machines.  This includes running all the reset handler
> hooks, however the order in which they run is not controlled by the board
> logic.

Let's be careful here in referring to "board logic".

Reset should propagate--there's no argument there.  Having all devices
register in a list with no control over how reset is dispatched is
wrong.  It's workable because of qdev imposes per-bus reset functions
and typically we have a small number of busses and ordering doesn't
matter.

> This is incorrect: any modern real hardware will generally have some sort
> of reset controller, sometimes a full microcontroller or even service
> processor which will control the order in which components on the board are
> reset.

I think this is true only with a loose definition of "modern", but okay.

I don't think this is the right argument.  The machine should serve as
the entry point for reset.  What components get reset in what order will
need to be a machine hook for practical purposes (since not everything
will be fully QOMized all at once).

So I think this is the right approach for now.

But all of the DT initialization stuff that is leading to this
discussion in the first place is a gross hack to make a PV architecture
work that took far too many short cuts.

Building a DT in memory representing hardware instead of making things
discoverable is not how hardware works.  This sort of stuff is either
(1) hard coded in a firmware/flashrom or (2) built dynamically in
firmware.  Let's not pretend like we're doing this because it's needed
for real hardware.

Regards,

Anthony Liguori

> For one specific example, the machine level reset handlers may need to
> re-establish certain things in memory to meet the emulated platform's
> specified entry conditions, before reexecuting the guest software.
> re-establish the correct specified entry conditions for the emulated
> platform.  This must happen _after_ resetting peripheral devices, or they
> could still be in the middle of DMA operations which would clobber the new
> memory content.  However with the normal ordering of reset hooks, the
> machine is not able to register a hook which will run after the normal
> qdev reset registered by generic code.
>
> This patch allows the machine to take control of the sequence of operations
> during reset, by adding a new reset hook to the QEMUMachine.  This hook is
> optional - without it we just use the same reset sequence as always.  That
> logic is available in qemu_default_system_reset() to make it easy to
> implement the case of the machine logic just needing to do some things
> either before or after the normal reset.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/boards.h |    3 +++
>  sysemu.h    |    1 +
>  vl.c        |   10 +++++++++-
>  3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/hw/boards.h b/hw/boards.h
> index 59c01d0..f2bfd3e 100644
> --- a/hw/boards.h
> +++ b/hw/boards.h
> @@ -12,11 +12,14 @@ typedef void QEMUMachineInitFunc(ram_addr_t ram_size,
>                                   const char *initrd_filename,
>                                   const char *cpu_model);
>  
> +typedef void QEMUMachineResetFunc(bool report);
> +
>  typedef struct QEMUMachine {
>      const char *name;
>      const char *alias;
>      const char *desc;
>      QEMUMachineInitFunc *init;
> +    QEMUMachineResetFunc *reset;
>      int use_scsi;
>      int max_cpus;
>      unsigned int no_serial:1,
> diff --git a/sysemu.h b/sysemu.h
> index 6540c79..f74319b 100644
> --- a/sysemu.h
> +++ b/sysemu.h
> @@ -62,6 +62,7 @@ int qemu_powerdown_requested(void);
>  void qemu_system_killed(int signal, pid_t pid);
>  void qemu_kill_report(void);
>  extern qemu_irq qemu_system_powerdown;
> +void qemu_default_system_reset(bool report);
>  void qemu_system_reset(bool report);
>  
>  void qemu_add_exit_notifier(Notifier *notify);
> diff --git a/vl.c b/vl.c
> index 9fea320..ac47a7c 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1396,7 +1396,7 @@ void qemu_unregister_reset(QEMUResetHandler *func, void *opaque)
>      }
>  }
>  
> -void qemu_system_reset(bool report)
> +void qemu_default_system_reset(bool report)
>  {
>      QEMUResetEntry *re, *nre;
>  
> @@ -1410,6 +1410,14 @@ void qemu_system_reset(bool report)
>      cpu_synchronize_all_post_reset();
>  }

Not a huge fan of the naming here.  How about qemu_devices_reset()?

> +void qemu_system_reset(bool report)
> +{
> +    if (current_machine->reset)
> +        current_machine->reset(report);
> +    else
> +        qemu_default_system_reset(report);
> +}

Missing curly braces.

Regards,

Anthony Liguori

> +
>  void qemu_system_reset_request(void)
>  {
>      if (no_reboot) {
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  2:37   ` Anthony Liguori
@ 2012-08-02  3:50     ` Benjamin Herrenschmidt
  2012-08-02 15:17       ` Paolo Bonzini
  2012-08-03  3:08     ` David Gibson
  1 sibling, 1 reply; 35+ messages in thread
From: Benjamin Herrenschmidt @ 2012-08-02  3:50 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: agraf, qemu-devel, David Gibson

On Wed, 2012-08-01 at 21:37 -0500, Anthony Liguori wrote:
> 
> But all of the DT initialization stuff that is leading to this
> discussion in the first place is a gross hack to make a PV
> architecture
> work that took far too many short cuts.
> 
> Building a DT in memory representing hardware instead of making things
> discoverable is not how hardware works. 

It depends :-)

Take a Power7 machine for example... the system & processor are actually
initialized by the service processor which loads some kind of similar
data structure into memory before starting the initial firmware :-)

So from the P7 itself point of view, it starts with a memory based data
structure pre-build.

Also some processors have a pretty much turing complete "POR engine"
which executes code from some kind of EEPROM to initialize the system,
cores & busses on reset, which can involve writing things to memory as
well (though it generally doesn't).

>  This sort of stuff is either
> (1) hard coded in a firmware/flashrom or (2) built dynamically in
> firmware.  Let's not pretend like we're doing this because it's needed
> for real hardware.

Doesn't matter, we do things like -kernel which means pre-loading the
kernel from qemu, even on x86. That doesn't match real HW either, but
it's convenient to have. But overall there are real HW reasons to
control the reset as well so the hook makes sense both ways.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing David Gibson
  2012-08-02  2:37   ` Anthony Liguori
@ 2012-08-02 15:00   ` Lluís Vilanova
  2012-08-03  2:25     ` David Gibson
  1 sibling, 1 reply; 35+ messages in thread
From: Lluís Vilanova @ 2012-08-02 15:00 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, anthony, agraf

David Gibson writes:
[...]
> diff --git a/vl.c b/vl.c
> index 9fea320..ac47a7c 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1410,6 +1410,14 @@ void qemu_system_reset(bool report)
>      cpu_synchronize_all_post_reset();
>  }
 
> +void qemu_system_reset(bool report)
> +{
> +    if (current_machine->reset)
> +        current_machine->reset(report);
> +    else
> +        qemu_default_system_reset(report);
> +}
> +
>  void qemu_system_reset_request(void)
>  {
>      if (no_reboot) {

Nitpick: you're missing the braces.


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  3:50     ` Benjamin Herrenschmidt
@ 2012-08-02 15:17       ` Paolo Bonzini
  2012-08-02 20:46         ` Benjamin Herrenschmidt
  2012-08-03  2:54         ` David Gibson
  0 siblings, 2 replies; 35+ messages in thread
From: Paolo Bonzini @ 2012-08-02 15:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: David Gibson, agraf, Anthony Liguori, qemu-devel

Il 02/08/2012 05:50, Benjamin Herrenschmidt ha scritto:
> 
>> >  This sort of stuff is either
>> > (1) hard coded in a firmware/flashrom or (2) built dynamically in
>> > firmware.  Let's not pretend like we're doing this because it's needed
>> > for real hardware.
> Doesn't matter, we do things like -kernel which means pre-loading the
> kernel from qemu, even on x86. That doesn't match real HW either, but
> it's convenient to have. But overall there are real HW reasons to
> control the reset as well so the hook makes sense both ways.

On x86 we do not pre-load the kernel.  Neither the kernel/initramfs and
the option ROM that loads the kernel are written in memory, they are
passed to the guest via fw_cfg.  Then the option ROM is loaded by the
BIOS, and loads the kernel/initramfs.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02  2:10 ` [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence David Gibson
@ 2012-08-02 15:44   ` Andreas Färber
  2012-08-02 18:29     ` Anthony Liguori
  2012-08-03  2:31     ` David Gibson
  0 siblings, 2 replies; 35+ messages in thread
From: Andreas Färber @ 2012-08-02 15:44 UTC (permalink / raw)
  To: David Gibson; +Cc: Igor Mammedov, qemu-devel, anthony, agraf

Am 02.08.2012 04:10, schrieb David Gibson:
> A number of things need to occur during reset of the PAPR paravirtualized
> platform in a specific order.  For example, the hash table needs to be
> cleared before the CPUs are reset, so that they initialize their register
> state correctly, and the CPUs need to have their main reset called before
> we set up the entry point state on the boot cpu.  We also need to have
> the main qdev reset happen before the creation and installation of the
> device tree for the new boot, because we need the state of the devices
> settled to correctly construct the device tree.
> 
> Currently reset of pseries is broken in a number of ways, and in other
> cases works largely by accident. This patch uses the new QEMUMachine reset
> hook to correct these problems, by replacing the several existing spapr
> reset hooks with one new machine hook which ensures that the various stages
> happen in the correct order.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/spapr.c |   66 +++++++++++++++++++++++++++++++++---------------------------
>  1 file changed, 36 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/spapr.c b/hw/spapr.c
> index 2453bae..1e60ec1 100644
> --- a/hw/spapr.c
> +++ b/hw/spapr.c
> @@ -582,29 +582,22 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
>      }
>  }
>  
> -static void spapr_reset(void *opaque)
> +static void spapr_reset_cpu(CPUPPCState *env)
>  {
> -    sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
> -
> -    /* Reset the hash table & recalc the RMA */
> -    spapr_reset_htab(spapr);
> -
> -    /* Load the fdt */
> -    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
> -                       spapr->rtas_size);
> -}
> -
> -static void spapr_cpu_reset(void *opaque)
> -{
> -    PowerPCCPU *cpu = opaque;
> -    CPUPPCState *env = &cpu->env;
> +    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);

NACK. Please don't undo the cleanups I have applied! Functions should
take a QOM PowerPCCPU, not its internal CPUPPCState. Fields are
gradually being moved from CPUxxxState into CPUState.

>  
>      cpu_reset(CPU(cpu));

Also note the current discussion about CPU reset and ordering, e.g.:
http://patchwork.ozlabs.org/patch/174602/

Anthony was favoring moving reset code out of machines and expressed
dislike for looping through CPUs, which my above patch took into
account. The ordering issue between CPU and devices is still unsolved there.

Some on-list comments from Anthony would be nice, since we are moving
into opposing directions here - having the sPAPR machine be more in
control vs. moving code away from the PC machine into target-i386 CPU
and/or common CPU code.

Cheers,
Andreas

>  
>      env->external_htab = spapr->htab;
>      env->htab_base = -1;
>      env->htab_mask = HTAB_SIZE(spapr) - 1;
> +    /* CPUs need to start halted at reset, the platform reset code
> +     * will activate CPU0 then the rest are explicitly started by the
> +     * guest using RTAS */
> +    env->halted = 1;
>  
> +    /* Secondary CPUs get the CPU ID in r3 on entry */
> +    env->gpr[3] = env->cpu_index;
>      env->spr[SPR_SDR1] = (unsigned long)spapr->htab |
>          (spapr->htab_shift - 18);
>  
> @@ -612,14 +605,35 @@ static void spapr_cpu_reset(void *opaque)
>          kvmppc_update_sdr1(env);
>      }
>  
> -    /* Set up the entry state */
> -    if (env == first_cpu) {
> -        env->gpr[3] = spapr->fdt_addr;
> -        env->gpr[5] = 0;
> -        env->halted = 0;
> -        env->nip = spapr->entry_point;
> +    tb_flush(env);
> +}
> +
> +static void spapr_reset(bool report)
> +{
> +    CPUPPCState *env = first_cpu;
> +
> +    /* Reset the qdevs */
> +    qemu_default_system_reset(report);
> +
> +    /* Reset the hash table & recalc the RMA */
> +    spapr_reset_htab(spapr);
> +
> +    /* Reset the CPUs */
> +    for (env = first_cpu; env != NULL; env = env->next_cpu) {
> +        spapr_reset_cpu(env);
>      }
>  
> +    /* Load the fdt */
> +    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
> +                       spapr->rtas_size);
> +
> +    /* Set up the entry state on CPU0 */
> +    env = first_cpu;
> +
> +    env->gpr[3] = spapr->fdt_addr;
> +    env->gpr[5] = 0;
> +    env->halted = 0;
> +    env->nip = spapr->entry_point;
>      tb_flush(env);
>  }
>  
> @@ -718,8 +732,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>      /* FIXME: we should change this default based on RAM size */
>      spapr->htab_shift = 24;
>  
> -    qemu_register_reset(spapr_reset, spapr);
> -
>      /* init CPUs */
>      if (cpu_model == NULL) {
>          cpu_model = kvm_enabled() ? "host" : "POWER7";
> @@ -734,11 +746,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>  
>          /* Set time-base frequency to 512 MHz */
>          cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> -        qemu_register_reset(spapr_cpu_reset, cpu);
>  
>          env->hreset_vector = 0x60;
>          env->hreset_excp_prefix = 0;
> -        env->gpr[3] = env->cpu_index;
>      }
>  
>      /* allocate RAM */
> @@ -883,11 +893,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>  
>      spapr->entry_point = 0x100;
>  
> -    /* SLOF will startup the secondary CPUs using RTAS */
> -    for (env = first_cpu; env != NULL; env = env->next_cpu) {
> -        env->halted = 1;
> -    }
> -
>      /* Prepare the device tree */
>      spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
>                                              initrd_base, initrd_size,
> @@ -900,6 +905,7 @@ static QEMUMachine spapr_machine = {
>      .name = "pseries",
>      .desc = "pSeries Logical Partition (PAPR compliant)",
>      .init = ppc_spapr_init,
> +    .reset = spapr_reset,
>      .max_cpus = MAX_CPUS,
>      .no_parallel = 1,
>      .use_scsi = 1,
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 15:44   ` Andreas Färber
@ 2012-08-02 18:29     ` Anthony Liguori
  2012-08-02 18:38       ` Andreas Färber
  2012-08-03  2:31     ` David Gibson
  1 sibling, 1 reply; 35+ messages in thread
From: Anthony Liguori @ 2012-08-02 18:29 UTC (permalink / raw)
  To: Andreas Färber, David Gibson; +Cc: Igor Mammedov, qemu-devel, agraf

Andreas Färber <afaerber@suse.de> writes:

> Am 02.08.2012 04:10, schrieb David Gibson:
>> A number of things need to occur during reset of the PAPR paravirtualized
>> platform in a specific order.  For example, the hash table needs to be
>> cleared before the CPUs are reset, so that they initialize their register
>> state correctly, and the CPUs need to have their main reset called before
>> we set up the entry point state on the boot cpu.  We also need to have
>> the main qdev reset happen before the creation and installation of the
>> device tree for the new boot, because we need the state of the devices
>> settled to correctly construct the device tree.
>> 
>> Currently reset of pseries is broken in a number of ways, and in other
>> cases works largely by accident. This patch uses the new QEMUMachine reset
>> hook to correct these problems, by replacing the several existing spapr
>> reset hooks with one new machine hook which ensures that the various stages
>> happen in the correct order.
>> 
>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>> ---
>>  hw/spapr.c |   66 +++++++++++++++++++++++++++++++++---------------------------
>>  1 file changed, 36 insertions(+), 30 deletions(-)
>> 
>> diff --git a/hw/spapr.c b/hw/spapr.c
>> index 2453bae..1e60ec1 100644
>> --- a/hw/spapr.c
>> +++ b/hw/spapr.c
>> @@ -582,29 +582,22 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
>>      }
>>  }
>>  
>> -static void spapr_reset(void *opaque)
>> +static void spapr_reset_cpu(CPUPPCState *env)
>>  {
>> -    sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
>> -
>> -    /* Reset the hash table & recalc the RMA */
>> -    spapr_reset_htab(spapr);
>> -
>> -    /* Load the fdt */
>> -    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
>> -                       spapr->rtas_size);
>> -}
>> -
>> -static void spapr_cpu_reset(void *opaque)
>> -{
>> -    PowerPCCPU *cpu = opaque;
>> -    CPUPPCState *env = &cpu->env;
>> +    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);
>
> NACK. Please don't undo the cleanups I have applied! Functions should
> take a QOM PowerPCCPU, not its internal CPUPPCState. Fields are
> gradually being moved from CPUxxxState into CPUState.
>
>>  
>>      cpu_reset(CPU(cpu));
>
> Also note the current discussion about CPU reset and ordering, e.g.:
> http://patchwork.ozlabs.org/patch/174602/
>
> Anthony was favoring moving reset code out of machines and expressed
> dislike for looping through CPUs, which my above patch took into
> account. The ordering issue between CPU and devices is still unsolved there.
>
> Some on-list comments from Anthony would be nice, since we are moving
> into opposing directions here - having the sPAPR machine be more in
> control vs. moving code away from the PC machine into target-i386 CPU
> and/or common CPU code.

I already commented on the first patch because I had a feeling you'd
post something like this ;-)

Regarding reset:

1) Devices should implement DeviceState::reset()

2) If a device doesn't implement ::reset(), it should call
qemu_register_reset()

3) Reset should propagate through the device model, starting with the
top-level machine which is logically what's plugged into the wall and
is the source of power in the first place.

Regards,

Anthony Liguori

>
> Cheers,
> Andreas
>
>>  
>>      env->external_htab = spapr->htab;
>>      env->htab_base = -1;
>>      env->htab_mask = HTAB_SIZE(spapr) - 1;
>> +    /* CPUs need to start halted at reset, the platform reset code
>> +     * will activate CPU0 then the rest are explicitly started by the
>> +     * guest using RTAS */
>> +    env->halted = 1;
>>  
>> +    /* Secondary CPUs get the CPU ID in r3 on entry */
>> +    env->gpr[3] = env->cpu_index;
>>      env->spr[SPR_SDR1] = (unsigned long)spapr->htab |
>>          (spapr->htab_shift - 18);
>>  
>> @@ -612,14 +605,35 @@ static void spapr_cpu_reset(void *opaque)
>>          kvmppc_update_sdr1(env);
>>      }
>>  
>> -    /* Set up the entry state */
>> -    if (env == first_cpu) {
>> -        env->gpr[3] = spapr->fdt_addr;
>> -        env->gpr[5] = 0;
>> -        env->halted = 0;
>> -        env->nip = spapr->entry_point;
>> +    tb_flush(env);
>> +}
>> +
>> +static void spapr_reset(bool report)
>> +{
>> +    CPUPPCState *env = first_cpu;
>> +
>> +    /* Reset the qdevs */
>> +    qemu_default_system_reset(report);
>> +
>> +    /* Reset the hash table & recalc the RMA */
>> +    spapr_reset_htab(spapr);
>> +
>> +    /* Reset the CPUs */
>> +    for (env = first_cpu; env != NULL; env = env->next_cpu) {
>> +        spapr_reset_cpu(env);
>>      }
>>  
>> +    /* Load the fdt */
>> +    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
>> +                       spapr->rtas_size);
>> +
>> +    /* Set up the entry state on CPU0 */
>> +    env = first_cpu;
>> +
>> +    env->gpr[3] = spapr->fdt_addr;
>> +    env->gpr[5] = 0;
>> +    env->halted = 0;
>> +    env->nip = spapr->entry_point;
>>      tb_flush(env);
>>  }
>>  
>> @@ -718,8 +732,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>>      /* FIXME: we should change this default based on RAM size */
>>      spapr->htab_shift = 24;
>>  
>> -    qemu_register_reset(spapr_reset, spapr);
>> -
>>      /* init CPUs */
>>      if (cpu_model == NULL) {
>>          cpu_model = kvm_enabled() ? "host" : "POWER7";
>> @@ -734,11 +746,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>>  
>>          /* Set time-base frequency to 512 MHz */
>>          cpu_ppc_tb_init(env, TIMEBASE_FREQ);
>> -        qemu_register_reset(spapr_cpu_reset, cpu);
>>  
>>          env->hreset_vector = 0x60;
>>          env->hreset_excp_prefix = 0;
>> -        env->gpr[3] = env->cpu_index;
>>      }
>>  
>>      /* allocate RAM */
>> @@ -883,11 +893,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>>  
>>      spapr->entry_point = 0x100;
>>  
>> -    /* SLOF will startup the secondary CPUs using RTAS */
>> -    for (env = first_cpu; env != NULL; env = env->next_cpu) {
>> -        env->halted = 1;
>> -    }
>> -
>>      /* Prepare the device tree */
>>      spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
>>                                              initrd_base, initrd_size,
>> @@ -900,6 +905,7 @@ static QEMUMachine spapr_machine = {
>>      .name = "pseries",
>>      .desc = "pSeries Logical Partition (PAPR compliant)",
>>      .init = ppc_spapr_init,
>> +    .reset = spapr_reset,
>>      .max_cpus = MAX_CPUS,
>>      .no_parallel = 1,
>>      .use_scsi = 1,
>> 
>
>
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 18:29     ` Anthony Liguori
@ 2012-08-02 18:38       ` Andreas Färber
  2012-08-02 19:40         ` Anthony Liguori
  0 siblings, 1 reply; 35+ messages in thread
From: Andreas Färber @ 2012-08-02 18:38 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

Am 02.08.2012 20:29, schrieb Anthony Liguori:
> Andreas Färber <afaerber@suse.de> writes:
> 
>> Anthony was favoring moving reset code out of machines and expressed
>> dislike for looping through CPUs, which my above patch took into
>> account. The ordering issue between CPU and devices is still unsolved there.
>>
>> Some on-list comments from Anthony would be nice, since we are moving
>> into opposing directions here - having the sPAPR machine be more in
>> control vs. moving code away from the PC machine into target-i386 CPU
>> and/or common CPU code.
> 
> I already commented on the first patch because I had a feeling you'd
> post something like this ;-)

I was not cc'ed. :(

> Regarding reset:
> 
> 1) Devices should implement DeviceState::reset()
> 
> 2) If a device doesn't implement ::reset(), it should call
> qemu_register_reset()
> 
> 3) Reset should propagate through the device model, starting with the
> top-level machine which is logically what's plugged into the wall and
> is the source of power in the first place.

So you changed your opinion over night?

I wanted to keep the reset callbacks in the machine. You applied a patch
breaking that pattern and argued you wanted to move reset code *out* of
the machine. Now you say the machine should *propagate* reset. Sorry,
that's unlogical to me...

If the machine should propagate reset then the disputed i386 patch is
doing The Wrong Thing.

If reset code should vanish from machine code AFAP then this patch is
doing The Wrong Thing.

No?

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 18:38       ` Andreas Färber
@ 2012-08-02 19:40         ` Anthony Liguori
  2012-08-03  2:37           ` David Gibson
                             ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Anthony Liguori @ 2012-08-02 19:40 UTC (permalink / raw)
  To: Andreas Färber; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

Andreas Färber <afaerber@suse.de> writes:

> Am 02.08.2012 20:29, schrieb Anthony Liguori:
>> Andreas Färber <afaerber@suse.de> writes:
>> 
>>> Anthony was favoring moving reset code out of machines and expressed
>>> dislike for looping through CPUs, which my above patch took into
>>> account. The ordering issue between CPU and devices is still unsolved there.
>>>
>>> Some on-list comments from Anthony would be nice, since we are moving
>>> into opposing directions here - having the sPAPR machine be more in
>>> control vs. moving code away from the PC machine into target-i386 CPU
>>> and/or common CPU code.
>> 
>> I already commented on the first patch because I had a feeling you'd
>> post something like this ;-)
>
> I was not cc'ed. :(
>
>> Regarding reset:
>> 
>> 1) Devices should implement DeviceState::reset()
>> 
>> 2) If a device doesn't implement ::reset(), it should call
>> qemu_register_reset()
>> 
>> 3) Reset should propagate through the device model, starting with the
>> top-level machine which is logically what's plugged into the wall and
>> is the source of power in the first place.
>
> So you changed your opinion over night?

No.

> I wanted to keep the reset callbacks in the machine. You applied a patch
> breaking that pattern and argued you wanted to move reset code *out* of
> the machine. Now you say the machine should *propagate* reset. Sorry,
> that's unlogical to me...

You're not listening carefully.  Just a friendly piece of advise--
instead of sending knee-jerk emails, spend some time going back and
re-reading these discussions.

This has been discussed literally to death now for years.

Reset propagates.  There is unanimous consensus that this is the Right
Way to model reset.  There is also wide consensus that reset typically
will propagate through the composition tree although in some cases,
reset actually propagates through the bus (this mostly affects devices
that are children of /peripheral paths though).

The "root" of the composition tree is the machine.  The machine in the
abstract sense, not the QEMUMachine sense.  QEMUMachine::init() should
eventually become trivial--just create a handful of devices that
represent the core components of the machine with everything else being
created through composition.

Open coded logic in QEMUMachine::init is always bad.  Handling reset for
a specific device in QEMUMachine::init is bad.  That goes against the
idea of making QEMUMachine::init trivial.

However, reset does logically start at QEMUMachine.  That doesn't mean
that QEMUMachine should be explicitly resetting devices in a specific
order.  This is why I was quick to comment on David's patch because the
argument about having a controller that determines reset ordering was
silly.  While this does exist on some architectures, it's not at all
typical.  Reset should flow with QEMUMachine::reset just playing the
role of deciding whether it starts propagating from.

The only machines that can have complex reset logic are ones that can
afford to take an extremely long time to startup--typically doing a
tremendous amount of self-checks in the process.  These are not common
among the types of machines QEMU simulates.

Regards,

Anthony Liguori

>
> If the machine should propagate reset then the disputed i386 patch is
> doing The Wrong Thing.
>
> If reset code should vanish from machine code AFAP then this patch is
> doing The Wrong Thing.
>
> No?
>
> Regards,
> Andreas
>
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02 15:17       ` Paolo Bonzini
@ 2012-08-02 20:46         ` Benjamin Herrenschmidt
  2012-08-02 20:58           ` Anthony Liguori
  2012-08-03  2:54         ` David Gibson
  1 sibling, 1 reply; 35+ messages in thread
From: Benjamin Herrenschmidt @ 2012-08-02 20:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: David Gibson, agraf, Anthony Liguori, qemu-devel

On Thu, 2012-08-02 at 17:17 +0200, Paolo Bonzini wrote:
> 
> On x86 we do not pre-load the kernel.  Neither the kernel/initramfs
> and
> the option ROM that loads the kernel are written in memory, they are
> passed to the guest via fw_cfg.  Then the option ROM is loaded by the
> BIOS, and loads the kernel/initramfs.

How does it do ? IE. how does it access a random pair of files on the
host side ? Via some magic ports ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02 20:46         ` Benjamin Herrenschmidt
@ 2012-08-02 20:58           ` Anthony Liguori
  0 siblings, 0 replies; 35+ messages in thread
From: Anthony Liguori @ 2012-08-02 20:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paolo Bonzini; +Cc: David Gibson, agraf, qemu-devel

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Thu, 2012-08-02 at 17:17 +0200, Paolo Bonzini wrote:
>> 
>> On x86 we do not pre-load the kernel.  Neither the kernel/initramfs
>> and
>> the option ROM that loads the kernel are written in memory, they are
>> passed to the guest via fw_cfg.  Then the option ROM is loaded by the
>> BIOS, and loads the kernel/initramfs.
>
> How does it do ? IE. how does it access a random pair of files on the
> host side ? Via some magic ports ?

Yes, see hw/fw_cfg.c

Regards,

Anthony Liguori

>
> Cheers,
> Ben.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02 15:00   ` Lluís Vilanova
@ 2012-08-03  2:25     ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-03  2:25 UTC (permalink / raw)
  To: Lluís Vilanova; +Cc: qemu-devel, anthony, agraf

On Thu, Aug 02, 2012 at 06:00:06PM +0300, Lluís Vilanova wrote:
> David Gibson writes:
> [...]
> > diff --git a/vl.c b/vl.c
> > index 9fea320..ac47a7c 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -1410,6 +1410,14 @@ void qemu_system_reset(bool report)
> >      cpu_synchronize_all_post_reset();
> >  }
>  
> > +void qemu_system_reset(bool report)
> > +{
> > +    if (current_machine->reset)
> > +        current_machine->reset(report);
> > +    else
> > +        qemu_default_system_reset(report);
> > +}
> > +
> >  void qemu_system_reset_request(void)
> >  {
> >      if (no_reboot) {
> 
> Nitpick: you're missing the braces.

Ah, oops, thank you.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 15:44   ` Andreas Färber
  2012-08-02 18:29     ` Anthony Liguori
@ 2012-08-03  2:31     ` David Gibson
  2012-08-03 15:13       ` Andreas Färber
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2012-08-03  2:31 UTC (permalink / raw)
  To: Andreas Färber; +Cc: Igor Mammedov, qemu-devel, anthony, agraf

On Thu, Aug 02, 2012 at 05:44:49PM +0200, Andreas Färber wrote:
> Am 02.08.2012 04:10, schrieb David Gibson:
> > A number of things need to occur during reset of the PAPR paravirtualized
> > platform in a specific order.  For example, the hash table needs to be
> > cleared before the CPUs are reset, so that they initialize their register
> > state correctly, and the CPUs need to have their main reset called before
> > we set up the entry point state on the boot cpu.  We also need to have
> > the main qdev reset happen before the creation and installation of the
> > device tree for the new boot, because we need the state of the devices
> > settled to correctly construct the device tree.
> > 
> > Currently reset of pseries is broken in a number of ways, and in other
> > cases works largely by accident. This patch uses the new QEMUMachine reset
> > hook to correct these problems, by replacing the several existing spapr
> > reset hooks with one new machine hook which ensures that the various stages
> > happen in the correct order.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/spapr.c |   66 +++++++++++++++++++++++++++++++++---------------------------
> >  1 file changed, 36 insertions(+), 30 deletions(-)
> > 
> > diff --git a/hw/spapr.c b/hw/spapr.c
> > index 2453bae..1e60ec1 100644
> > --- a/hw/spapr.c
> > +++ b/hw/spapr.c
> > @@ -582,29 +582,22 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
> >      }
> >  }
> >  
> > -static void spapr_reset(void *opaque)
> > +static void spapr_reset_cpu(CPUPPCState *env)
> >  {
> > -    sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
> > -
> > -    /* Reset the hash table & recalc the RMA */
> > -    spapr_reset_htab(spapr);
> > -
> > -    /* Load the fdt */
> > -    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
> > -                       spapr->rtas_size);
> > -}
> > -
> > -static void spapr_cpu_reset(void *opaque)
> > -{
> > -    PowerPCCPU *cpu = opaque;
> > -    CPUPPCState *env = &cpu->env;
> > +    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);
> 
> NACK. Please don't undo the cleanups I have applied! Functions should
> take a QOM PowerPCCPU, not its internal CPUPPCState. Fields are
> gradually being moved from CPUxxxState into CPUState.

Um, ok.  So how do I iterate the PowerPCCPUs instead of the CPUPPCStates?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 19:40         ` Anthony Liguori
@ 2012-08-03  2:37           ` David Gibson
  2012-08-03 13:50             ` Anthony Liguori
  2012-08-03 14:51           ` Andreas Färber
  2012-08-03 15:01           ` Andreas Färber
  2 siblings, 1 reply; 35+ messages in thread
From: David Gibson @ 2012-08-03  2:37 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: agraf, Igor Mammedov, Andreas Färber, qemu-devel

On Thu, Aug 02, 2012 at 02:40:19PM -0500, Anthony Liguori wrote:
> Andreas Färber <afaerber@suse.de> writes:
> 
> > Am 02.08.2012 20:29, schrieb Anthony Liguori:
> >> Andreas Färber <afaerber@suse.de> writes:
> >> 
> >>> Anthony was favoring moving reset code out of machines and expressed
> >>> dislike for looping through CPUs, which my above patch took into
> >>> account. The ordering issue between CPU and devices is still unsolved there.
> >>>
> >>> Some on-list comments from Anthony would be nice, since we are moving
> >>> into opposing directions here - having the sPAPR machine be more in
> >>> control vs. moving code away from the PC machine into target-i386 CPU
> >>> and/or common CPU code.
> >> 
> >> I already commented on the first patch because I had a feeling you'd
> >> post something like this ;-)
> >
> > I was not cc'ed. :(
> >
> >> Regarding reset:
> >> 
> >> 1) Devices should implement DeviceState::reset()
> >> 
> >> 2) If a device doesn't implement ::reset(), it should call
> >> qemu_register_reset()
> >> 
> >> 3) Reset should propagate through the device model, starting with the
> >> top-level machine which is logically what's plugged into the wall and
> >> is the source of power in the first place.
> >
> > So you changed your opinion over night?
> 
> No.
> 
> > I wanted to keep the reset callbacks in the machine. You applied a patch
> > breaking that pattern and argued you wanted to move reset code *out* of
> > the machine. Now you say the machine should *propagate* reset. Sorry,
> > that's unlogical to me...
> 
> You're not listening carefully.  Just a friendly piece of advise--
> instead of sending knee-jerk emails, spend some time going back and
> re-reading these discussions.
> 
> This has been discussed literally to death now for years.
> 
> Reset propagates.  There is unanimous consensus that this is the Right
> Way to model reset.  There is also wide consensus that reset typically
> will propagate through the composition tree although in some cases,
> reset actually propagates through the bus (this mostly affects devices
> that are children of /peripheral paths though).
> 
> The "root" of the composition tree is the machine.  The machine in the
> abstract sense, not the QEMUMachine sense.  QEMUMachine::init() should
> eventually become trivial--just create a handful of devices that
> represent the core components of the machine with everything else being
> created through composition.

So what code controls the order in which "the machine in the abstract
sense" initiates the reset at the top-leve?

> Open coded logic in QEMUMachine::init is always bad.  Handling reset for
> a specific device in QEMUMachine::init is bad.  That goes against the
> idea of making QEMUMachine::init trivial.
> 
> However, reset does logically start at QEMUMachine.  That doesn't mean
> that QEMUMachine should be explicitly resetting devices in a specific
> order.  This is why I was quick to comment on David's patch because the
> argument about having a controller that determines reset ordering was
> silly.  While this does exist on some architectures,

Some platforms; architecture does not imply a particular platform -
this is one of the more subtle and pervasive x86-isms around.

> it's not at all
> typical.

So?  If it sometimes exists, we need to support that model.  The
argument that "real" hardware never has reset order dependencies is
simply incorrect.

>  Reset should flow with QEMUMachine::reset just playing the
> role of deciding whether it starts propagating from.
> 
> The only machines that can have complex reset logic are ones that can
> afford to take an extremely long time to startup--typically doing a
> tremendous amount of self-checks in the process.  These are not common
> among the types of machines QEMU simulates.

"having at least one order dependency in reset" != "complex and slow
reset logic".

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02 15:17       ` Paolo Bonzini
  2012-08-02 20:46         ` Benjamin Herrenschmidt
@ 2012-08-03  2:54         ` David Gibson
  1 sibling, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-03  2:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: agraf, Anthony Liguori, qemu-devel

On Thu, Aug 02, 2012 at 05:17:27PM +0200, Paolo Bonzini wrote:
> Il 02/08/2012 05:50, Benjamin Herrenschmidt ha scritto:
> > 
> >> >  This sort of stuff is either
> >> > (1) hard coded in a firmware/flashrom or (2) built dynamically in
> >> > firmware.  Let's not pretend like we're doing this because it's needed
> >> > for real hardware.
> > Doesn't matter, we do things like -kernel which means pre-loading the
> > kernel from qemu, even on x86. That doesn't match real HW either, but
> > it's convenient to have. But overall there are real HW reasons to
> > control the reset as well so the hook makes sense both ways.
> 
> On x86 we do not pre-load the kernel.  Neither the kernel/initramfs and
> the option ROM that loads the kernel are written in memory, they are
> passed to the guest via fw_cfg.  Then the option ROM is loaded by the
> BIOS, and loads the kernel/initramfs.

On x86 (or rather PC platform), yes.  But spapr is far from the only
one to load the kernel images direct into RAM - arm_load_kernel() does
it too, so does milkymist, sun4[mu] and a bunch of others.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing
  2012-08-02  2:37   ` Anthony Liguori
  2012-08-02  3:50     ` Benjamin Herrenschmidt
@ 2012-08-03  3:08     ` David Gibson
  1 sibling, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-03  3:08 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel, agraf

On Wed, Aug 01, 2012 at 09:37:39PM -0500, Anthony Liguori wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
> 
> > At present the qemu_system_reset() function always performs the same basic
> > actions on all machines.  This includes running all the reset handler
> > hooks, however the order in which they run is not controlled by the board
> > logic.
> 
> Let's be careful here in referring to "board logic".
> 
> Reset should propagate--there's no argument there.  Having all devices
> register in a list with no control over how reset is dispatched is
> wrong.  It's workable because of qdev imposes per-bus reset functions
> and typically we have a small number of busses and ordering doesn't
> matter.
> 
> > This is incorrect: any modern real hardware will generally have some sort
> > of reset controller, sometimes a full microcontroller or even service
> > processor which will control the order in which components on the board are
> > reset.
> 
> I think this is true only with a loose definition of "modern", but okay.
> 
> I don't think this is the right argument.  The machine should serve as
> the entry point for reset.  What components get reset in what order will
> need to be a machine hook for practical purposes (since not everything
> will be fully QOMized all at once).
> 
> So I think this is the right approach for now.
> 
> But all of the DT initialization stuff that is leading to this
> discussion in the first place is a gross hack to make a PV architecture
> work that took far too many short cuts.

Hrm.  The "short cuts" you seem to refer to is the fact that we do a
bunch of our machine initialization direct from qemu, rather than from
firmware executed within the guest.  Doing it this way is both easier
to write, easier to follow and faster to execute, so if that counts as
a gross hack then I think your design priorities need
re-consideration.

Because of the arcane history of the PC boot sequence I can see why
executing a BIOS image within the guest is the preferred model there,
but that's no reason to put the same restriction on platforms that
have a clear and well documented firmware to OS handoff state.

[snip]
> > -void qemu_system_reset(bool report)
> > +void qemu_default_system_reset(bool report)
> >  {
> >      QEMUResetEntry *re, *nre;
> >  
> > @@ -1410,6 +1410,14 @@ void qemu_system_reset(bool report)
> >      cpu_synchronize_all_post_reset();
> >  }
> 
> Not a huge fan of the naming here.  How about qemu_devices_reset()?

Ok, I'll change that.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03  2:37           ` David Gibson
@ 2012-08-03 13:50             ` Anthony Liguori
  2012-08-03 13:57               ` Peter Maydell
  0 siblings, 1 reply; 35+ messages in thread
From: Anthony Liguori @ 2012-08-03 13:50 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, Igor Mammedov, agraf, Andreas Färber

David Gibson <david@gibson.dropbear.id.au> writes:

> On Thu, Aug 02, 2012 at 02:40:19PM -0500, Anthony Liguori wrote:
>> The "root" of the composition tree is the machine.  The machine in the
>> abstract sense, not the QEMUMachine sense.  QEMUMachine::init() should
>> eventually become trivial--just create a handful of devices that
>> represent the core components of the machine with everything else being
>> created through composition.
>
> So what code controls the order in which "the machine in the abstract
> sense" initiates the reset at the top-leve?

There ought to be a hierarchy (based on composition) that reset flows
through.  In the case of the PC, at the top level of the hierarchy are
the CPUs and Northbridge (i440fx).  They either are going to sit on the
same bus or within a container of some kind.  Reset flows through the
bus/container to the CPUs and the Northbridge.  The northbridge then
flows reset through the PCI bus and then to the southbridge and all of
the devices behind it.

>> Open coded logic in QEMUMachine::init is always bad.  Handling reset for
>> a specific device in QEMUMachine::init is bad.  That goes against the
>> idea of making QEMUMachine::init trivial.
>> 
>> However, reset does logically start at QEMUMachine.  That doesn't mean
>> that QEMUMachine should be explicitly resetting devices in a specific
>> order.  This is why I was quick to comment on David's patch because the
>> argument about having a controller that determines reset ordering was
>> silly.  While this does exist on some architectures,
>
> Some platforms; architecture does not imply a particular platform -
> this is one of the more subtle and pervasive x86-isms around.
>
>> it's not at all
>> typical.
>
> So?  If it sometimes exists, we need to support that model.  The
> argument that "real" hardware never has reset order dependencies is
> simply incorrect.

"Support the model" is different from "make it a first class
abstraction".

-M pseries is not a real machine.  The things it has to do--initialize
all devices, build a device tree, then initialize the CPU with where the
device tree lives, is unique to !not a real machine.

This is what I mean by complex reset logic.  It's not just a matter of
an ordering of some kind, it's reset X, do A, reset Y, do B, reset Z, do
C.  The "do [ABC]" is the part that we shouldn't be trying to model as a
general mechanism.

I'm not saying that we shouldn't support being able to do this, but this
is an exception, not the way all other platforms should behave.

>>  Reset should flow with QEMUMachine::reset just playing the
>> role of deciding whether it starts propagating from.
>> 
>> The only machines that can have complex reset logic are ones that can
>> afford to take an extremely long time to startup--typically doing a
>> tremendous amount of self-checks in the process.  These are not common
>> among the types of machines QEMU simulates.
>
> "having at least one order dependency in reset" != "complex and slow
> reset logic".

It's not an issue of dependency.

We're trying to move to a model where everything is a device.
QEMUMachine essentially goes away because a user can create the machine
by just creating individual devices and tying them together.

But this will never be possible with -M pseries because of the "do ABC"
logic that it requires which doesn't fit within the model of everything
is a device.

That's okay.  We have the same problem with Xen and I anticipate we'll
have the same problem with S390.  We should support this model, but that
doesn't mean we shouldn't work toward moving everything else to
"everything is a device".

Regards,

Anthony Liguori

>
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 13:50             ` Anthony Liguori
@ 2012-08-03 13:57               ` Peter Maydell
  2012-08-03 14:22                 ` Anthony Liguori
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Maydell @ 2012-08-03 13:57 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Igor Mammedov, agraf, qemu-devel, Andreas Färber,
	David Gibson

On 3 August 2012 14:50, Anthony Liguori <anthony@codemonkey.ws> wrote:
> There ought to be a hierarchy (based on composition) that reset flows
> through.

I think saying "the reset tree is isomorphic to the composition tree"
is making the same mistake that qbus did with "the bus tree is
isomorphic to the composition tree". The stakes are lower for reset
and we can probably get away with it, but it really isn't how the
hardware works...

-- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 13:57               ` Peter Maydell
@ 2012-08-03 14:22                 ` Anthony Liguori
  2012-08-03 14:35                   ` Peter Maydell
  0 siblings, 1 reply; 35+ messages in thread
From: Anthony Liguori @ 2012-08-03 14:22 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Igor Mammedov, agraf, qemu-devel, Andreas Färber,
	David Gibson

Peter Maydell <peter.maydell@linaro.org> writes:

> On 3 August 2012 14:50, Anthony Liguori <anthony@codemonkey.ws> wrote:
>> There ought to be a hierarchy (based on composition) that reset flows
>> through.
>
> I think saying "the reset tree is isomorphic to the composition tree"
> is making the same mistake that qbus did with "the bus tree is
> isomorphic to the composition tree". The stakes are lower for reset
> and we can probably get away with it, but it really isn't how the
> hardware works...

It flows through the composition tree by default, but can be overridden
at any point.

For instance, the i440fx will absolutely want to override this behavior
such that it can flow reset through the PCI bus (which is how the PIIX3
would be reset).  However, the PIIX3 has no need to override this
behavior.

So this model should work very well for most types of virtual hardware.
But it doesn't provide for a mechanism to "after all devices are
initialized, build FDT in guest memory, then set the CPU registers to
point to it".

There's no logical device that has a scope like that that also has the
mechanism to get that type of hook in the reset path.  That's why we
need to have the QEMUMachine::reset() hook.

Regards,

Anthony Liguori

>
> -- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 14:22                 ` Anthony Liguori
@ 2012-08-03 14:35                   ` Peter Maydell
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Maydell @ 2012-08-03 14:35 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Igor Mammedov, agraf, qemu-devel, Andreas Färber,
	David Gibson

On 3 August 2012 15:22, Anthony Liguori <anthony@codemonkey.ws> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
>> On 3 August 2012 14:50, Anthony Liguori <anthony@codemonkey.ws> wrote:
>>> There ought to be a hierarchy (based on composition) that reset flows
>>> through.
>>
>> I think saying "the reset tree is isomorphic to the composition tree"
>> is making the same mistake that qbus did with "the bus tree is
>> isomorphic to the composition tree". The stakes are lower for reset
>> and we can probably get away with it, but it really isn't how the
>> hardware works...
>
> It flows through the composition tree by default, but can be overridden
> at any point.

That doesn't let you model situations where reset doesn't start at
the root of the tree, though. (eg, reset controller wants to trigger
a reset of just the CPUs, or of CPUs + board devices).

> So this model should work very well for most types of virtual hardware.
> But it doesn't provide for a mechanism to "after all devices are
> initialized, build FDT in guest memory, then set the CPU registers to
> point to it".
>
> There's no logical device that has a scope like that that also has the
> mechanism to get that type of hook in the reset path.  That's why we
> need to have the QEMUMachine::reset() hook.

Yeah, I see the need, but I wonder if calling it 'reset' is confusing:
maybe it should be 'post-reset', 'post-realize' or something?

The arm_boot code needs to do set up and run at this point too.

The other oddball case for reset is ARM M-profile cores, where the
initial PC is read from a vector table at reset rather than being
a fixed value. At the moment the mechanism we use for this is deeply
hacky: some more generic mechanism for "do this when we come out of
reset but before starting to execute" might be useful there.

-- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 19:40         ` Anthony Liguori
  2012-08-03  2:37           ` David Gibson
@ 2012-08-03 14:51           ` Andreas Färber
  2012-08-03 15:01           ` Andreas Färber
  2 siblings, 0 replies; 35+ messages in thread
From: Andreas Färber @ 2012-08-03 14:51 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

Am 02.08.2012 21:40, schrieb Anthony Liguori:
> Andreas Färber <afaerber@suse.de> writes:
> 
>> Am 02.08.2012 20:29, schrieb Anthony Liguori:
>>> Andreas Färber <afaerber@suse.de> writes:
>>>
>>>> Anthony was favoring moving reset code out of machines and expressed
>>>> dislike for looping through CPUs, which my above patch took into
>>>> account. The ordering issue between CPU and devices is still unsolved there.
>>>>
>>>> Some on-list comments from Anthony would be nice, since we are moving
>>>> into opposing directions here - having the sPAPR machine be more in
>>>> control vs. moving code away from the PC machine into target-i386 CPU
>>>> and/or common CPU code.
>>>
>>> I already commented on the first patch because I had a feeling you'd
>>> post something like this ;-)
>>
>> I was not cc'ed. :(

I did read the reply wrt reset controller chip btw in case you meant
that one, but it doesn't discuss QEMU API at all, only wording changes
to the commit message.

>>> Regarding reset:
>>>
>>> 1) Devices should implement DeviceState::reset()
>>>
>>> 2) If a device doesn't implement ::reset(), it should call
>>> qemu_register_reset()
>>>
>>> 3) Reset should propagate through the device model, starting with the
>>> top-level machine which is logically what's plugged into the wall and
>>> is the source of power in the first place.
>>
>> So you changed your opinion over night?
> 
> No.

Ben's cover letter indicated "as discussed with Anthony on a call",
suggesting to me that you agree to the solution presented here! Bad
choice of words then.

>> I wanted to keep the reset callbacks in the machine. You applied a patch
>> breaking that pattern and argued you wanted to move reset code *out* of
>> the machine. Now you say the machine should *propagate* reset. Sorry,
>> that's unlogical to me...
> 
> You're not listening carefully.  Just a friendly piece of advise--
> instead of sending knee-jerk emails, spend some time going back and
> re-reading these discussions.
> 
> This has been discussed literally to death now for years.

Mind you, you are communicating with non-native speakers and I had to
look up "knee-jerk". If you have a point to make, do it clearly. Your
replies have been anything but helpful to me.

You find my emails knee-jerked, I find your applying Igor's second patch
just before the 1.2 freeze a knee-jerk reaction. Especially considering
that you apply that series but not his earlier initfn one that did not
get objections any more. Two opinions.

Now, I have close to 20,000 unread qemu-devel mails alone. If you have
time to re-read the discussions from several years then I wonder why you
are not processing more uncontroversial patches and PULLs and replying
to mails. Otherwise don't ask people to do what you don't humanly manage
yourself.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-02 19:40         ` Anthony Liguori
  2012-08-03  2:37           ` David Gibson
  2012-08-03 14:51           ` Andreas Färber
@ 2012-08-03 15:01           ` Andreas Färber
  2012-08-03 16:21             ` Anthony Liguori
  2012-08-07 22:02             ` Benjamin Herrenschmidt
  2 siblings, 2 replies; 35+ messages in thread
From: Andreas Färber @ 2012-08-03 15:01 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

Am 02.08.2012 21:40, schrieb Anthony Liguori:
> Reset propagates.  There is unanimous consensus that this is the Right
> Way to model reset.  There is also wide consensus that reset typically
> will propagate through the composition tree although in some cases,
> reset actually propagates through the bus (this mostly affects devices
> that are children of /peripheral paths though).
> 
> The "root" of the composition tree is the machine.  The machine in the
> abstract sense, not the QEMUMachine sense.  QEMUMachine::init() should
> eventually become trivial--just create a handful of devices that
> represent the core components of the machine with everything else being
> created through composition.
> 
> Open coded logic in QEMUMachine::init is always bad.  Handling reset for
> a specific device in QEMUMachine::init is bad.  That goes against the
> idea of making QEMUMachine::init trivial.

We don't seem in disagreement so far. No one is questioning bus resets.
The issue at hand is specifically CPU reset, for which there is no bus,
no container and thus must happen somehow at machine level.

I have posted a suggestion where CPU reset is triggered by "the machine
as an abstract concept" (needs a bit of tweaking still, but the general
idea is there).
Based on that, shouldn't it be rather easy to add a Notifier similar to
"machine init done" that lets individual machines do post-reset setup?
I.e. not have QEMUMachine trigger and control the reset.

An alternative would be to have a CPUState::reset callback (in addition
to CPUClass::reset) that would by default be NULL but could be used by
the odd machines to piggy-back reset code. I think this is the safest
solution, assuring that on every cpu_reset() the custom reset code is
executed immediately.

The other issue wrt reset callback placement is CPU hotplug, where I
believe we need a callback at machine level in lack of a bus CPUs are
attached to. When the CPU is plugged we need to assure it later gets
reset by someone and added as a QOM child in the proper place. Currently
we don't have that. If we iterate through CPUs as done here we would get
that for free, otherwise we may need to register reset callbacks on
hotplug and unregister on hot-unplug at QEMUMachine level.

I am all ears for practical solutions, but theoretical talk about
containers and reset propagation doesn't seem to get us a solution.
Please say what container you mean and how/where your solutions are
supposed to work in code and how which of the proposals should be improved.

Thanks,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03  2:31     ` David Gibson
@ 2012-08-03 15:13       ` Andreas Färber
  2012-08-06  0:31         ` David Gibson
  0 siblings, 1 reply; 35+ messages in thread
From: Andreas Färber @ 2012-08-03 15:13 UTC (permalink / raw)
  To: David Gibson; +Cc: Igor Mammedov, qemu-devel, anthony, agraf

Am 03.08.2012 04:31, schrieb David Gibson:
> On Thu, Aug 02, 2012 at 05:44:49PM +0200, Andreas Färber wrote:
>> Am 02.08.2012 04:10, schrieb David Gibson:
>>> A number of things need to occur during reset of the PAPR paravirtualized
>>> platform in a specific order.  For example, the hash table needs to be
>>> cleared before the CPUs are reset, so that they initialize their register
>>> state correctly, and the CPUs need to have their main reset called before
>>> we set up the entry point state on the boot cpu.  We also need to have
>>> the main qdev reset happen before the creation and installation of the
>>> device tree for the new boot, because we need the state of the devices
>>> settled to correctly construct the device tree.
>>>
>>> Currently reset of pseries is broken in a number of ways, and in other
>>> cases works largely by accident. This patch uses the new QEMUMachine reset
>>> hook to correct these problems, by replacing the several existing spapr
>>> reset hooks with one new machine hook which ensures that the various stages
>>> happen in the correct order.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>>  hw/spapr.c |   66 +++++++++++++++++++++++++++++++++---------------------------
>>>  1 file changed, 36 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/hw/spapr.c b/hw/spapr.c
>>> index 2453bae..1e60ec1 100644
>>> --- a/hw/spapr.c
>>> +++ b/hw/spapr.c
>>> @@ -582,29 +582,22 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
>>>      }
>>>  }
>>>  
>>> -static void spapr_reset(void *opaque)
>>> +static void spapr_reset_cpu(CPUPPCState *env)
>>>  {
>>> -    sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
>>> -
>>> -    /* Reset the hash table & recalc the RMA */
>>> -    spapr_reset_htab(spapr);
>>> -
>>> -    /* Load the fdt */
>>> -    spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr,
>>> -                       spapr->rtas_size);
>>> -}
>>> -
>>> -static void spapr_cpu_reset(void *opaque)
>>> -{
>>> -    PowerPCCPU *cpu = opaque;
>>> -    CPUPPCState *env = &cpu->env;
>>> +    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);
>>
>> NACK. Please don't undo the cleanups I have applied! Functions should
>> take a QOM PowerPCCPU, not its internal CPUPPCState. Fields are
>> gradually being moved from CPUxxxState into CPUState.
> 
> Um, ok.  So how do I iterate the PowerPCCPUs instead of the CPUPPCStates?

You can't, yet. The QOM CPUState part 4 series (that got stalled due to
APIC modelling) moved quite some fields to CPUState but not enough to
change the first_cpu type despite the really long (74?) series.

So the solution here is to iterate the CPUPPCState, call
ppc_env_get_cpu() on it and pass that as opaque as before.

I could add a cpu_foreach() function though if that helps in the
meantime? Either way the idea is to limit the number of places to touch
in the upcoming refactorings.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 15:01           ` Andreas Färber
@ 2012-08-03 16:21             ` Anthony Liguori
  2012-08-07 22:02             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 35+ messages in thread
From: Anthony Liguori @ 2012-08-03 16:21 UTC (permalink / raw)
  To: Andreas Färber; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

Andreas Färber <afaerber@suse.de> writes:

> Am 02.08.2012 21:40, schrieb Anthony Liguori:
>> Reset propagates.  There is unanimous consensus that this is the Right
>> Way to model reset.  There is also wide consensus that reset typically
>> will propagate through the composition tree although in some cases,
>> reset actually propagates through the bus (this mostly affects devices
>> that are children of /peripheral paths though).
>> 
>> The "root" of the composition tree is the machine.  The machine in the
>> abstract sense, not the QEMUMachine sense.  QEMUMachine::init() should
>> eventually become trivial--just create a handful of devices that
>> represent the core components of the machine with everything else being
>> created through composition.
>> 
>> Open coded logic in QEMUMachine::init is always bad.  Handling reset for
>> a specific device in QEMUMachine::init is bad.  That goes against the
>> idea of making QEMUMachine::init trivial.
>
> We don't seem in disagreement so far. No one is questioning bus resets.
> The issue at hand is specifically CPU reset, for which there is no bus,
> no container and thus must happen somehow at machine level.
>
> I have posted a suggestion where CPU reset is triggered by "the machine
> as an abstract concept" (needs a bit of tweaking still, but the general
> idea is there).
> Based on that, shouldn't it be rather easy to add a Notifier similar to
> "machine init done" that lets individual machines do post-reset setup?
> I.e. not have QEMUMachine trigger and control the reset.

This means that the reset logic will be spread out.  A single hook in
QEMUMachine is much nicer.

> An alternative would be to have a CPUState::reset callback (in addition
> to CPUClass::reset) that would by default be NULL but could be used by
> the odd machines to piggy-back reset code. I think this is the safest
> solution, assuring that on every cpu_reset() the custom reset code is
> executed immediately.

I think the right way to handle reset for CPU's is exactly what's done
today.  The CPUState registers a reset handler via
qemu_register_reset().

Eventually, we would model CPUSocket, CPUCore, CPUThread (which is
essentially what is CPUState today).

Reset of CPUState would propagate its CPUCores, which would then
propagate to CPUThread.

The machine/board device would have 1-N link<CPUSocket> properties and
the reset handler for the board would propagate reset to the CPUSocket links.

> The other issue wrt reset callback placement is CPU hotplug, where I
> believe we need a callback at machine level in lack of a bus CPUs are
> attached to. When the CPU is plugged we need to assure it later gets
> reset by someone and added as a QOM child in the proper place.

Reset does two things today: tear down current state and establish
initial state.

The fact that we rely on an external call to reset to establish initial
state after construction is not ideal.  Initial state should be
established during construction either by explicitly calling a function
(it could be reset) or by setting initial state.

> Currently
> we don't have that. If we iterate through CPUs as done here we would get
> that for free, otherwise we may need to register reset callbacks on
> hotplug and unregister on hot-unplug at QEMUMachine level.

If you hotplug a CPUSocket by setting a link<> on the machine device,
then this all Just Works without any special handling.

This is what makes propagation so attractive.  You don't have to
maintain a list of everything that needs to be reset along with data
descriptions of the order.  That gets figured out lazily during reset.

Regards,

Anthony Liguori

>
> I am all ears for practical solutions, but theoretical talk about
> containers and reset propagation doesn't seem to get us a solution.
> Please say what container you mean and how/where your solutions are
> supposed to work in code and how which of the proposals should be improved.
>
> Thanks,
> Andreas
>
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 15:13       ` Andreas Färber
@ 2012-08-06  0:31         ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-06  0:31 UTC (permalink / raw)
  To: Andreas Färber; +Cc: Igor Mammedov, qemu-devel, anthony, agraf

On Fri, Aug 03, 2012 at 05:13:48PM +0200, Andreas Färber wrote:
> Am 03.08.2012 04:31, schrieb David Gibson:
> > On Thu, Aug 02, 2012 at 05:44:49PM +0200, Andreas Färber wrote:
> >> Am 02.08.2012 04:10, schrieb David Gibson:
[snip]
> >>> -static void spapr_cpu_reset(void *opaque)
> >>> -{
> >>> -    PowerPCCPU *cpu = opaque;
> >>> -    CPUPPCState *env = &cpu->env;
> >>> +    PowerPCCPU *cpu = container_of(env, PowerPCCPU, env);
> >>
> >> NACK. Please don't undo the cleanups I have applied! Functions should
> >> take a QOM PowerPCCPU, not its internal CPUPPCState. Fields are
> >> gradually being moved from CPUxxxState into CPUState.
> > 
> > Um, ok.  So how do I iterate the PowerPCCPUs instead of the CPUPPCStates?
> 
> You can't, yet. The QOM CPUState part 4 series (that got stalled due to
> APIC modelling) moved quite some fields to CPUState but not enough to
> change the first_cpu type despite the really long (74?) series.
> 
> So the solution here is to iterate the CPUPPCState, call
> ppc_env_get_cpu() on it and pass that as opaque as before.

So, move the container_of to the caller and use the wrapper for it (I
thought one might exist, but I hadn't spotted it).  Ok, I can do that.

> I could add a cpu_foreach() function though if that helps in the
> meantime? Either way the idea is to limit the number of places to touch
> in the upcoming refactorings.

If you like, but I'm not planning to delay these patches to wait for
it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-03 15:01           ` Andreas Färber
  2012-08-03 16:21             ` Anthony Liguori
@ 2012-08-07 22:02             ` Benjamin Herrenschmidt
  2012-08-07 22:32               ` Andreas Färber
  1 sibling, 1 reply; 35+ messages in thread
From: Benjamin Herrenschmidt @ 2012-08-07 22:02 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Igor Mammedov, David Gibson, agraf, Anthony Liguori, qemu-devel

On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
> 
> I have posted a suggestion where CPU reset is triggered by "the
> machine
> as an abstract concept" (needs a bit of tweaking still, but the
> general
> idea is there).
> Based on that, shouldn't it be rather easy to add a Notifier similar
> to
> "machine init done" that lets individual machines do post-reset setup?
> I.e. not have QEMUMachine trigger and control the reset.
> 

Note that we really want pre and post reset vs the device reset.

That's why the machine should be the one in charge. The top level of the
reset sequencing is -not- the CPU, it's the machine. All machines (or
SoCs) have some kind of reset controller and provide facilities for
resetting individual devices, busses, processor cores.... the global
"system" reset (when it exists) itself might have interesting ordering
or sequencing requirements.

Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
Anthony for which David sent a patch does the job just fine, it allows
us to clean out all our iommu tables before the device-reset, meaning
that in-flights DMA cannot overwrite the various "files" (SLOF image
etc.... that are auto-loaded via reset handlers implicitely created by
load_image_targphys), and we can then do some post-initializations as
well to get things ready for a restart (rebuild the device-tree, etc...)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-07 22:02             ` Benjamin Herrenschmidt
@ 2012-08-07 22:32               ` Andreas Färber
  2012-08-08  0:00                 ` Anthony Liguori
  2012-08-08  1:45                 ` David Gibson
  0 siblings, 2 replies; 35+ messages in thread
From: Andreas Färber @ 2012-08-07 22:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Igor Mammedov, David Gibson, agraf, Anthony Liguori, qemu-devel

Am 08.08.2012 00:02, schrieb Benjamin Herrenschmidt:
> On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
>>
>> I have posted a suggestion where CPU reset is triggered by "the
>> machine
>> as an abstract concept" (needs a bit of tweaking still, but the
>> general
>> idea is there).
>> Based on that, shouldn't it be rather easy to add a Notifier similar
>> to
>> "machine init done" that lets individual machines do post-reset setup?
>> I.e. not have QEMUMachine trigger and control the reset.
>>
> 
> Note that we really want pre and post reset vs the device reset.
> 
> That's why the machine should be the one in charge. The top level of the
> reset sequencing is -not- the CPU, it's the machine. All machines (or
> SoCs) have some kind of reset controller and provide facilities for
> resetting individual devices, busses, processor cores.... the global
> "system" reset (when it exists) itself might have interesting ordering
> or sequencing requirements.
> 
> Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
> Anthony for which David sent a patch does the job just fine, it allows
> us to clean out all our iommu tables before the device-reset, meaning
> that in-flights DMA cannot overwrite the various "files" (SLOF image
> etc.... that are auto-loaded via reset handlers implicitely created by
> load_image_targphys), and we can then do some post-initializations as
> well to get things ready for a restart (rebuild the device-tree, etc...)

That's all good, except for embedded machines without such implicit
reset handling. It does contradict the "a machine is just a config file,
setting up QOM objects" concept, but I was not the one to push that! :)

What I was thinking about however were those mentioned individual cores
being reset using cpu_reset(). If we want to piggy-back some
machine-specific register initialization for individual CPUStates then
QEMUMachine::reset is not going to be enough because it only gets
triggered for complete system reset. My suggestion was thus to just call
cpu_reset() in your QEMUMachine::reset and have cpu_reset() take care of
its initialization wherever called from. Any of these solutions are easy
to implement for 1.2 if agreement is reached what people want.

What I am missing from Anthony's side is some communication to machine
maintainers on the course to adopt before applying random patches. Right
now x86 and ppc are moving into opposite directions and arm, mips, etc.
maintainers may not even be aware of ongoing changes, and there's a
pending uc32 machine that should be reviewed in this light.

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-07 22:32               ` Andreas Färber
@ 2012-08-08  0:00                 ` Anthony Liguori
  2012-08-08  7:58                   ` Peter Maydell
  2012-08-08  1:45                 ` David Gibson
  1 sibling, 1 reply; 35+ messages in thread
From: Anthony Liguori @ 2012-08-08  0:00 UTC (permalink / raw)
  To: Andreas Färber; +Cc: agraf, Igor Mammedov, qemu-devel, David Gibson

[-- Attachment #1: Type: text/plain, Size: 3587 bytes --]

On Aug 7, 2012 5:32 PM, "Andreas Färber" <afaerber@suse.de> wrote:
>
> Am 08.08.2012 00:02, schrieb Benjamin Herrenschmidt:
> > On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
> >>
> >> I have posted a suggestion where CPU reset is triggered by "the
> >> machine
> >> as an abstract concept" (needs a bit of tweaking still, but the
> >> general
> >> idea is there).
> >> Based on that, shouldn't it be rather easy to add a Notifier similar
> >> to
> >> "machine init done" that lets individual machines do post-reset setup?
> >> I.e. not have QEMUMachine trigger and control the reset.
> >>
> >
> > Note that we really want pre and post reset vs the device reset.
> >
> > That's why the machine should be the one in charge. The top level of the
> > reset sequencing is -not- the CPU, it's the machine. All machines (or
> > SoCs) have some kind of reset controller and provide facilities for
> > resetting individual devices, busses, processor cores.... the global
> > "system" reset (when it exists) itself might have interesting ordering
> > or sequencing requirements.
> >
> > Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
> > Anthony for which David sent a patch does the job just fine, it allows
> > us to clean out all our iommu tables before the device-reset, meaning
> > that in-flights DMA cannot overwrite the various "files" (SLOF image
> > etc.... that are auto-loaded via reset handlers implicitely created by
> > load_image_targphys), and we can then do some post-initializations as
> > well to get things ready for a restart (rebuild the device-tree, etc...)
>
> That's all good, except for embedded machines without such implicit
> reset handling. It does contradict the "a machine is just a config file,
> setting up QOM objects" concept, but I was not the one to push that! :)
>

I will be without a proper email client for 24 hours so I'll keep this
brief for now.  What Ben et al are trying to model is something that exists
outside of the model of virtual hardware that QOM is designed for.  Since
they are trying to model something that exists outside the scope of virtual
hardware, they need something that exists at a higher level.

They need a per machine hook before and after devices are created.  This is
okay and it turns out it can be handy for other machines too that do
similiar could not exist outside of a simulator features.

Regards,

Anthony Liguori

> What I was thinking about however were those mentioned individual cores
> being reset using cpu_reset(). If we want to piggy-back some
> machine-specific register initialization for individual CPUStates then
> QEMUMachine::reset is not going to be enough because it only gets
> triggered for complete system reset. My suggestion was thus to just call
> cpu_reset() in your QEMUMachine::reset and have cpu_reset() take care of
> its initialization wherever called from. Any of these solutions are easy
> to implement for 1.2 if agreement is reached what people want.
>
> What I am missing from Anthony's side is some communication to machine
> maintainers on the course to adopt before applying random patches. Right
> now x86 and ppc are moving into opposite directions and arm, mips, etc.
> maintainers may not even be aware of ongoing changes, and there's a
> pending uc32 machine that should be reviewed in this light.
>
> Cheers,
> Andreas
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[-- Attachment #2: Type: text/html, Size: 4274 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-07 22:32               ` Andreas Färber
  2012-08-08  0:00                 ` Anthony Liguori
@ 2012-08-08  1:45                 ` David Gibson
  2012-08-08 15:22                   ` Andreas Färber
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2012-08-08  1:45 UTC (permalink / raw)
  To: Andreas Färber; +Cc: qemu-devel, agraf, Anthony Liguori, Igor Mammedov

On Wed, Aug 08, 2012 at 12:32:39AM +0200, Andreas Färber wrote:
> Am 08.08.2012 00:02, schrieb Benjamin Herrenschmidt:
> > On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
> >>
> >> I have posted a suggestion where CPU reset is triggered by "the
> >> machine
> >> as an abstract concept" (needs a bit of tweaking still, but the
> >> general
> >> idea is there).
> >> Based on that, shouldn't it be rather easy to add a Notifier similar
> >> to
> >> "machine init done" that lets individual machines do post-reset setup?
> >> I.e. not have QEMUMachine trigger and control the reset.
> >>
> > 
> > Note that we really want pre and post reset vs the device reset.
> > 
> > That's why the machine should be the one in charge. The top level of the
> > reset sequencing is -not- the CPU, it's the machine. All machines (or
> > SoCs) have some kind of reset controller and provide facilities for
> > resetting individual devices, busses, processor cores.... the global
> > "system" reset (when it exists) itself might have interesting ordering
> > or sequencing requirements.
> > 
> > Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
> > Anthony for which David sent a patch does the job just fine, it allows
> > us to clean out all our iommu tables before the device-reset, meaning
> > that in-flights DMA cannot overwrite the various "files" (SLOF image
> > etc.... that are auto-loaded via reset handlers implicitely created by
> > load_image_targphys), and we can then do some post-initializations as
> > well to get things ready for a restart (rebuild the device-tree, etc...)
> 
> That's all good, except for embedded machines without such implicit
> reset handling. It does contradict the "a machine is just a config file,
> setting up QOM objects" concept, but I was not the one to push that! :)
> 
> What I was thinking about however were those mentioned individual cores
> being reset using cpu_reset(). If we want to piggy-back some
> machine-specific register initialization for individual CPUStates then
> QEMUMachine::reset is not going to be enough because it only gets
> triggered for complete system reset. My suggestion was thus to just call
> cpu_reset() in your QEMUMachine::reset and have cpu_reset() take care of
> its initialization wherever called from. Any of these solutions are easy
> to implement for 1.2 if agreement is reached what people want.

So, I more or less reaslied that myself and my new version of the
reset patch (which I expect to send out later today) kind of does
that.  I no longer do the machine specific CPU state setup from the
QEMUMachine::reset, it's done from the per-cpu reset handler.  The
QEMUMachine::reset just does the special setup that's only for the
CPU0 entry conditions, which *is* specific to a full system reset (not
that I think we can get an individual CPU reset on pseries, anyway).

> What I am missing from Anthony's side is some communication to machine
> maintainers on the course to adopt before applying random patches. Right
> now x86 and ppc are moving into opposite directions and arm, mips, etc.
> maintainers may not even be aware of ongoing changes, and there's a
> pending uc32 machine that should be reviewed in this light.

So.. having the CPU reset at the top of the tree definitely makes no
sense - if nothing else, *which* cpu when there's more than one.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-08  0:00                 ` Anthony Liguori
@ 2012-08-08  7:58                   ` Peter Maydell
  2012-08-08  8:44                     ` David Gibson
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Maydell @ 2012-08-08  7:58 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: qemu-devel, Igor Mammedov, David Gibson, Andreas Färber,
	agraf

On 8 August 2012 01:00, Anthony Liguori <anthony@codemonkey.ws> wrote:
>
> They need a per machine hook before and after devices are created.  This is
> okay and it turns out it can be handy for other machines too that do
> similiar could not exist outside of a simulator features.

If it's 'before and after device creation' maybe we could call it something
less confusing than "reset" ?

-- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-08  7:58                   ` Peter Maydell
@ 2012-08-08  8:44                     ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-08  8:44 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-devel, Igor Mammedov, Andreas Färber, Anthony Liguori,
	agraf

On Wed, Aug 08, 2012 at 08:58:07AM +0100, Peter Maydell wrote:
> On 8 August 2012 01:00, Anthony Liguori <anthony@codemonkey.ws> wrote:
> >
> > They need a per machine hook before and after devices are created.  This is
> > okay and it turns out it can be handy for other machines too that do
> > similiar could not exist outside of a simulator features.
> 
> If it's 'before and after device creation' maybe we could call it something
> less confusing than "reset" ?

It's not actually before and after devices are created.  It's before
and after devices are reset.  At least that's the easiest way to use
it, but it can also be used to take over control of the device reset
order itself.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-08  1:45                 ` David Gibson
@ 2012-08-08 15:22                   ` Andreas Färber
  2012-08-09  0:12                     ` David Gibson
  0 siblings, 1 reply; 35+ messages in thread
From: Andreas Färber @ 2012-08-08 15:22 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, agraf, Anthony Liguori, Igor Mammedov

Am 08.08.2012 03:45, schrieb David Gibson:
> On Wed, Aug 08, 2012 at 12:32:39AM +0200, Andreas Färber wrote:
>> Am 08.08.2012 00:02, schrieb Benjamin Herrenschmidt:
>>> On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
>>>>
>>>> I have posted a suggestion where CPU reset is triggered by "the
>>>> machine
>>>> as an abstract concept" (needs a bit of tweaking still, but the
>>>> general
>>>> idea is there).
>>>> Based on that, shouldn't it be rather easy to add a Notifier similar
>>>> to
>>>> "machine init done" that lets individual machines do post-reset setup?
>>>> I.e. not have QEMUMachine trigger and control the reset.
>>>>
>>>
>>> Note that we really want pre and post reset vs the device reset.
>>>
>>> That's why the machine should be the one in charge. The top level of the
>>> reset sequencing is -not- the CPU, it's the machine. All machines (or
>>> SoCs) have some kind of reset controller and provide facilities for
>>> resetting individual devices, busses, processor cores.... the global
>>> "system" reset (when it exists) itself might have interesting ordering
>>> or sequencing requirements.
>>>
>>> Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
>>> Anthony for which David sent a patch does the job just fine, it allows
>>> us to clean out all our iommu tables before the device-reset, meaning
>>> that in-flights DMA cannot overwrite the various "files" (SLOF image
>>> etc.... that are auto-loaded via reset handlers implicitely created by
>>> load_image_targphys), and we can then do some post-initializations as
>>> well to get things ready for a restart (rebuild the device-tree, etc...)
>>
>> That's all good, except for embedded machines without such implicit
>> reset handling. It does contradict the "a machine is just a config file,
>> setting up QOM objects" concept, but I was not the one to push that! :)
>>
>> What I was thinking about however were those mentioned individual cores
>> being reset using cpu_reset(). If we want to piggy-back some
>> machine-specific register initialization for individual CPUStates then
>> QEMUMachine::reset is not going to be enough because it only gets
>> triggered for complete system reset. My suggestion was thus to just call
>> cpu_reset() in your QEMUMachine::reset and have cpu_reset() take care of
>> its initialization wherever called from. Any of these solutions are easy
>> to implement for 1.2 if agreement is reached what people want.
> 
> So, I more or less reaslied that myself and my new version of the
> reset patch (which I expect to send out later today) kind of does
> that.  I no longer do the machine specific CPU state setup from the
> QEMUMachine::reset, it's done from the per-cpu reset handler.  The
> QEMUMachine::reset just does the special setup that's only for the
> CPU0 entry conditions, which *is* specific to a full system reset (not
> that I think we can get an individual CPU reset on pseries, anyway).
> 
>> What I am missing from Anthony's side is some communication to machine
>> maintainers on the course to adopt before applying random patches. Right
>> now x86 and ppc are moving into opposite directions and arm, mips, etc.
>> maintainers may not even be aware of ongoing changes, and there's a
>> pending uc32 machine that should be reviewed in this light.
> 
> So.. having the CPU reset at the top of the tree definitely makes no
> sense - if nothing else, *which* cpu when there's more than one.

Maybe let me restate clearly what I am looking for in this discussion:

I would like a clear definition of
* what is the "normal" case, and
* what is the special case.

The special case sPAPR seems uncontroversial.

So, a bonus would be if we can have a default implementation (of
QEMUMachine::reset or whatever we end up doing) so that the average
machine does not need to fiddle with reset callbacks in
QEMUMachine::init. For example, have a machine_default_reset() as
fallback for QEMUMachine::reset == NULL that resets all CPUs (in order
of the singly linked list) and then does qemu_devices_reset()? sPAPR
would then override that default implementation by specifying its own
implementation and we could get rid of reset callbacks in an estimated
70% of QEMUMachine::init. (The less people fiddle at that level the
easier to refactor for me.) That could well be a later follow-up to your
v2, which looked okay on brief sight.

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence
  2012-08-08 15:22                   ` Andreas Färber
@ 2012-08-09  0:12                     ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2012-08-09  0:12 UTC (permalink / raw)
  To: Andreas Färber; +Cc: qemu-devel, agraf, Anthony Liguori, Igor Mammedov

On Wed, Aug 08, 2012 at 05:22:11PM +0200, Andreas Färber wrote:
> Am 08.08.2012 03:45, schrieb David Gibson:
> > On Wed, Aug 08, 2012 at 12:32:39AM +0200, Andreas Färber wrote:
> >> Am 08.08.2012 00:02, schrieb Benjamin Herrenschmidt:
> >>> On Fri, 2012-08-03 at 17:01 +0200, Andreas Färber wrote:
> >>>>
> >>>> I have posted a suggestion where CPU reset is triggered by "the
> >>>> machine
> >>>> as an abstract concept" (needs a bit of tweaking still, but the
> >>>> general
> >>>> idea is there).
> >>>> Based on that, shouldn't it be rather easy to add a Notifier similar
> >>>> to
> >>>> "machine init done" that lets individual machines do post-reset setup?
> >>>> I.e. not have QEMUMachine trigger and control the reset.
> >>>>
> >>>
> >>> Note that we really want pre and post reset vs the device reset.
> >>>
> >>> That's why the machine should be the one in charge. The top level of the
> >>> reset sequencing is -not- the CPU, it's the machine. All machines (or
> >>> SoCs) have some kind of reset controller and provide facilities for
> >>> resetting individual devices, busses, processor cores.... the global
> >>> "system" reset (when it exists) itself might have interesting ordering
> >>> or sequencing requirements.
> >>>
> >>> Now, to fix our immediate problem on ppc for 1.2 the hook proposed by
> >>> Anthony for which David sent a patch does the job just fine, it allows
> >>> us to clean out all our iommu tables before the device-reset, meaning
> >>> that in-flights DMA cannot overwrite the various "files" (SLOF image
> >>> etc.... that are auto-loaded via reset handlers implicitely created by
> >>> load_image_targphys), and we can then do some post-initializations as
> >>> well to get things ready for a restart (rebuild the device-tree, etc...)
> >>
> >> That's all good, except for embedded machines without such implicit
> >> reset handling. It does contradict the "a machine is just a config file,
> >> setting up QOM objects" concept, but I was not the one to push that! :)
> >>
> >> What I was thinking about however were those mentioned individual cores
> >> being reset using cpu_reset(). If we want to piggy-back some
> >> machine-specific register initialization for individual CPUStates then
> >> QEMUMachine::reset is not going to be enough because it only gets
> >> triggered for complete system reset. My suggestion was thus to just call
> >> cpu_reset() in your QEMUMachine::reset and have cpu_reset() take care of
> >> its initialization wherever called from. Any of these solutions are easy
> >> to implement for 1.2 if agreement is reached what people want.
> > 
> > So, I more or less reaslied that myself and my new version of the
> > reset patch (which I expect to send out later today) kind of does
> > that.  I no longer do the machine specific CPU state setup from the
> > QEMUMachine::reset, it's done from the per-cpu reset handler.  The
> > QEMUMachine::reset just does the special setup that's only for the
> > CPU0 entry conditions, which *is* specific to a full system reset (not
> > that I think we can get an individual CPU reset on pseries, anyway).
> > 
> >> What I am missing from Anthony's side is some communication to machine
> >> maintainers on the course to adopt before applying random patches. Right
> >> now x86 and ppc are moving into opposite directions and arm, mips, etc.
> >> maintainers may not even be aware of ongoing changes, and there's a
> >> pending uc32 machine that should be reviewed in this light.
> > 
> > So.. having the CPU reset at the top of the tree definitely makes no
> > sense - if nothing else, *which* cpu when there's more than one.
> 
> Maybe let me restate clearly what I am looking for in this discussion:
> 
> I would like a clear definition of
> * what is the "normal" case, and
> * what is the special case.
> 
> The special case sPAPR seems uncontroversial.
> 
> So, a bonus would be if we can have a default implementation (of
> QEMUMachine::reset or whatever we end up doing) so that the average
> machine does not need to fiddle with reset callbacks in
> QEMUMachine::init. For example, have a machine_default_reset() as
> fallback for QEMUMachine::reset == NULL that resets all CPUs (in order
> of the singly linked list) and then does qemu_devices_reset()? sPAPR
> would then override that default implementation by specifying its own
> implementation and we could get rid of reset callbacks in an estimated
> 70% of QEMUMachine::init. (The less people fiddle at that level the
> easier to refactor for me.) That could well be a later follow-up to your
> v2, which looked okay on brief sight.

We already have that.  If QEMUMachine::reset is NULL,
qemu_system_reset() does qemu_devices_reset() which is exactly the
same as what it did before.  qemu_devices_reset() calls all the reset
callback handlers, so it will also reset the CPUs if a suitable CPU
reset handler has been registered.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2012-08-09  0:16 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-02  2:10 [Qemu-devel] [0/2] Allow machine to control ordering of reset David Gibson
2012-08-02  2:10 ` [Qemu-devel] [PATCH 1/2] Allow QEMUMachine to override reset sequencing David Gibson
2012-08-02  2:37   ` Anthony Liguori
2012-08-02  3:50     ` Benjamin Herrenschmidt
2012-08-02 15:17       ` Paolo Bonzini
2012-08-02 20:46         ` Benjamin Herrenschmidt
2012-08-02 20:58           ` Anthony Liguori
2012-08-03  2:54         ` David Gibson
2012-08-03  3:08     ` David Gibson
2012-08-02 15:00   ` Lluís Vilanova
2012-08-03  2:25     ` David Gibson
2012-08-02  2:10 ` [Qemu-devel] [PATCH 2/2] pseries: Use new hook to correct reset sequence David Gibson
2012-08-02 15:44   ` Andreas Färber
2012-08-02 18:29     ` Anthony Liguori
2012-08-02 18:38       ` Andreas Färber
2012-08-02 19:40         ` Anthony Liguori
2012-08-03  2:37           ` David Gibson
2012-08-03 13:50             ` Anthony Liguori
2012-08-03 13:57               ` Peter Maydell
2012-08-03 14:22                 ` Anthony Liguori
2012-08-03 14:35                   ` Peter Maydell
2012-08-03 14:51           ` Andreas Färber
2012-08-03 15:01           ` Andreas Färber
2012-08-03 16:21             ` Anthony Liguori
2012-08-07 22:02             ` Benjamin Herrenschmidt
2012-08-07 22:32               ` Andreas Färber
2012-08-08  0:00                 ` Anthony Liguori
2012-08-08  7:58                   ` Peter Maydell
2012-08-08  8:44                     ` David Gibson
2012-08-08  1:45                 ` David Gibson
2012-08-08 15:22                   ` Andreas Färber
2012-08-09  0:12                     ` David Gibson
2012-08-03  2:31     ` David Gibson
2012-08-03 15:13       ` Andreas Färber
2012-08-06  0:31         ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).