* [Qemu-devel] [0/7] pseries: Patches to fix system reset @ 2012-08-15 4:33 David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 1/7] Allow QEMUMachine to override reset sequencing David Gibson ` (6 more replies) 0 siblings, 7 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus Hi Alex, Here is a string of patches which fix most of the many problems with system reset on the pseriss machine. They apply on top of my other string of pseries patches which you already merged. They apply before Li Zhang's usb and vga patches, since it looks like those will go another iteration, I can easily rebase after those if that would be more convenient. 1/7 is a generic patch which I have already sent to Anthony, but it hasn't gone into mainline yet, the rest of the series is dependent on it, though, so it's included here. It's also dependent on newer kernel headers than are in mainline, but which I think you already have in your tree This does quite a pit of rework to the pseries reset sequence, with some influence on the ppc target at large. It fixes both general and kvm specific bugs, although a number of the general bugs were very difficult to actually trigger without kvm anyway (because full emu SMP is so achingly slow, and I think has some other bugs I haven't had time to investigate yet). There are some known reset problems still remaining, specifically: * We need to reset the VPA registration in KVM as well. I have a tentative patch for that, but I'm waiting for Paul to send the necessary KVM bits upstream. * We should reset the TCE table on the emulated PCI host bridge as well as VIO devices. I just haven't yet had a chance to figure out the right place to wire in that reset yet. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 1/7] Allow QEMUMachine to override reset sequencing 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers David Gibson ` (5 subsequent siblings) 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson qemu_system_reset() function always performs the same basic actions on all machines. This includes running all the reset handler hooks, however the order in which these will run is not always easily predictable. This patch splits the core of qemu_system_reset() - the invocation of the reset handlers - out into a new qemu_devices_reset() function. qemu_system_reset() will usually call qemu_devices_reset(), but that can be now overriden by a new reset method in the QEMUMachine structure. Individual machines can use this reset method, if necessary, to perform any extra, machine specific initializations which have to occur before or after the bulk of the reset handlers. It's expected that the method will call qemu_devices_reset() at some point, but if the machine has really strange ordering requirements between devices resets it could even override that with it's own reset sequence (with great care, obviously). For a specific example of when this might be needed: a number of machines (but not PC) load images specified with -kernel or -initrd directly into the machine RAM before booting the guest. This mostly works at the moment, but to make this actually safe requires that this load occurs after peripheral devices are reset - otherwise they could have active DMAs in progress which would clobber the in memory images. Some machines (notably pseries) also have other entry conditions which need to be set up as the last thing before executing in guest space - some of this could be considered "emulated firmware" in the sense that the actions of the firmware are emulated directly by qemu rather than by executing a firmware image within the guest. When the platform's firmware to OS interface is sufficiently well specified, this saves time both in implementing the "firmware" and executing it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/boards.h | 3 +++ sysemu.h | 1 + vl.c | 11 ++++++++++- 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/boards.h b/hw/boards.h index 59c01d0..a2e0a54 100644 --- a/hw/boards.h +++ b/hw/boards.h @@ -12,11 +12,14 @@ typedef void QEMUMachineInitFunc(ram_addr_t ram_size, const char *initrd_filename, const char *cpu_model); +typedef void QEMUMachineResetFunc(void); + typedef struct QEMUMachine { const char *name; const char *alias; const char *desc; QEMUMachineInitFunc *init; + QEMUMachineResetFunc *reset; int use_scsi; int max_cpus; unsigned int no_serial:1, diff --git a/sysemu.h b/sysemu.h index 4669348..65552ac 100644 --- a/sysemu.h +++ b/sysemu.h @@ -62,6 +62,7 @@ int qemu_powerdown_requested(void); void qemu_system_killed(int signal, pid_t pid); void qemu_kill_report(void); extern qemu_irq qemu_system_powerdown; +void qemu_devices_reset(void); void qemu_system_reset(bool report); void qemu_add_exit_notifier(Notifier *notify); diff --git a/vl.c b/vl.c index d01256a..757d84a 100644 --- a/vl.c +++ b/vl.c @@ -1439,7 +1439,7 @@ void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) } } -void qemu_system_reset(bool report) +void qemu_devices_reset(void) { QEMUResetEntry *re, *nre; @@ -1447,6 +1447,15 @@ void qemu_system_reset(bool report) QTAILQ_FOREACH_SAFE(re, &reset_handlers, entry, nre) { re->func(re->opaque); } +} + +void qemu_system_reset(bool report) +{ + if (current_machine->reset) { + current_machine->reset(); + } else { + qemu_devices_reset(); + } if (report) { monitor_protocol_event(QEVENT_RESET, NULL); } -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 1/7] Allow QEMUMachine to override reset sequencing David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-17 13:58 ` Alexander Graf 2012-08-15 4:33 ` [Qemu-devel] [PATCH 3/7] pseries: Fix and cleanup CPU initialization and reset David Gibson ` (4 subsequent siblings) 6 siblings, 1 reply; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson At least when invoked with high enough 'level' arguments, kvm_arch_put_registers() is supposed to copy essentially all the cpu state as encoded in qemu's internal structures into the kvm state. Currently the ppc version does not do this - it never calls KVM_SET_SREGS, for example, and therefore never sets the SDR1 and various other important though rarely changed registers. Instead, the code paths which need to set these registers need to explicitly make (conditional) kvm calls which transfer the changes to kvm. This breaks the usual model of handling state updates in qemu, where code just changes the internal model and has it flushed out to kvm automatically at some later point. This patch fixes this for Book S ppc CPUs by adding a suitable call to KVM_SET_SREGS and als to KVM_SET_ONE_REG to set the HIOR (the only register that is set with that call so far). This lets us remove the hacks to explicitly set these registers from the kvmppc_set_papr() function. The problem still exists for Book E CPUs (which use a different version of the kvm_sregs structure). But fixing that has some complications of its own so can be left to another day. Lkewise, there is still some ugly code for setting the PVR through special calls to SET_SREGS which is left in for now. The PVR needs to be set especially early because it can affect what other features are available on the CPU, so I need to do more thinking to see if it can be integrated into the normal paths or not. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- target-ppc/kvm.c | 89 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 50 insertions(+), 39 deletions(-) diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index a31d278..1a7489b 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -60,6 +60,7 @@ static int cap_booke_sregs; static int cap_ppc_smt; static int cap_ppc_rma; static int cap_spapr_tce; +static int cap_hior; /* XXX We have a race condition where we actually have a level triggered * interrupt, but the infrastructure can't expose that yet, so the guest @@ -86,6 +87,7 @@ int kvm_arch_init(KVMState *s) cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT); cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA); cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE); + cap_hior = kvm_check_extension(s, KVM_CAP_PPC_HIOR); if (!cap_interrupt_level) { fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the " @@ -469,6 +471,53 @@ int kvm_arch_put_registers(CPUPPCState *env, int level) env->tlb_dirty = false; } + if (cap_segstate && (level >= KVM_PUT_RESET_STATE)) { + struct kvm_sregs sregs; + + sregs.pvr = env->spr[SPR_PVR]; + + sregs.u.s.sdr1 = env->spr[SPR_SDR1]; + + /* Sync SLB */ +#ifdef TARGET_PPC64 + for (i = 0; i < 64; i++) { + sregs.u.s.ppc64.slb[i].slbe = env->slb[i].esid; + sregs.u.s.ppc64.slb[i].slbv = env->slb[i].vsid; + } +#endif + + /* Sync SRs */ + for (i = 0; i < 16; i++) { + sregs.u.s.ppc32.sr[i] = env->sr[i]; + } + + /* Sync BATs */ + for (i = 0; i < 8; i++) { + sregs.u.s.ppc32.dbat[i] = ((uint64_t)env->DBAT[1][i] << 32) + | env->DBAT[0][i]; + sregs.u.s.ppc32.ibat[i] = ((uint64_t)env->IBAT[1][i] << 32) + | env->IBAT[0][i]; + } + + ret = kvm_vcpu_ioctl(env, KVM_SET_SREGS, &sregs); + if (ret) { + return ret; + } + } + + if (cap_hior && (level >= KVM_PUT_RESET_STATE)) { + uint64_t hior = env->spr[SPR_HIOR]; + struct kvm_one_reg reg = { + .id = KVM_REG_PPC_HIOR, + .addr = (uintptr_t) &hior, + }; + + ret = kvm_vcpu_ioctl(env, KVM_SET_ONE_REG, ®); + if (ret) { + return ret; + } + } + return ret; } @@ -946,52 +995,14 @@ int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len) void kvmppc_set_papr(CPUPPCState *env) { struct kvm_enable_cap cap = {}; - struct kvm_one_reg reg = {}; - struct kvm_sregs sregs = {}; int ret; - uint64_t hior = env->spr[SPR_HIOR]; cap.cap = KVM_CAP_PPC_PAPR; ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, &cap); if (ret) { - goto fail; - } - - /* - * XXX We set HIOR here. It really should be a qdev property of - * the CPU node, but we don't have CPUs converted to qdev yet. - * - * Once we have qdev CPUs, move HIOR to a qdev property and - * remove this chunk. - */ - reg.id = KVM_REG_PPC_HIOR; - reg.addr = (uintptr_t)&hior; - ret = kvm_vcpu_ioctl(env, KVM_SET_ONE_REG, ®); - if (ret) { - fprintf(stderr, "Couldn't set HIOR. Maybe you're running an old \n" - "kernel with support for HV KVM but no PAPR PR \n" - "KVM in which case things will work. If they don't \n" - "please update your host kernel!\n"); - } - - /* Set SDR1 so kernel space finds the HTAB */ - ret = kvm_vcpu_ioctl(env, KVM_GET_SREGS, &sregs); - if (ret) { - goto fail; - } - - sregs.u.s.sdr1 = env->spr[SPR_SDR1]; - - ret = kvm_vcpu_ioctl(env, KVM_SET_SREGS, &sregs); - if (ret) { - goto fail; + cpu_abort(env, "This KVM version does not support PAPR\n"); } - - return; - -fail: - cpu_abort(env, "This KVM version does not support PAPR\n"); } int kvmppc_smt_threads(void) -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers 2012-08-15 4:33 ` [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers David Gibson @ 2012-08-17 13:58 ` Alexander Graf 2012-08-17 16:24 ` [Qemu-devel] [Qemu-ppc] " David Gibson 0 siblings, 1 reply; 10+ messages in thread From: Alexander Graf @ 2012-08-17 13:58 UTC (permalink / raw) To: David Gibson; +Cc: qemu-ppc, qemu-devel, paulus On 08/15/2012 06:33 AM, David Gibson wrote: > At least when invoked with high enough 'level' arguments, > kvm_arch_put_registers() is supposed to copy essentially all the cpu state > as encoded in qemu's internal structures into the kvm state. Currently > the ppc version does not do this - it never calls KVM_SET_SREGS, for > example, and therefore never sets the SDR1 and various other important > though rarely changed registers. > > Instead, the code paths which need to set these registers need to > explicitly make (conditional) kvm calls which transfer the changes to kvm. > This breaks the usual model of handling state updates in qemu, where code > just changes the internal model and has it flushed out to kvm automatically > at some later point. > > This patch fixes this for Book S ppc CPUs by adding a suitable call to > KVM_SET_SREGS and als to KVM_SET_ONE_REG to set the HIOR (the only register > that is set with that call so far). This lets us remove the hacks to > explicitly set these registers from the kvmppc_set_papr() function. HIOR is a read-only register from the guest's point of view when running in PAPR mode, so we don't need to sync it back again. The same goes for SDR1, though resetting that is valid for non-PAPR guests. Overall, does a normal system reset on PPC guarantee that the SRs and SLBs are reset? At least OpenBIOS boots up in real mode and overwrites all SR/SLB entries while still in real mode. Alex > > The problem still exists for Book E CPUs (which use a different version of > the kvm_sregs structure). But fixing that has some complications of its > own so can be left to another day. > > Lkewise, there is still some ugly code for setting the PVR through special > calls to SET_SREGS which is left in for now. The PVR needs to be set > especially early because it can affect what other features are available > on the CPU, so I need to do more thinking to see if it can be integrated > into the normal paths or not. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > target-ppc/kvm.c | 89 ++++++++++++++++++++++++++++++------------------------ > 1 file changed, 50 insertions(+), 39 deletions(-) > > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c > index a31d278..1a7489b 100644 > --- a/target-ppc/kvm.c > +++ b/target-ppc/kvm.c > @@ -60,6 +60,7 @@ static int cap_booke_sregs; > static int cap_ppc_smt; > static int cap_ppc_rma; > static int cap_spapr_tce; > +static int cap_hior; > > /* XXX We have a race condition where we actually have a level triggered > * interrupt, but the infrastructure can't expose that yet, so the guest > @@ -86,6 +87,7 @@ int kvm_arch_init(KVMState *s) > cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT); > cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA); > cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE); > + cap_hior = kvm_check_extension(s, KVM_CAP_PPC_HIOR); > > if (!cap_interrupt_level) { > fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the " > @@ -469,6 +471,53 @@ int kvm_arch_put_registers(CPUPPCState *env, int level) > env->tlb_dirty = false; > } > > + if (cap_segstate && (level >= KVM_PUT_RESET_STATE)) { > + struct kvm_sregs sregs; > + > + sregs.pvr = env->spr[SPR_PVR]; > + > + sregs.u.s.sdr1 = env->spr[SPR_SDR1]; > + > + /* Sync SLB */ > +#ifdef TARGET_PPC64 > + for (i = 0; i < 64; i++) { > + sregs.u.s.ppc64.slb[i].slbe = env->slb[i].esid; > + sregs.u.s.ppc64.slb[i].slbv = env->slb[i].vsid; > + } > +#endif > + > + /* Sync SRs */ > + for (i = 0; i < 16; i++) { > + sregs.u.s.ppc32.sr[i] = env->sr[i]; > + } > + > + /* Sync BATs */ > + for (i = 0; i < 8; i++) { > + sregs.u.s.ppc32.dbat[i] = ((uint64_t)env->DBAT[1][i] << 32) > + | env->DBAT[0][i]; > + sregs.u.s.ppc32.ibat[i] = ((uint64_t)env->IBAT[1][i] << 32) > + | env->IBAT[0][i]; > + } > + > + ret = kvm_vcpu_ioctl(env, KVM_SET_SREGS, &sregs); > + if (ret) { > + return ret; > + } > + } > + > + if (cap_hior && (level >= KVM_PUT_RESET_STATE)) { > + uint64_t hior = env->spr[SPR_HIOR]; > + struct kvm_one_reg reg = { > + .id = KVM_REG_PPC_HIOR, > + .addr = (uintptr_t) &hior, > + }; > + > + ret = kvm_vcpu_ioctl(env, KVM_SET_ONE_REG, ®); > + if (ret) { > + return ret; > + } > + } > + > return ret; > } > > @@ -946,52 +995,14 @@ int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len) > void kvmppc_set_papr(CPUPPCState *env) > { > struct kvm_enable_cap cap = {}; > - struct kvm_one_reg reg = {}; > - struct kvm_sregs sregs = {}; > int ret; > - uint64_t hior = env->spr[SPR_HIOR]; > > cap.cap = KVM_CAP_PPC_PAPR; > ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, &cap); > > if (ret) { > - goto fail; > - } > - > - /* > - * XXX We set HIOR here. It really should be a qdev property of > - * the CPU node, but we don't have CPUs converted to qdev yet. > - * > - * Once we have qdev CPUs, move HIOR to a qdev property and > - * remove this chunk. > - */ > - reg.id = KVM_REG_PPC_HIOR; > - reg.addr = (uintptr_t)&hior; > - ret = kvm_vcpu_ioctl(env, KVM_SET_ONE_REG, ®); > - if (ret) { > - fprintf(stderr, "Couldn't set HIOR. Maybe you're running an old \n" > - "kernel with support for HV KVM but no PAPR PR \n" > - "KVM in which case things will work. If they don't \n" > - "please update your host kernel!\n"); > - } > - > - /* Set SDR1 so kernel space finds the HTAB */ > - ret = kvm_vcpu_ioctl(env, KVM_GET_SREGS, &sregs); > - if (ret) { > - goto fail; > - } > - > - sregs.u.s.sdr1 = env->spr[SPR_SDR1]; > - > - ret = kvm_vcpu_ioctl(env, KVM_SET_SREGS, &sregs); > - if (ret) { > - goto fail; > + cpu_abort(env, "This KVM version does not support PAPR\n"); > } > - > - return; > - > -fail: > - cpu_abort(env, "This KVM version does not support PAPR\n"); > } > > int kvmppc_smt_threads(void) ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers 2012-08-17 13:58 ` Alexander Graf @ 2012-08-17 16:24 ` David Gibson 0 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-17 16:24 UTC (permalink / raw) To: Alexander Graf; +Cc: paulus, qemu-ppc, qemu-devel On Fri, Aug 17, 2012 at 03:58:08PM +0200, Alexander Graf wrote: > On 08/15/2012 06:33 AM, David Gibson wrote: > >At least when invoked with high enough 'level' arguments, > >kvm_arch_put_registers() is supposed to copy essentially all the cpu state > >as encoded in qemu's internal structures into the kvm state. Currently > >the ppc version does not do this - it never calls KVM_SET_SREGS, for > >example, and therefore never sets the SDR1 and various other important > >though rarely changed registers. > > > >Instead, the code paths which need to set these registers need to > >explicitly make (conditional) kvm calls which transfer the changes to kvm. > >This breaks the usual model of handling state updates in qemu, where code > >just changes the internal model and has it flushed out to kvm automatically > >at some later point. > > > >This patch fixes this for Book S ppc CPUs by adding a suitable call to > >KVM_SET_SREGS and als to KVM_SET_ONE_REG to set the HIOR (the only register > >that is set with that call so far). This lets us remove the hacks to > >explicitly set these registers from the kvmppc_set_papr() function. > > HIOR is a read-only register from the guest's point of view when > running in PAPR mode, so we don't need to sync it back again. The > same goes for SDR1, though resetting that is valid for non-PAPR > guests. *When running in PAPR mode*, which we aren't always. System resets are so rare that there's really no point optimizing register sets out of it, so we might as well set HIOR and SDR1 on every reset, then it's correct both for PAPR and non-PAPR mode. > Overall, does a normal system reset on PPC guarantee that the SRs > and SLBs are reset? At least OpenBIOS boots up in real mode and > overwrites all SR/SLB entries while still in real mode. I don't know, but it's not really relevant here. What this does is make sure that KVM state is synced with qemu state on reset. That means that whatever reset handlers do to the qemu state - and that's what they usually operate on - will get reflected in KVM state when the guest executes again. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 3/7] pseries: Fix and cleanup CPU initialization and reset 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 1/7] Allow QEMUMachine to override reset sequencing David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 4/7] pseries: Use new method to correct reset sequence David Gibson ` (3 subsequent siblings) 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson The current pseries machine init function iterates over the CPUs at several points, doing various bits of initialization. This is messy; these can and should be merged into a single iteration doing all the necessary per cpu initialization. Worse, some of these initializations were setting up state which should be set on every reset, not just at machine init time. A few of the initializations simply weren't necessary at all. This patch, therefore, moves those things that need to be to the per-cpu reset handler, and combines the remainder into two loops over the cpus (which also creates them). The second loop is for setting up hash table information, and will be removed in a subsequent patch also making other fixes to the hash table setup. This exposes a bug in our start-cpu RTAS routine (called by the guest to start up CPUs other than CPU0) under kvm. Previously, this function did not make a call to ensure that it's changes to the new cpu's state were pushed into KVM in-kernel state. We sort-of got away with this because some of the initializations had already placed the secondary CPUs into the right starting state for the sorts of Linux guests we've been running. Nonetheless the start-cpu RTAS call's behaviour was not correct and could easily have been broken by guest changes. This patch also fixes it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/spapr.c | 33 +++++++++++++++++++-------------- hw/spapr_rtas.c | 5 +++++ 2 files changed, 24 insertions(+), 14 deletions(-) diff --git a/hw/spapr.c b/hw/spapr.c index 9efed0e..603674d 100644 --- a/hw/spapr.c +++ b/hw/spapr.c @@ -574,8 +574,15 @@ static void spapr_reset(void *opaque) static void spapr_cpu_reset(void *opaque) { PowerPCCPU *cpu = opaque; + CPUPPCState *env = &cpu->env; cpu_reset(CPU(cpu)); + + /* All CPUs start halted, except CPU0, the rest are explicitly + * started up by the guest using an RTAS call */ + env->halted = 1; + + env->spr[SPR_HIOR] = 0; } /* pSeries LPAR / sPAPR hardware init */ @@ -640,11 +647,16 @@ static void ppc_spapr_init(ram_addr_t ram_size, /* Set time-base frequency to 512 MHz */ cpu_ppc_tb_init(env, TIMEBASE_FREQ); - qemu_register_reset(spapr_cpu_reset, cpu); - env->hreset_vector = 0x60; + /* PAPR always has exception vectors in RAM not ROM */ env->hreset_excp_prefix = 0; - env->gpr[3] = env->cpu_index; + + /* Tell KVM that we're in PAPR mode */ + if (kvm_enabled()) { + kvmppc_set_papr(env); + } + + qemu_register_reset(spapr_cpu_reset, cpu); } /* allocate RAM */ @@ -660,7 +672,10 @@ static void ppc_spapr_init(ram_addr_t ram_size, /* allocate hash page table. For now we always make this 16mb, * later we should probably make it scale to the size of guest - * RAM */ + * RAM. FIXME: setting the htab information in the CPU env really + * belongs at CPU reset time, but we can get away with it for now + * because the PAPR guest is not permitted to write SDR1 so in + * fact these settings will never change during the run */ spapr->htab_size = 1ULL << (pteg_shift + 7); spapr->htab = qemu_memalign(spapr->htab_size, spapr->htab_size); @@ -672,11 +687,6 @@ static void ppc_spapr_init(ram_addr_t ram_size, /* Tell KVM that we're in PAPR mode */ env->spr[SPR_SDR1] = (unsigned long)spapr->htab | ((pteg_shift + 7) - 18); - env->spr[SPR_HIOR] = 0; - - if (kvm_enabled()) { - kvmppc_set_papr(env); - } } filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin"); @@ -788,11 +798,6 @@ static void ppc_spapr_init(ram_addr_t ram_size, spapr->entry_point = 0x100; - /* SLOF will startup the secondary CPUs using RTAS */ - for (env = first_cpu; env != NULL; env = env->next_cpu) { - env->halted = 1; - } - /* Prepare the device tree */ spapr->fdt_skel = spapr_create_fdt_skel(cpu_model, rma_size, initrd_base, initrd_size, diff --git a/hw/spapr_rtas.c b/hw/spapr_rtas.c index ae18595..b808f80 100644 --- a/hw/spapr_rtas.c +++ b/hw/spapr_rtas.c @@ -184,6 +184,11 @@ static void rtas_start_cpu(sPAPREnvironment *spapr, return; } + /* This will make sure qemu state is up to date with kvm, and + * mark it dirty so our changes get flushed back before the + * new cpu enters */ + kvm_cpu_synchronize_state(env); + env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME); env->nip = start; env->gpr[3] = r3; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 4/7] pseries: Use new method to correct reset sequence 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson ` (2 preceding siblings ...) 2012-08-15 4:33 ` [Qemu-devel] [PATCH 3/7] pseries: Fix and cleanup CPU initialization and reset David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 5/7] pseries: Add support for new KVM hash table control call David Gibson ` (2 subsequent siblings) 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson A number of things need to occur during reset of the PAPR paravirtualized platform in a specific order. For example, the hash table needs to be cleared before the CPUs are reset, so that they initialize their register state correctly, and the CPUs need to have their main reset called before we set up the entry point state on the boot cpu. We also need to have the main qdev reset happen before the creation and installation of the device tree for the new boot, because we need the state of the devices settled to correctly construct the device tree. We currently do the pseries once-per-reset initializations done from a reset handler. However we can't adequately control when this handler is called during the reset - in particular we can't guarantee it happens after all the qdev resets (since qdevs might be registered after the machine init function has executed). This patch uses the new QEMUMachine reset method to to fix this problem, ensuring the various order dependent reset steps happen in the correct order. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/spapr.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/hw/spapr.c b/hw/spapr.c index 603674d..f515e02 100644 --- a/hw/spapr.c +++ b/hw/spapr.c @@ -552,13 +552,13 @@ static void emulate_spapr_hypercall(CPUPPCState *env) env->gpr[3] = spapr_hypercall(env, env->gpr[3], &env->gpr[4]); } -static void spapr_reset(void *opaque) +static void ppc_spapr_reset(void) { - sPAPREnvironment *spapr = (sPAPREnvironment *)opaque; - /* flush out the hash table */ memset(spapr->htab, 0, spapr->htab_size); + qemu_devices_reset(); + /* Load the fdt */ spapr_finalize_fdt(spapr, spapr->fdt_addr, spapr->rtas_addr, spapr->rtas_size); @@ -805,14 +805,13 @@ static void ppc_spapr_init(ram_addr_t ram_size, boot_device, kernel_cmdline, pteg_shift + 7); assert(spapr->fdt_skel != NULL); - - qemu_register_reset(spapr_reset, spapr); } static QEMUMachine spapr_machine = { .name = "pseries", .desc = "pSeries Logical Partition (PAPR compliant)", .init = ppc_spapr_init, + .reset = ppc_spapr_reset, .max_cpus = MAX_CPUS, .no_parallel = 1, .use_scsi = 1, -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 5/7] pseries: Add support for new KVM hash table control call 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson ` (3 preceding siblings ...) 2012-08-15 4:33 ` [Qemu-devel] [PATCH 4/7] pseries: Use new method to correct reset sequence David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 6/7] pseries: Clear TCE state when resetting PAPR VIO devices David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 7/7] ppc/pseries: Reset VPA registration on CPU reset David Gibson 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson From: Ben Herrenschmidt <benh@kernel.crashing.org> This adds support for then new "reset htab" ioctl which allows qemu to properly cleanup the MMU hash table when the guest is reset. With the corresponding kernel support, reset of a guest now works properly. This also paves the way for indicating a different size hash table to the kernel and for the kernel to be able to impose limits on the requested size. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/spapr.c | 274 +++++++++++++++++++++++++++++--------------------- hw/spapr.h | 4 +- target-ppc/kvm.c | 29 ++++++ target-ppc/kvm_ppc.h | 19 ++++ 4 files changed, 213 insertions(+), 113 deletions(-) diff --git a/hw/spapr.c b/hw/spapr.c index f515e02..2d8303b 100644 --- a/hw/spapr.c +++ b/hw/spapr.c @@ -83,6 +83,8 @@ #define PHANDLE_XICP 0x00001111 +#define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift)) + sPAPREnvironment *spapr; int spapr_allocate_irq(int hint, enum xics_irq_type type) @@ -132,12 +134,13 @@ int spapr_allocate_irq_block(int num, enum xics_irq_type type) return first; } -static int spapr_set_associativity(void *fdt, sPAPREnvironment *spapr) +static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr) { int ret = 0, offset; CPUPPCState *env; char cpu_model[32]; int smt = kvmppc_smt_threads(); + uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)}; assert(spapr->cpu_model); @@ -161,8 +164,16 @@ static int spapr_set_associativity(void *fdt, sPAPREnvironment *spapr) return offset; } - ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity, - sizeof(associativity)); + if (nb_numa_nodes > 1) { + ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity, + sizeof(associativity)); + if (ret < 0) { + return ret; + } + } + + ret = fdt_setprop(fdt, offset, "ibm,pft-size", + pft_size_prop, sizeof(pft_size_prop)); if (ret < 0) { return ret; } @@ -204,45 +215,36 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop, return (p - prop) * sizeof(uint32_t); } +#define _FDT(exp) \ + do { \ + int ret = (exp); \ + if (ret < 0) { \ + fprintf(stderr, "qemu: error creating device tree: %s: %s\n", \ + #exp, fdt_strerror(ret)); \ + exit(1); \ + } \ + } while (0) + + static void *spapr_create_fdt_skel(const char *cpu_model, - target_phys_addr_t rma_size, target_phys_addr_t initrd_base, target_phys_addr_t initrd_size, target_phys_addr_t kernel_size, const char *boot_device, - const char *kernel_cmdline, - long hash_shift) + const char *kernel_cmdline) { void *fdt; CPUPPCState *env; - uint64_t mem_reg_property[2]; uint32_t start_prop = cpu_to_be32(initrd_base); uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size); - uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)}; char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt" "\0hcall-tce\0hcall-vio\0hcall-splpar\0hcall-bulk"; char qemu_hypertas_prop[] = "hcall-memop1"; + uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)}; uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)}; - int i; char *modelname; - int smt = kvmppc_smt_threads(); + int i, smt = kvmppc_smt_threads(); unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80}; - uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)}; - uint32_t associativity[] = {cpu_to_be32(0x4), cpu_to_be32(0x0), - cpu_to_be32(0x0), cpu_to_be32(0x0), - cpu_to_be32(0x0)}; - char mem_name[32]; - target_phys_addr_t node0_size, mem_start; - -#define _FDT(exp) \ - do { \ - int ret = (exp); \ - if (ret < 0) { \ - fprintf(stderr, "qemu: error creating device tree: %s: %s\n", \ - #exp, fdt_strerror(ret)); \ - exit(1); \ - } \ - } while (0) fdt = g_malloc0(FDT_MAX_SIZE); _FDT((fdt_create(fdt, FDT_MAX_SIZE))); @@ -284,55 +286,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model, _FDT((fdt_end_node(fdt))); - /* memory node(s) */ - node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size; - if (rma_size > node0_size) { - rma_size = node0_size; - } - - /* RMA */ - mem_reg_property[0] = 0; - mem_reg_property[1] = cpu_to_be64(rma_size); - _FDT((fdt_begin_node(fdt, "memory@0"))); - _FDT((fdt_property_string(fdt, "device_type", "memory"))); - _FDT((fdt_property(fdt, "reg", mem_reg_property, - sizeof(mem_reg_property)))); - _FDT((fdt_property(fdt, "ibm,associativity", associativity, - sizeof(associativity)))); - _FDT((fdt_end_node(fdt))); - - /* RAM: Node 0 */ - if (node0_size > rma_size) { - mem_reg_property[0] = cpu_to_be64(rma_size); - mem_reg_property[1] = cpu_to_be64(node0_size - rma_size); - - sprintf(mem_name, "memory@" TARGET_FMT_lx, rma_size); - _FDT((fdt_begin_node(fdt, mem_name))); - _FDT((fdt_property_string(fdt, "device_type", "memory"))); - _FDT((fdt_property(fdt, "reg", mem_reg_property, - sizeof(mem_reg_property)))); - _FDT((fdt_property(fdt, "ibm,associativity", associativity, - sizeof(associativity)))); - _FDT((fdt_end_node(fdt))); - } - - /* RAM: Node 1 and beyond */ - mem_start = node0_size; - for (i = 1; i < nb_numa_nodes; i++) { - mem_reg_property[0] = cpu_to_be64(mem_start); - mem_reg_property[1] = cpu_to_be64(node_mem[i]); - associativity[3] = associativity[4] = cpu_to_be32(i); - sprintf(mem_name, "memory@" TARGET_FMT_lx, mem_start); - _FDT((fdt_begin_node(fdt, mem_name))); - _FDT((fdt_property_string(fdt, "device_type", "memory"))); - _FDT((fdt_property(fdt, "reg", mem_reg_property, - sizeof(mem_reg_property)))); - _FDT((fdt_property(fdt, "ibm,associativity", associativity, - sizeof(associativity)))); - _FDT((fdt_end_node(fdt))); - mem_start += node_mem[i]; - } - /* cpus */ _FDT((fdt_begin_node(fdt, "cpus"))); @@ -384,8 +337,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model, _FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq))); _FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq))); _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr))); - _FDT((fdt_property(fdt, "ibm,pft-size", - pft_size_prop, sizeof(pft_size_prop)))); _FDT((fdt_property_string(fdt, "status", "okay"))); _FDT((fdt_property(fdt, "64-bit", NULL, 0))); @@ -484,6 +435,68 @@ static void *spapr_create_fdt_skel(const char *cpu_model, return fdt; } +static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt) +{ + uint32_t associativity[] = {cpu_to_be32(0x4), cpu_to_be32(0x0), + cpu_to_be32(0x0), cpu_to_be32(0x0), + cpu_to_be32(0x0)}; + char mem_name[32]; + target_phys_addr_t node0_size, mem_start; + uint64_t mem_reg_property[2]; + int i, off; + + /* memory node(s) */ + node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size; + if (spapr->rma_size > node0_size) { + spapr->rma_size = node0_size; + } + + /* RMA */ + mem_reg_property[0] = 0; + mem_reg_property[1] = cpu_to_be64(spapr->rma_size); + off = fdt_add_subnode(fdt, 0, "memory@0"); + _FDT(off); + _FDT((fdt_setprop_string(fdt, off, "device_type", "memory"))); + _FDT((fdt_setprop(fdt, off, "reg", mem_reg_property, + sizeof(mem_reg_property)))); + _FDT((fdt_setprop(fdt, off, "ibm,associativity", associativity, + sizeof(associativity)))); + + /* RAM: Node 0 */ + if (node0_size > spapr->rma_size) { + mem_reg_property[0] = cpu_to_be64(spapr->rma_size); + mem_reg_property[1] = cpu_to_be64(node0_size - spapr->rma_size); + + sprintf(mem_name, "memory@" TARGET_FMT_lx, spapr->rma_size); + off = fdt_add_subnode(fdt, 0, mem_name); + _FDT(off); + _FDT((fdt_setprop_string(fdt, off, "device_type", "memory"))); + _FDT((fdt_setprop(fdt, off, "reg", mem_reg_property, + sizeof(mem_reg_property)))); + _FDT((fdt_setprop(fdt, off, "ibm,associativity", associativity, + sizeof(associativity)))); + } + + /* RAM: Node 1 and beyond */ + mem_start = node0_size; + for (i = 1; i < nb_numa_nodes; i++) { + mem_reg_property[0] = cpu_to_be64(mem_start); + mem_reg_property[1] = cpu_to_be64(node_mem[i]); + associativity[3] = associativity[4] = cpu_to_be32(i); + sprintf(mem_name, "memory@" TARGET_FMT_lx, mem_start); + off = fdt_add_subnode(fdt, 0, mem_name); + _FDT(off); + _FDT((fdt_setprop_string(fdt, off, "device_type", "memory"))); + _FDT((fdt_setprop(fdt, off, "reg", mem_reg_property, + sizeof(mem_reg_property)))); + _FDT((fdt_setprop(fdt, off, "ibm,associativity", associativity, + sizeof(associativity)))); + mem_start += node_mem[i]; + } + + return 0; +} + static void spapr_finalize_fdt(sPAPREnvironment *spapr, target_phys_addr_t fdt_addr, target_phys_addr_t rtas_addr, @@ -498,6 +511,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr, /* open out the base tree into a temp buffer for the final tweaks */ _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE))); + ret = spapr_populate_memory(spapr, fdt); + if (ret < 0) { + fprintf(stderr, "couldn't setup memory nodes in fdt\n"); + exit(1); + } + ret = spapr_populate_vdevice(spapr->vio_bus, fdt); if (ret < 0) { fprintf(stderr, "couldn't setup vio devices in fdt\n"); @@ -520,11 +539,9 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr, } /* Advertise NUMA via ibm,associativity */ - if (nb_numa_nodes > 1) { - ret = spapr_set_associativity(fdt, spapr); - if (ret < 0) { - fprintf(stderr, "Couldn't set up NUMA device tree properties\n"); - } + ret = spapr_fixup_cpu_dt(fdt, spapr); + if (ret < 0) { + fprintf(stderr, "Couldn't finalize CPU device tree properties\n"); } spapr_populate_chosen_stdout(fdt, spapr->vio_bus); @@ -552,10 +569,39 @@ static void emulate_spapr_hypercall(CPUPPCState *env) env->gpr[3] = spapr_hypercall(env, env->gpr[3], &env->gpr[4]); } +static void spapr_reset_htab(sPAPREnvironment *spapr) +{ + long shift; + + /* allocate hash page table. For now we always make this 16mb, + * later we should probably make it scale to the size of guest + * RAM */ + + shift = kvmppc_reset_htab(spapr->htab_shift); + + if (shift > 0) { + /* Kernel handles htab, we don't need to allocate one */ + spapr->htab_shift = shift; + } else { + if (!spapr->htab) { + /* Allocate an htab if we don't yet have one */ + spapr->htab = qemu_memalign(HTAB_SIZE(spapr), HTAB_SIZE(spapr)); + } + + /* And clear it */ + memset(spapr->htab, 0, HTAB_SIZE(spapr)); + } + + /* Update the RMA size if necessary */ + if (spapr->vrma_adjust) { + spapr->rma_size = kvmppc_rma_size(ram_size, spapr->htab_shift); + } +} + static void ppc_spapr_reset(void) { - /* flush out the hash table */ - memset(spapr->htab, 0, spapr->htab_size); + /* Reset the hash table & recalc the RMA */ + spapr_reset_htab(spapr); qemu_devices_reset(); @@ -583,6 +629,12 @@ static void spapr_cpu_reset(void *opaque) env->halted = 1; env->spr[SPR_HIOR] = 0; + + env->external_htab = spapr->htab; + env->htab_base = -1; + env->htab_mask = HTAB_SIZE(spapr) - 1; + env->spr[SPR_SDR1] = (unsigned long)spapr->htab | + (spapr->htab_shift - 18); } /* pSeries LPAR / sPAPR hardware init */ @@ -598,11 +650,10 @@ static void ppc_spapr_init(ram_addr_t ram_size, int i; MemoryRegion *sysmem = get_system_memory(); MemoryRegion *ram = g_new(MemoryRegion, 1); - target_phys_addr_t rma_alloc_size, rma_size; + target_phys_addr_t rma_alloc_size; uint32_t initrd_base = 0; long kernel_size = 0, initrd_size = 0; long load_limit, rtas_limit, fw_size; - long pteg_shift = 17; char *filename; msi_supported = true; @@ -619,20 +670,39 @@ static void ppc_spapr_init(ram_addr_t ram_size, hw_error("qemu: Unable to create RMA\n"); exit(1); } + if (rma_alloc_size && (rma_alloc_size < ram_size)) { - rma_size = rma_alloc_size; + spapr->rma_size = rma_alloc_size; } else { - rma_size = ram_size; + spapr->rma_size = ram_size; + + /* With KVM, we don't actually know whether KVM supports an + * unbounded RMA (PR KVM) or is limited by the hash table size + * (HV KVM using VRMA), so we always assume the latter + * + * In that case, we also limit the initial allocations for RTAS + * etc... to 256M since we have no way to know what the VRMA size + * is going to be as it depends on the size of the hash table + * isn't determined yet. + */ + if (kvm_enabled()) { + spapr->vrma_adjust = 1; + spapr->rma_size = MIN(spapr->rma_size, 0x10000000); + } } /* We place the device tree and RTAS just below either the top of the RMA, * or just below 2GB, whichever is lowere, so that it can be * processed with 32-bit real mode code if necessary */ - rtas_limit = MIN(rma_size, 0x80000000); + rtas_limit = MIN(spapr->rma_size, 0x80000000); spapr->rtas_addr = rtas_limit - RTAS_MAX_SIZE; spapr->fdt_addr = spapr->rtas_addr - FDT_MAX_SIZE; load_limit = spapr->fdt_addr - FW_OVERHEAD; + /* For now, always aim for a 16MB hash table */ + /* FIXME: we should change this default based on RAM size */ + spapr->htab_shift = 24; + /* init CPUs */ if (cpu_model == NULL) { cpu_model = kvm_enabled() ? "host" : "POWER7"; @@ -670,25 +740,6 @@ static void ppc_spapr_init(ram_addr_t ram_size, memory_region_add_subregion(sysmem, nonrma_base, ram); } - /* allocate hash page table. For now we always make this 16mb, - * later we should probably make it scale to the size of guest - * RAM. FIXME: setting the htab information in the CPU env really - * belongs at CPU reset time, but we can get away with it for now - * because the PAPR guest is not permitted to write SDR1 so in - * fact these settings will never change during the run */ - spapr->htab_size = 1ULL << (pteg_shift + 7); - spapr->htab = qemu_memalign(spapr->htab_size, spapr->htab_size); - - for (env = first_cpu; env != NULL; env = env->next_cpu) { - env->external_htab = spapr->htab; - env->htab_base = -1; - env->htab_mask = spapr->htab_size - 1; - - /* Tell KVM that we're in PAPR mode */ - env->spr[SPR_SDR1] = (unsigned long)spapr->htab | - ((pteg_shift + 7) - 18); - } - filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin"); spapr->rtas_size = load_image_targphys(filename, spapr->rtas_addr, rtas_limit - spapr->rtas_addr); @@ -747,7 +798,7 @@ static void ppc_spapr_init(ram_addr_t ram_size, spapr_vscsi_create(spapr->vio_bus); } - if (rma_size < (MIN_RMA_SLOF << 20)) { + if (spapr->rma_size < (MIN_RMA_SLOF << 20)) { fprintf(stderr, "qemu: pSeries SLOF firmware requires >= " "%ldM guest RMA (Real Mode Area memory)\n", MIN_RMA_SLOF); exit(1); @@ -799,11 +850,10 @@ static void ppc_spapr_init(ram_addr_t ram_size, spapr->entry_point = 0x100; /* Prepare the device tree */ - spapr->fdt_skel = spapr_create_fdt_skel(cpu_model, rma_size, + spapr->fdt_skel = spapr_create_fdt_skel(cpu_model, initrd_base, initrd_size, kernel_size, - boot_device, kernel_cmdline, - pteg_shift + 7); + boot_device, kernel_cmdline); assert(spapr->fdt_skel != NULL); } diff --git a/hw/spapr.h b/hw/spapr.h index 5d76c5e..4a155ac 100644 --- a/hw/spapr.h +++ b/hw/spapr.h @@ -15,7 +15,9 @@ typedef struct sPAPREnvironment { target_phys_addr_t ram_limit; void *htab; - long htab_size; + long htab_shift; + target_phys_addr_t rma_size; + int vrma_adjust; target_phys_addr_t fdt_addr, rtas_addr; long rtas_size; void *fdt_skel; diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 1a7489b..546c116 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -1010,6 +1010,7 @@ int kvmppc_smt_threads(void) return cap_ppc_smt ? cap_ppc_smt : 1; } +#ifdef TARGET_PPC64 off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem) { void *rma; @@ -1053,6 +1054,16 @@ off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem) return size; } +uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift) +{ + if (cap_ppc_rma >= 2) { + return current_size; + } + return MIN(current_size, + getrampagesize() << (hash_shift - 7)); +} +#endif + void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd) { struct kvm_create_spapr_tce args = { @@ -1112,6 +1123,24 @@ int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size) return 0; } +int kvmppc_reset_htab(int shift_hint) +{ + uint32_t shift = shift_hint; + + if (kvm_enabled() && + kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) { + int ret; + ret = kvm_vm_ioctl(kvm_state, KVM_PPC_ALLOCATE_HTAB, &shift); + if (ret < 0) { + return ret; + } + return shift; + } + + /* For now.. */ + return 0; +} + static inline uint32_t mfpvr(void) { uint32_t pvr; diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h index e2f8703..baad6eb 100644 --- a/target-ppc/kvm_ppc.h +++ b/target-ppc/kvm_ppc.h @@ -27,6 +27,8 @@ int kvmppc_smt_threads(void); off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem); void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd); int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size); +int kvmppc_reset_htab(int shift_hint); +uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift); #endif /* !CONFIG_USER_ONLY */ const ppc_def_t *kvmppc_host_cpu_def(void); int kvmppc_fixup_cpu(CPUPPCState *env); @@ -94,6 +96,23 @@ static inline int kvmppc_remove_spapr_tce(void *table, int pfd, { return -1; } + +static inline int kvmppc_reset_htab(int shift_hint) +{ + return -1; +} + +static inline uint64_t kvmppc_rma_size(uint64_t current_size, + unsigned int hash_shift) +{ + return ram_size; +} + +static inline int kvmppc_update_sdr1(CPUPPCState *env) +{ + return 0; +} + #endif /* !CONFIG_USER_ONLY */ static inline const ppc_def_t *kvmppc_host_cpu_def(void) -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 6/7] pseries: Clear TCE state when resetting PAPR VIO devices 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson ` (4 preceding siblings ...) 2012-08-15 4:33 ` [Qemu-devel] [PATCH 5/7] pseries: Add support for new KVM hash table control call David Gibson @ 2012-08-15 4:33 ` David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 7/7] ppc/pseries: Reset VPA registration on CPU reset David Gibson 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson When we reset the system, the reset method for VIO bus devices resets the state of their request queue (if present) as it should. However it was not resetting the state of their TCE table (DMA translation) if present. This patch corrects that bug, and also removes some small code duplication in the reset paths. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/spapr_vio.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c index 7ca4452..298e239 100644 --- a/hw/spapr_vio.c +++ b/hw/spapr_vio.c @@ -324,9 +324,7 @@ static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev) } dev->dma = spapr_tce_new_dma_context(liobn, pc->rtce_window_size); - dev->crq.qladdr = 0; - dev->crq.qsize = 0; - dev->crq.qnext = 0; + free_crq(dev); } static void rtas_set_tce_bypass(sPAPREnvironment *spapr, uint32_t token, @@ -409,9 +407,8 @@ static void spapr_vio_busdev_reset(DeviceState *qdev) VIOsPAPRDevice *dev = DO_UPCAST(VIOsPAPRDevice, qdev, qdev); VIOsPAPRDeviceClass *pc = VIO_SPAPR_DEVICE_GET_CLASS(dev); - if (dev->crq.qsize) { - free_crq(dev); - } + /* Shut down the request queue and TCEs if necessary */ + spapr_vio_quiesce_one(dev); if (pc->reset) { pc->reset(dev); -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Qemu-devel] [PATCH 7/7] ppc/pseries: Reset VPA registration on CPU reset 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson ` (5 preceding siblings ...) 2012-08-15 4:33 ` [Qemu-devel] [PATCH 6/7] pseries: Clear TCE state when resetting PAPR VIO devices David Gibson @ 2012-08-15 4:33 ` David Gibson 6 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2012-08-15 4:33 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, paulus, David Gibson The ppc specific CPU state contains several variables which track the VPA, SLB shadow and dispatch trace log. These are structures shared between OS and hypervisor that are used on the pseries machine to track various per-CPU quantities. The address of these structures needs to be registered by the guest on each boot, however currently this registration is not cleared when we reset the cpu. This patch corrects this bug. Well, it corrects it for the full emu case anyway. To fix the KVM case, we need some KVM extensions to actually make the KVM internal VPA registration accessible to qemu. Patches coming for qemu once the kernel updates are merged. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- target-ppc/translate_init.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 6fe4168..2bc4a3f 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -10423,6 +10423,14 @@ static void ppc_cpu_reset(CPUState *s) env->pending_interrupts = 0; env->exception_index = POWERPC_EXCP_NONE; env->error_code = 0; + +#if defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY) + env->vpa = 0; + env->slb_shadow = 0; + env->dispatch_trace_log = 0; + env->dtl_size = 0; +#endif /* TARGET_PPC64 */ + /* Flush all TLBs */ tlb_flush(env, 1); } -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-08-18 8:31 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-08-15 4:33 [Qemu-devel] [0/7] pseries: Patches to fix system reset David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 1/7] Allow QEMUMachine to override reset sequencing David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 2/7] ppc: Make kvm_arch_put_registers() put *all* the registers David Gibson 2012-08-17 13:58 ` Alexander Graf 2012-08-17 16:24 ` [Qemu-devel] [Qemu-ppc] " David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 3/7] pseries: Fix and cleanup CPU initialization and reset David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 4/7] pseries: Use new method to correct reset sequence David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 5/7] pseries: Add support for new KVM hash table control call David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 6/7] pseries: Clear TCE state when resetting PAPR VIO devices David Gibson 2012-08-15 4:33 ` [Qemu-devel] [PATCH 7/7] ppc/pseries: Reset VPA registration on CPU reset David Gibson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).