* [Qemu-devel] [PATCH 1/4] pseries: Set hash table size based on RAM size
2012-09-21 3:42 [Qemu-devel] [0/4] pseries patches for comment David Gibson
@ 2012-09-21 3:42 ` David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 2/4] target-ppc: Remove unused power_mode field from cpu state David Gibson
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2012-09-21 3:42 UTC (permalink / raw)
To: agraf; +Cc: qemu-ppc, qemu-devel, David Gibson
Currently the pseries machine code always attempts to set the size of the
guests's hash page table to 16MB. However, because of the way the POWER
MMU works, a suitable hash page table size should really depend on memory
size. 16MB will be excessive for guests with <1GB and RAM, and may not be
enough for guests with >2GB of RAM (depending on guest page size and
other factors).
The usual given rule of thumb is that the hash table should be 1/64 of
the size of memory, but in fact the Linux guests we are aiming at don't
really need that much. This patch, therefore, changes the hash table
allocation code to aim for 1/128 of the size of RAM (rounding up). When
using KVM, this size may still be adjusted by the host kernel if it is
unable to allocate a suitable (contiguous) table.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
This patch can only be safely applied after my earlier patch
"target-ppc: KVM: Fix some kernel version edge cases for
kvmppc_reset_htab()" (already in ppc-next), otherwise the kernel can
end up advertising to the guest a hash table size different from that
actually allocated by the kernel.
---
hw/spapr.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/hw/spapr.c b/hw/spapr.c
index 1177efa..a8bd3c1 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -725,9 +725,16 @@ static void ppc_spapr_init(ram_addr_t ram_size,
spapr->fdt_addr = spapr->rtas_addr - FDT_MAX_SIZE;
load_limit = spapr->fdt_addr - FW_OVERHEAD;
- /* For now, always aim for a 16MB hash table */
- /* FIXME: we should change this default based on RAM size */
- spapr->htab_shift = 24;
+ /* We aim for a hash table of size 1/128 the size of RAM. The
+ * normal rule of thumb is 1/64 the size of RAM, but that's much
+ * more than needed for the Linux guests we support. */
+ spapr->htab_shift = 18; /* Minimum architected size */
+ while (spapr->htab_shift <= 46) {
+ if ((1ULL << (spapr->htab_shift + 7)) >= ram_size) {
+ break;
+ }
+ spapr->htab_shift++;
+ }
/* init CPUs */
if (cpu_model == NULL) {
--
1.7.10.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 2/4] target-ppc: Remove unused power_mode field from cpu state
2012-09-21 3:42 [Qemu-devel] [0/4] pseries patches for comment David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 1/4] pseries: Set hash table size based on RAM size David Gibson
@ 2012-09-21 3:42 ` David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 3/4] target-ppc: Extend FPU state for newer POWER CPUs David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 4/4] pseries: Remove unnecessary locking from PAPR hash table hcalls David Gibson
3 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2012-09-21 3:42 UTC (permalink / raw)
To: agraf; +Cc: qemu-ppc, qemu-devel, David Gibson
CPUPPCState includes a variable 'power_mode' which is used nowhere. This
patch removes it. This includes saving a dummy zero in its place during
vmsave, to avoid breaking the save format.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
target-ppc/cpu.h | 1 -
target-ppc/machine.c | 4 ++--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index ca2fc21..faf4404 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1079,7 +1079,6 @@ struct CPUPPCState {
int mmu_idx; /* precomputed MMU index to speed up mem accesses */
/* Power management */
- int power_mode;
int (*check_pow)(CPUPPCState *env);
#if !defined(CONFIG_USER_ONLY)
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index d6c2ee4..21ce757 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -82,7 +82,7 @@ void cpu_save(QEMUFile *f, void *opaque)
qemu_put_betls(f, &env->hflags);
qemu_put_betls(f, &env->hflags_nmsr);
qemu_put_sbe32s(f, &env->mmu_idx);
- qemu_put_sbe32s(f, &env->power_mode);
+ qemu_put_sbe32(f, 0);
}
int cpu_load(QEMUFile *f, void *opaque, int version_id)
@@ -167,7 +167,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
qemu_get_betls(f, &env->hflags);
qemu_get_betls(f, &env->hflags_nmsr);
qemu_get_sbe32s(f, &env->mmu_idx);
- qemu_get_sbe32s(f, &env->power_mode);
+ qemu_get_sbe32(f); /* Discard unused power_mode */
return 0;
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 3/4] target-ppc: Extend FPU state for newer POWER CPUs
2012-09-21 3:42 [Qemu-devel] [0/4] pseries patches for comment David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 1/4] pseries: Set hash table size based on RAM size David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 2/4] target-ppc: Remove unused power_mode field from cpu state David Gibson
@ 2012-09-21 3:42 ` David Gibson
2012-09-21 3:42 ` [Qemu-devel] [PATCH 4/4] pseries: Remove unnecessary locking from PAPR hash table hcalls David Gibson
3 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2012-09-21 3:42 UTC (permalink / raw)
To: agraf; +Cc: qemu-ppc, qemu-devel, David Gibson
This patch adds some extra FPU state to CPUPPCState. Specifically, fpscr
is extended to 64 bits, since some recent CPUs now have more status bits
than fit inside 64 bits, and we add the 32 VSR registers present on CPUs
with VSX (these extend the standard FP regs, which together with the
Altivec/VMX registers form a 64 x 128bit register file for VSX).
We don't actually support the instructions using these extra registers in
TCG yet, but we still a place to store the state so we can sync it with
KVM and savevm/loadvm it. This patch updates the savevm code to not
fail on the extended state, but also does not actually save it - that's
a project for another patch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
target-ppc/cpu.h | 4 +++-
target-ppc/machine.c | 8 ++++++--
target-ppc/translate.c | 2 +-
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index faf4404..846778f 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -963,7 +963,7 @@ struct CPUPPCState {
/* floating point registers */
float64 fpr[32];
/* floating point status and control register */
- uint32_t fpscr;
+ uint64_t fpscr;
/* Next instruction pointer */
target_ulong nip;
@@ -1014,6 +1014,8 @@ struct CPUPPCState {
/* Altivec registers */
ppc_avr_t avr[32];
uint32_t vscr;
+ /* VSX registers */
+ uint64_t vsr[32];
/* SPE registers */
uint64_t spe_acc;
uint32_t spe_fscr;
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 21ce757..5e7bc00 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -6,6 +6,7 @@ void cpu_save(QEMUFile *f, void *opaque)
{
CPUPPCState *env = (CPUPPCState *)opaque;
unsigned int i, j;
+ uint32_t fpscr;
for (i = 0; i < 32; i++)
qemu_put_betls(f, &env->gpr[i]);
@@ -30,7 +31,8 @@ void cpu_save(QEMUFile *f, void *opaque)
u.d = env->fpr[i];
qemu_put_be64(f, u.l);
}
- qemu_put_be32s(f, &env->fpscr);
+ fpscr = env->fpscr;
+ qemu_put_be32s(f, &fpscr);
qemu_put_sbe32s(f, &env->access_type);
#if defined(TARGET_PPC64)
qemu_put_betls(f, &env->asr);
@@ -90,6 +92,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
CPUPPCState *env = (CPUPPCState *)opaque;
unsigned int i, j;
target_ulong sdr1;
+ uint32_t fpscr;
for (i = 0; i < 32; i++)
qemu_get_betls(f, &env->gpr[i]);
@@ -114,7 +117,8 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
u.l = qemu_get_be64(f);
env->fpr[i] = u.d;
}
- qemu_get_be32s(f, &env->fpscr);
+ qemu_get_be32s(f, &fpscr);
+ env->fpscr = fpscr;
qemu_get_sbe32s(f, &env->access_type);
#if defined(TARGET_PPC64)
qemu_get_betls(f, &env->asr);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ac915cc..c8122b7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -9463,7 +9463,7 @@ void cpu_dump_state (CPUPPCState *env, FILE *f, fprintf_function cpu_fprintf,
if ((i & (RFPL - 1)) == (RFPL - 1))
cpu_fprintf(f, "\n");
}
- cpu_fprintf(f, "FPSCR %08x\n", env->fpscr);
+ cpu_fprintf(f, "FPSCR %08" PRIx64 "\n", env->fpscr);
#if !defined(CONFIG_USER_ONLY)
cpu_fprintf(f, " SRR0 " TARGET_FMT_lx " SRR1 " TARGET_FMT_lx
" PVR " TARGET_FMT_lx " VRSAVE " TARGET_FMT_lx "\n",
--
1.7.10.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 4/4] pseries: Remove unnecessary locking from PAPR hash table hcalls
2012-09-21 3:42 [Qemu-devel] [0/4] pseries patches for comment David Gibson
` (2 preceding siblings ...)
2012-09-21 3:42 ` [Qemu-devel] [PATCH 3/4] target-ppc: Extend FPU state for newer POWER CPUs David Gibson
@ 2012-09-21 3:42 ` David Gibson
2012-09-24 13:57 ` Alexander Graf
3 siblings, 1 reply; 6+ messages in thread
From: David Gibson @ 2012-09-21 3:42 UTC (permalink / raw)
To: agraf; +Cc: qemu-ppc, qemu-devel, David Gibson
In the paravirtualized environment provided by PAPR, there is a standard
locking scheme so that hypercalls updating the hash page table from
different guest threads don't corrupt the haah table state. We implement
this HVLOCK bit in out page table hypercalls. However, it is not necessary
in our case, since the hypercalls all run in the qemu environment under the
big qemu lock.
Therefore, this patch removes the locking code. This has the additional
advantage of freeing up a hash PTE bit which will be useful for migration
support.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
hw/spapr_hcall.c | 42 ++++--------------------------------------
1 file changed, 4 insertions(+), 38 deletions(-)
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 2df94d1..826ca67 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -39,22 +39,6 @@
#define HPTE_V_1TB_SEG 0x4000000000000000ULL
#define HPTE_V_VRMA_MASK 0x4001ffffff000000ULL
-#define HPTE_V_HVLOCK 0x40ULL
-
-static inline int lock_hpte(void *hpte, target_ulong bits)
-{
- uint64_t pteh;
-
- pteh = ldq_p(hpte);
-
- /* We're protected by qemu's global lock here */
- if (pteh & bits) {
- return 0;
- }
- stq_p(hpte, pteh | HPTE_V_HVLOCK);
- return 1;
-}
-
static target_ulong compute_tlbie_rb(target_ulong v, target_ulong r,
target_ulong pte_index)
{
@@ -151,8 +135,7 @@ static target_ulong h_enter(CPUPPCState *env, sPAPREnvironment *spapr,
if (i == 8) {
return H_PTEG_FULL;
}
- if (((ldq_p(hpte) & HPTE_V_VALID) == 0) &&
- lock_hpte(hpte, HPTE_V_HVLOCK | HPTE_V_VALID)) {
+ if ((ldq_p(hpte) & HPTE_V_VALID) == 0) {
break;
}
hpte += HASH_PTE_SIZE_64;
@@ -160,7 +143,7 @@ static target_ulong h_enter(CPUPPCState *env, sPAPREnvironment *spapr,
} else {
i = 0;
hpte = env->external_htab + (pte_index * HASH_PTE_SIZE_64);
- if (!lock_hpte(hpte, HPTE_V_HVLOCK | HPTE_V_VALID)) {
+ if (ldq_p(hpte) & HPTE_V_VALID) {
return H_PTEG_FULL;
}
}
@@ -168,7 +151,6 @@ static target_ulong h_enter(CPUPPCState *env, sPAPREnvironment *spapr,
/* eieio(); FIXME: need some sort of barrier for smp? */
stq_p(hpte, pteh);
- assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
args[0] = pte_index + i;
return H_SUCCESS;
}
@@ -193,11 +175,6 @@ static target_ulong remove_hpte(CPUPPCState *env, target_ulong ptex,
}
hpte = env->external_htab + (ptex * HASH_PTE_SIZE_64);
- while (!lock_hpte(hpte, HPTE_V_HVLOCK)) {
- /* We have no real concurrency in qemu soft-emulation, so we
- * will never actually have a contested lock */
- assert(0);
- }
v = ldq_p(hpte);
r = ldq_p(hpte + (HASH_PTE_SIZE_64/2));
@@ -205,16 +182,13 @@ static target_ulong remove_hpte(CPUPPCState *env, target_ulong ptex,
if ((v & HPTE_V_VALID) == 0 ||
((flags & H_AVPN) && (v & ~0x7fULL) != avpn) ||
((flags & H_ANDCOND) && (v & avpn) != 0)) {
- stq_p(hpte, v & ~HPTE_V_HVLOCK);
- assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
return REMOVE_NOT_FOUND;
}
- *vp = v & ~HPTE_V_HVLOCK;
+ *vp = v;
*rp = r;
stq_p(hpte, 0);
rb = compute_tlbie_rb(v, r, ptex);
ppc_tlb_invalidate_one(env, rb);
- assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
return REMOVE_SUCCESS;
}
@@ -324,19 +298,12 @@ static target_ulong h_protect(CPUPPCState *env, sPAPREnvironment *spapr,
}
hpte = env->external_htab + (pte_index * HASH_PTE_SIZE_64);
- while (!lock_hpte(hpte, HPTE_V_HVLOCK)) {
- /* We have no real concurrency in qemu soft-emulation, so we
- * will never actually have a contested lock */
- assert(0);
- }
v = ldq_p(hpte);
r = ldq_p(hpte + (HASH_PTE_SIZE_64/2));
if ((v & HPTE_V_VALID) == 0 ||
((flags & H_AVPN) && (v & ~0x7fULL) != avpn)) {
- stq_p(hpte, v & ~HPTE_V_HVLOCK);
- assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
return H_NOT_FOUND;
}
@@ -350,8 +317,7 @@ static target_ulong h_protect(CPUPPCState *env, sPAPREnvironment *spapr,
ppc_tlb_invalidate_one(env, rb);
stq_p(hpte + (HASH_PTE_SIZE_64/2), r);
/* Don't need a memory barrier, due to qemu's global lock */
- stq_p(hpte, v & ~HPTE_V_HVLOCK);
- assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
+ stq_p(hpte, v);
return H_SUCCESS;
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] pseries: Remove unnecessary locking from PAPR hash table hcalls
2012-09-21 3:42 ` [Qemu-devel] [PATCH 4/4] pseries: Remove unnecessary locking from PAPR hash table hcalls David Gibson
@ 2012-09-24 13:57 ` Alexander Graf
0 siblings, 0 replies; 6+ messages in thread
From: Alexander Graf @ 2012-09-24 13:57 UTC (permalink / raw)
To: David Gibson; +Cc: qemu-ppc, qemu-devel
On 21.09.2012, at 05:42, David Gibson wrote:
> In the paravirtualized environment provided by PAPR, there is a standard
> locking scheme so that hypercalls updating the hash page table from
> different guest threads don't corrupt the haah table state. We implement
> this HVLOCK bit in out page table hypercalls. However, it is not necessary
> in our case, since the hypercalls all run in the qemu environment under the
> big qemu lock.
>
> Therefore, this patch removes the locking code. This has the additional
> advantage of freeing up a hash PTE bit which will be useful for migration
> support.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Thanks, applied to ppc-next.
If we ever get around to implement multi-threaded TCG, we know where to find the code :)
Alex
^ permalink raw reply [flat|nested] 6+ messages in thread