* [Qemu-devel] [PATCH] spapr: Enable use of huge pages
@ 2014-07-09 5:57 Alexey Kardashevskiy
2014-07-09 7:38 ` Hu Tao
2014-07-09 7:46 ` Paolo Bonzini
0 siblings, 2 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2014-07-09 5:57 UTC (permalink / raw)
To: qemu-devel
Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-ppc, Alexander Graf,
Hu Tao
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.
This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
hw/ppc/spapr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a23c0f0..8fa9f7e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
ram_addr_t nonrma_base = rma_alloc_size;
ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
- memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
- vmstate_register_ram_global(ram);
+ memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
+ nonrma_size);
memory_region_add_subregion(sysmem, nonrma_base, ram);
}
--
2.0.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] spapr: Enable use of huge pages
2014-07-09 5:57 [Qemu-devel] [PATCH] spapr: Enable use of huge pages Alexey Kardashevskiy
@ 2014-07-09 7:38 ` Hu Tao
2014-07-09 7:46 ` Paolo Bonzini
1 sibling, 0 replies; 7+ messages in thread
From: Hu Tao @ 2014-07-09 7:38 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-ppc, qemu-devel, Alexander Graf
On Wed, Jul 09, 2014 at 03:57:52PM +1000, Alexey Kardashevskiy wrote:
> 0b183fc87 "memory: move mem_path handling to
> memory_region_allocate_system_memory" disabled -mempath use for all
> machines that do not use memory_region_allocate_system_memory() to
> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
> support was disabled for it.
>
> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
> memory_region_allocate_system_memory() to get huge pages back.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Hu Tao <hutao@cn.fujitsu.com>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> hw/ppc/spapr.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a23c0f0..8fa9f7e 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
> ram_addr_t nonrma_base = rma_alloc_size;
> ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>
> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
> - vmstate_register_ram_global(ram);
> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
> + nonrma_size);
> memory_region_add_subregion(sysmem, nonrma_base, ram);
> }
>
> --
> 2.0.0
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
I had a patch that did this change for all boards:
http://lists.nongnu.org/archive/html/qemu-devel/2014-03/msg04982.html.
but incremental changes are OK to me.
Regards,
Hu
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] spapr: Enable use of huge pages
2014-07-09 5:57 [Qemu-devel] [PATCH] spapr: Enable use of huge pages Alexey Kardashevskiy
2014-07-09 7:38 ` Hu Tao
@ 2014-07-09 7:46 ` Paolo Bonzini
2014-07-09 13:59 ` [Qemu-devel] [RFC PATCH v2] " Alexey Kardashevskiy
1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2014-07-09 7:46 UTC (permalink / raw)
To: Alexey Kardashevskiy, qemu-devel; +Cc: Hu Tao, qemu-ppc, Alexander Graf
Il 09/07/2014 07:57, Alexey Kardashevskiy ha scritto:
> 0b183fc87 "memory: move mem_path handling to
> memory_region_allocate_system_memory" disabled -mempath use for all
> machines that do not use memory_region_allocate_system_memory() to
> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
> support was disabled for it.
>
> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
> memory_region_allocate_system_memory() to get huge pages back.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Hu Tao <hutao@cn.fujitsu.com>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> hw/ppc/spapr.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a23c0f0..8fa9f7e 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
> ram_addr_t nonrma_base = rma_alloc_size;
> ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>
> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
> - vmstate_register_ram_global(ram);
> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
> + nonrma_size);
The reason why I didn't do this in the simple way is that depending on
the value of nonrma_base you may get smaller hugepages than you wanted.
For example, if the hugepage size is 1G but nonrma_base is 32M, you will
not be able to get a page size larger than 32M.
Depending on the value of nonrma_base, it may be better to allocate the
whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part
of it.
I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and in
practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c. Considering
that Linux overcommits so the memory isn't lost in the non-hugepage
case, I think it's better to just waste the 128M of address space.
Paolo
> memory_region_add_subregion(sysmem, nonrma_base, ram);
> }
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
2014-07-09 7:46 ` Paolo Bonzini
@ 2014-07-09 13:59 ` Alexey Kardashevskiy
2014-07-09 14:02 ` Paolo Bonzini
2014-07-10 10:29 ` Alexander Graf
0 siblings, 2 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2014-07-09 13:59 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Alexey Kardashevskiy, qemu-ppc, qemu-devel, Alexander Graf
On 07/09/2014 05:46 PM, Paolo Bonzini wrote:> Il 09/07/2014 07:57, Alexey Kardashevskiy ha scritto:
>> 0b183fc87 "memory: move mem_path handling to
>> memory_region_allocate_system_memory" disabled -mempath use for all
>> machines that do not use memory_region_allocate_system_memory() to
>> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
>> support was disabled for it.
>>
>> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
>> memory_region_allocate_system_memory() to get huge pages back.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Hu Tao <hutao@cn.fujitsu.com>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> hw/ppc/spapr.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index a23c0f0..8fa9f7e 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
>> ram_addr_t nonrma_base = rma_alloc_size;
>> ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>>
>> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>> - vmstate_register_ram_global(ram);
>> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>> + nonrma_size);
>
> The reason why I didn't do this in the simple way is that depending on the
> value of nonrma_base you may get smaller hugepages than you wanted.
>
> For example, if the hugepage size is 1G but nonrma_base is 32M, you will
> not be able to get a page size larger than 32M.
>
> Depending on the value of nonrma_base, it may be better to allocate the
> whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part of it.
>
> I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and in
> practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c. Considering
> that Linux overcommits so the memory isn't lost in the non-hugepage case, I
> think it's better to just waste the 128M of address space.
>
> Paolo
>
>> memory_region_add_subregion(sysmem, nonrma_base, ram);
>> }
Did you mean something like below? If so, I have to change MR tree and
place RMA under RAM, I guess.
I'll try to give it a try tomorrow on bare PPC970.
---
hw/ppc/spapr.c | 19 ++++++++++++-------
target-ppc/kvm.c | 9 +--------
target-ppc/kvm_ppc.h | 2 +-
3 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a23c0f0..47ae6c1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1223,6 +1223,7 @@ static void ppc_spapr_init(MachineState *machine)
int i;
MemoryRegion *sysmem = get_system_memory();
MemoryRegion *ram = g_new(MemoryRegion, 1);
+ MemoryRegion *rma_region;
hwaddr rma_alloc_size;
hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem : ram_size;
uint32_t initrd_base = 0;
@@ -1230,6 +1231,7 @@ static void ppc_spapr_init(MachineState *machine)
long load_limit, rtas_limit, fw_size;
bool kernel_le = false;
char *filename;
+ void *rma = NULL;
msi_supported = true;
@@ -1239,7 +1241,7 @@ static void ppc_spapr_init(MachineState *machine)
cpu_ppc_hypercall = emulate_spapr_hypercall;
/* Allocate RMA if necessary */
- rma_alloc_size = kvmppc_alloc_rma("ppc_spapr.rma", sysmem);
+ rma_alloc_size = kvmppc_alloc_rma(&rma);
if (rma_alloc_size == -1) {
hw_error("qemu: Unable to create RMA\n");
@@ -1333,13 +1335,16 @@ static void ppc_spapr_init(MachineState *machine)
/* allocate RAM */
spapr->ram_limit = ram_size;
- if (spapr->ram_limit > rma_alloc_size) {
- ram_addr_t nonrma_base = rma_alloc_size;
- ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
+ memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
+ spapr->ram_limit);
+ memory_region_add_subregion(sysmem, 0, ram);
- memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
- vmstate_register_ram_global(ram);
- memory_region_add_subregion(sysmem, nonrma_base, ram);
+ if (rma_alloc_size && rma) {
+ rma_region = g_new(MemoryRegion, 1);
+ memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
+ rma_alloc_size, rma);
+ vmstate_register_ram_global(rma_region);
+ memory_region_add_subregion(sysmem, 0, rma_region);
}
filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 995706a..9ca14d2 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1582,13 +1582,11 @@ int kvmppc_smt_threads(void)
}
#ifdef TARGET_PPC64
-off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
+off_t kvmppc_alloc_rma(void **rma)
{
- void *rma;
off_t size;
int fd;
struct kvm_allocate_rma ret;
- MemoryRegion *rma_region;
/* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
* if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
@@ -1617,11 +1615,6 @@ off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
return -1;
};
- rma_region = g_new(MemoryRegion, 1);
- memory_region_init_ram_ptr(rma_region, NULL, name, size, rma);
- vmstate_register_ram_global(rma_region);
- memory_region_add_subregion(sysmem, 0, rma_region);
-
return size;
}
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 33c2171..bc5d2e8 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -31,7 +31,7 @@ int kvmppc_or_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits);
int kvmppc_set_tcr(PowerPCCPU *cpu);
int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
#ifndef CONFIG_USER_ONLY
-off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem);
+off_t kvmppc_alloc_rma(void **rma);
bool kvmppc_spapr_use_multitce(void);
void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size,
uint64_t bus_offset, uint32_t page_shift,
--
2.0.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
2014-07-09 13:59 ` [Qemu-devel] [RFC PATCH v2] " Alexey Kardashevskiy
@ 2014-07-09 14:02 ` Paolo Bonzini
2014-07-10 10:29 ` Alexander Graf
1 sibling, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2014-07-09 14:02 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: qemu-ppc, qemu-devel, Alexander Graf
Il 09/07/2014 15:59, Alexey Kardashevskiy ha scritto:
> Did you mean something like below? If so, I have to change MR tree and
> place RMA under RAM, I guess.
You could also use priorities, but the patch below looks nicer.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
2014-07-09 13:59 ` [Qemu-devel] [RFC PATCH v2] " Alexey Kardashevskiy
2014-07-09 14:02 ` Paolo Bonzini
@ 2014-07-10 10:29 ` Alexander Graf
2014-07-10 10:45 ` Alexey Kardashevskiy
1 sibling, 1 reply; 7+ messages in thread
From: Alexander Graf @ 2014-07-10 10:29 UTC (permalink / raw)
To: Alexey Kardashevskiy, Paolo Bonzini; +Cc: qemu-ppc, qemu-devel
On 09.07.14 15:59, Alexey Kardashevskiy wrote:
> On 07/09/2014 05:46 PM, Paolo Bonzini wrote:> Il 09/07/2014 07:57, Alexey Kardashevskiy ha scritto:
>>> 0b183fc87 "memory: move mem_path handling to
>>> memory_region_allocate_system_memory" disabled -mempath use for all
>>> machines that do not use memory_region_allocate_system_memory() to
>>> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
>>> support was disabled for it.
>>>
>>> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
>>> memory_region_allocate_system_memory() to get huge pages back.
>>>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: Hu Tao <hutao@cn.fujitsu.com>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>> hw/ppc/spapr.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index a23c0f0..8fa9f7e 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
>>> ram_addr_t nonrma_base = rma_alloc_size;
>>> ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>>>
>>> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>>> - vmstate_register_ram_global(ram);
>>> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>>> + nonrma_size);
>> The reason why I didn't do this in the simple way is that depending on the
>> value of nonrma_base you may get smaller hugepages than you wanted.
>>
>> For example, if the hugepage size is 1G but nonrma_base is 32M, you will
>> not be able to get a page size larger than 32M.
>>
>> Depending on the value of nonrma_base, it may be better to allocate the
>> whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part of it.
>>
>> I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and in
>> practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c. Considering
>> that Linux overcommits so the memory isn't lost in the non-hugepage case, I
>> think it's better to just waste the 128M of address space.
>>
>> Paolo
>>
>>> memory_region_add_subregion(sysmem, nonrma_base, ram);
>>> }
> Did you mean something like below? If so, I have to change MR tree and
> place RMA under RAM, I guess.
> I'll try to give it a try tomorrow on bare PPC970.
>
>
>
>
> ---
> hw/ppc/spapr.c | 19 ++++++++++++-------
> target-ppc/kvm.c | 9 +--------
> target-ppc/kvm_ppc.h | 2 +-
> 3 files changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a23c0f0..47ae6c1 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1223,6 +1223,7 @@ static void ppc_spapr_init(MachineState *machine)
> int i;
> MemoryRegion *sysmem = get_system_memory();
> MemoryRegion *ram = g_new(MemoryRegion, 1);
> + MemoryRegion *rma_region;
> hwaddr rma_alloc_size;
> hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem : ram_size;
> uint32_t initrd_base = 0;
> @@ -1230,6 +1231,7 @@ static void ppc_spapr_init(MachineState *machine)
> long load_limit, rtas_limit, fw_size;
> bool kernel_le = false;
> char *filename;
> + void *rma = NULL;
>
> msi_supported = true;
>
> @@ -1239,7 +1241,7 @@ static void ppc_spapr_init(MachineState *machine)
> cpu_ppc_hypercall = emulate_spapr_hypercall;
>
> /* Allocate RMA if necessary */
> - rma_alloc_size = kvmppc_alloc_rma("ppc_spapr.rma", sysmem);
> + rma_alloc_size = kvmppc_alloc_rma(&rma);
>
> if (rma_alloc_size == -1) {
> hw_error("qemu: Unable to create RMA\n");
> @@ -1333,13 +1335,16 @@ static void ppc_spapr_init(MachineState *machine)
>
> /* allocate RAM */
> spapr->ram_limit = ram_size;
> - if (spapr->ram_limit > rma_alloc_size) {
> - ram_addr_t nonrma_base = rma_alloc_size;
> - ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
> + spapr->ram_limit);
> + memory_region_add_subregion(sysmem, 0, ram);
>
> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
> - vmstate_register_ram_global(ram);
> - memory_region_add_subregion(sysmem, nonrma_base, ram);
> + if (rma_alloc_size && rma) {
> + rma_region = g_new(MemoryRegion, 1);
> + memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
> + rma_alloc_size, rma);
> + vmstate_register_ram_global(rma_region);
> + memory_region_add_subregion(sysmem, 0, rma_region);
> }
>
> filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 995706a..9ca14d2 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1582,13 +1582,11 @@ int kvmppc_smt_threads(void)
> }
>
> #ifdef TARGET_PPC64
> -off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
> +off_t kvmppc_alloc_rma(void **rma)
> {
> - void *rma;
> off_t size;
> int fd;
> struct kvm_allocate_rma ret;
> - MemoryRegion *rma_region;
>
> /* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
> * if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
> @@ -1617,11 +1615,6 @@ off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
> return -1;
> };
>
> - rma_region = g_new(MemoryRegion, 1);
> - memory_region_init_ram_ptr(rma_region, NULL, name, size, rma);
> - vmstate_register_ram_global(rma_region);
> - memory_region_add_subregion(sysmem, 0, rma_region);
> -
I don't see where you set *rma here.
Apart from that while I think that with hugetlbfs we might actually
waste a few MB of RAM, I don't think it's a real problem for systems
that require an RMA. So semantically the change works well for me.
Please verify it works though :).
Alex
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
2014-07-10 10:29 ` Alexander Graf
@ 2014-07-10 10:45 ` Alexey Kardashevskiy
0 siblings, 0 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2014-07-10 10:45 UTC (permalink / raw)
To: Alexander Graf, Paolo Bonzini; +Cc: qemu-ppc, qemu-devel
On 07/10/2014 08:29 PM, Alexander Graf wrote:
>
> On 09.07.14 15:59, Alexey Kardashevskiy wrote:
>> On 07/09/2014 05:46 PM, Paolo Bonzini wrote:> Il 09/07/2014 07:57, Alexey
>> Kardashevskiy ha scritto:
>>>> 0b183fc87 "memory: move mem_path handling to
>>>> memory_region_allocate_system_memory" disabled -mempath use for all
>>>> machines that do not use memory_region_allocate_system_memory() to
>>>> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
>>>> support was disabled for it.
>>>>
>>>> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
>>>> memory_region_allocate_system_memory() to get huge pages back.
>>>>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Hu Tao <hutao@cn.fujitsu.com>
>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>> ---
>>>> hw/ppc/spapr.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index a23c0f0..8fa9f7e 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
>>>> ram_addr_t nonrma_base = rma_alloc_size;
>>>> ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>>>>
>>>> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>>>> - vmstate_register_ram_global(ram);
>>>> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>>>> + nonrma_size);
>>> The reason why I didn't do this in the simple way is that depending on the
>>> value of nonrma_base you may get smaller hugepages than you wanted.
>>>
>>> For example, if the hugepage size is 1G but nonrma_base is 32M, you will
>>> not be able to get a page size larger than 32M.
>>>
>>> Depending on the value of nonrma_base, it may be better to allocate the
>>> whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part
>>> of it.
>>>
>>> I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and in
>>> practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c. Considering
>>> that Linux overcommits so the memory isn't lost in the non-hugepage case, I
>>> think it's better to just waste the 128M of address space.
>>>
>>> Paolo
>>>
>>>> memory_region_add_subregion(sysmem, nonrma_base, ram);
>>>> }
>> Did you mean something like below? If so, I have to change MR tree and
>> place RMA under RAM, I guess.
>> I'll try to give it a try tomorrow on bare PPC970.
>>
>>
>>
>>
>> ---
>> hw/ppc/spapr.c | 19 ++++++++++++-------
>> target-ppc/kvm.c | 9 +--------
>> target-ppc/kvm_ppc.h | 2 +-
>> 3 files changed, 14 insertions(+), 16 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index a23c0f0..47ae6c1 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1223,6 +1223,7 @@ static void ppc_spapr_init(MachineState *machine)
>> int i;
>> MemoryRegion *sysmem = get_system_memory();
>> MemoryRegion *ram = g_new(MemoryRegion, 1);
>> + MemoryRegion *rma_region;
>> hwaddr rma_alloc_size;
>> hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem :
>> ram_size;
>> uint32_t initrd_base = 0;
>> @@ -1230,6 +1231,7 @@ static void ppc_spapr_init(MachineState *machine)
>> long load_limit, rtas_limit, fw_size;
>> bool kernel_le = false;
>> char *filename;
>> + void *rma = NULL;
>> msi_supported = true;
>> @@ -1239,7 +1241,7 @@ static void ppc_spapr_init(MachineState *machine)
>> cpu_ppc_hypercall = emulate_spapr_hypercall;
>> /* Allocate RMA if necessary */
>> - rma_alloc_size = kvmppc_alloc_rma("ppc_spapr.rma", sysmem);
>> + rma_alloc_size = kvmppc_alloc_rma(&rma);
>> if (rma_alloc_size == -1) {
>> hw_error("qemu: Unable to create RMA\n");
>> @@ -1333,13 +1335,16 @@ static void ppc_spapr_init(MachineState *machine)
>> /* allocate RAM */
>> spapr->ram_limit = ram_size;
>> - if (spapr->ram_limit > rma_alloc_size) {
>> - ram_addr_t nonrma_base = rma_alloc_size;
>> - ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>> + memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>> + spapr->ram_limit);
>> + memory_region_add_subregion(sysmem, 0, ram);
>> - memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>> - vmstate_register_ram_global(ram);
>> - memory_region_add_subregion(sysmem, nonrma_base, ram);
>> + if (rma_alloc_size && rma) {
>> + rma_region = g_new(MemoryRegion, 1);
>> + memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
>> + rma_alloc_size, rma);
>> + vmstate_register_ram_global(rma_region);
>> + memory_region_add_subregion(sysmem, 0, rma_region);
>> }
>> filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 995706a..9ca14d2 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1582,13 +1582,11 @@ int kvmppc_smt_threads(void)
>> }
>> #ifdef TARGET_PPC64
>> -off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
>> +off_t kvmppc_alloc_rma(void **rma)
>> {
>> - void *rma;
>> off_t size;
>> int fd;
>> struct kvm_allocate_rma ret;
>> - MemoryRegion *rma_region;
>> /* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
>> * if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
>> @@ -1617,11 +1615,6 @@ off_t kvmppc_alloc_rma(const char *name,
>> MemoryRegion *sysmem)
>> return -1;
>> };
>> - rma_region = g_new(MemoryRegion, 1);
>> - memory_region_init_ram_ptr(rma_region, NULL, name, size, rma);
>> - vmstate_register_ram_global(rma_region);
>> - memory_region_add_subregion(sysmem, 0, rma_region);
>> -
>
> I don't see where you set *rma here.
That patch was an RFC, this is why :)
>
> Apart from that while I think that with hugetlbfs we might actually waste a
> few MB of RAM, I don't think it's a real problem for systems that require
> an RMA. So semantically the change works well for me. Please verify it
> works though :).
I managed to verify that it still works on PPC970 and will report the
patch(es) soon.
--
Alexey
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-07-10 10:45 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-09 5:57 [Qemu-devel] [PATCH] spapr: Enable use of huge pages Alexey Kardashevskiy
2014-07-09 7:38 ` Hu Tao
2014-07-09 7:46 ` Paolo Bonzini
2014-07-09 13:59 ` [Qemu-devel] [RFC PATCH v2] " Alexey Kardashevskiy
2014-07-09 14:02 ` Paolo Bonzini
2014-07-10 10:29 ` Alexander Graf
2014-07-10 10:45 ` Alexey Kardashevskiy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).