From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59905)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X5BbF-0000hy-76
	for qemu-devel@nongnu.org; Thu, 10 Jul 2014 06:29:18 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1X5Bb8-00037w-Qt
	for qemu-devel@nongnu.org; Thu, 10 Jul 2014 06:29:13 -0400
Message-ID: <53BE6AF1.1020905@suse.de>
Date: Thu, 10 Jul 2014 12:29:05 +0200
From: Alexander Graf
MIME-Version: 1.0
References: <53BCF352.7070005@redhat.com> <1404914381-9953-1-git-send-email-aik@ozlabs.ru>
In-Reply-To: <1404914381-9953-1-git-send-email-aik@ozlabs.ru>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
To: Alexey Kardashevskiy, Paolo Bonzini
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org

On 09.07.14 15:59, Alexey Kardashevskiy wrote:
> On 07/09/2014 05:46 PM, Paolo Bonzini wrote:
>> On 09/07/2014 07:57, Alexey Kardashevskiy wrote:
>>> 0b183fc87 "memory: move mem_path handling to
>>> memory_region_allocate_system_memory" disabled -mem-path use for all
>>> machines that do not use memory_region_allocate_system_memory() to
>>> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
>>> support was disabled for it.
>>>
>>> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
>>> memory_region_allocate_system_memory() to get huge pages back.
>>>
>>> Cc: Paolo Bonzini
>>> Cc: Hu Tao
>>> Signed-off-by: Alexey Kardashevskiy
>>> ---
>>>  hw/ppc/spapr.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index a23c0f0..8fa9f7e 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
>>>          ram_addr_t nonrma_base = rma_alloc_size;
>>>          ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>>>
>>> -        memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>>> -        vmstate_register_ram_global(ram);
>>> +        memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>>> +                                             nonrma_size);
>> The reason why I didn't do this in the simple way is that depending on the
>> value of nonrma_base you may get smaller hugepages than you wanted.
>>
>> For example, if the hugepage size is 1G but nonrma_base is 32M, you will
>> not be able to get a page size larger than 32M.
>>
>> Depending on the value of nonrma_base, it may be better to allocate the
>> whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part of it.
>>
>> I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and in
>> practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c). Considering
>> that Linux overcommits so the memory isn't lost in the non-hugepage case, I
>> think it's better to just waste the 128M of address space.
>>
>> Paolo
>>
>>>          memory_region_add_subregion(sysmem, nonrma_base, ram);
>>>      }
> Did you mean something like below? If so, I have to change the MR tree and
> place RMA under RAM, I guess.
> I'll give it a try tomorrow on bare PPC970.
>
>
> ---
>  hw/ppc/spapr.c       | 19 ++++++++++++-------
>  target-ppc/kvm.c     |  9 +--------
>  target-ppc/kvm_ppc.h |  2 +-
>  3 files changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a23c0f0..47ae6c1 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1223,6 +1223,7 @@ static void ppc_spapr_init(MachineState *machine)
>      int i;
>      MemoryRegion *sysmem = get_system_memory();
>      MemoryRegion *ram = g_new(MemoryRegion, 1);
> +    MemoryRegion *rma_region;
>      hwaddr rma_alloc_size;
>      hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem : ram_size;
>      uint32_t initrd_base = 0;
> @@ -1230,6 +1231,7 @@ static void ppc_spapr_init(MachineState *machine)
>      long load_limit, rtas_limit, fw_size;
>      bool kernel_le = false;
>      char *filename;
> +    void *rma = NULL;
>
>      msi_supported = true;
>
> @@ -1239,7 +1241,7 @@ static void ppc_spapr_init(MachineState *machine)
>      cpu_ppc_hypercall = emulate_spapr_hypercall;
>
>      /* Allocate RMA if necessary */
> -    rma_alloc_size = kvmppc_alloc_rma("ppc_spapr.rma", sysmem);
> +    rma_alloc_size = kvmppc_alloc_rma(&rma);
>
>      if (rma_alloc_size == -1) {
>          hw_error("qemu: Unable to create RMA\n");
> @@ -1333,13 +1335,16 @@ static void ppc_spapr_init(MachineState *machine)
>
>      /* allocate RAM */
>      spapr->ram_limit = ram_size;
> -    if (spapr->ram_limit > rma_alloc_size) {
> -        ram_addr_t nonrma_base = rma_alloc_size;
> -        ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
> +    memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
> +                                         spapr->ram_limit);
> +    memory_region_add_subregion(sysmem, 0, ram);
>
> -        memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
> -        vmstate_register_ram_global(ram);
> -        memory_region_add_subregion(sysmem, nonrma_base, ram);
> +    if (rma_alloc_size && rma) {
> +        rma_region = g_new(MemoryRegion, 1);
> +        memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
> +                                   rma_alloc_size, rma);
> +        vmstate_register_ram_global(rma_region);
> +        memory_region_add_subregion(sysmem, 0, rma_region);
>      }
>
>      filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 995706a..9ca14d2 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1582,13 +1582,11 @@ int kvmppc_smt_threads(void)
>  }
>
>  #ifdef TARGET_PPC64
> -off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
> +off_t kvmppc_alloc_rma(void **rma)
>  {
> -    void *rma;
>      off_t size;
>      int fd;
>      struct kvm_allocate_rma ret;
> -    MemoryRegion *rma_region;
>
>      /* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
>       * if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
> @@ -1617,11 +1615,6 @@ off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
>          return -1;
>      };
>
> -    rma_region = g_new(MemoryRegion, 1);
> -    memory_region_init_ram_ptr(rma_region, NULL, name, size, rma);
> -    vmstate_register_ram_global(rma_region);
> -    memory_region_add_subregion(sysmem, 0, rma_region);
> -

I don't see where you set *rma here.

Apart from that, while I think that with hugetlbfs we might actually waste
a few MB of RAM, I don't think it's a real problem for systems that
require an RMA. So semantically the change works well for me.

Please verify it works though :).


Alex
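
PS: To illustrate the *rma remark, here is a minimal standalone sketch of the
out-parameter pattern I would expect. It is not the actual QEMU code: the name
alloc_rma_sketch(), its size argument and the use of malloc() are assumptions
made only so the example compiles on its own. The real kvmppc_alloc_rma() gets
its pointer differently. The only point is that the callee has to store the
pointer through the argument before returning the size, otherwise the caller's
"rma_alloc_size && rma" check never sees a mapping.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for kvmppc_alloc_rma(void **rma).
 * Returns the region size on success, 0 if nothing had to be allocated,
 * and -1 on failure. The size parameter is an assumption of this sketch.
 */
static off_t alloc_rma_sketch(void **rma, size_t size)
{
    void *p;

    if (size == 0) {
        return 0;           /* nothing to allocate; *rma is left untouched */
    }

    p = malloc(size);       /* placeholder for however the region is obtained */
    if (!p) {
        return -1;
    }

    *rma = p;               /* hand the pointer back to the caller */
    return (off_t)size;
}

/* Mirrors the caller-side check from the patch: use the region only if
 * both the size and the pointer were filled in. Returns 1 if usable. */
static int rma_is_usable(off_t rma_alloc_size, const void *rma)
{
    return rma_alloc_size > 0 && rma != NULL;
}
```

With that in place a caller would do "void *rma = NULL; size =
alloc_rma_sketch(&rma, 4096);" and then test rma_is_usable(size, rma) before
registering the region.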