All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alexander Graf <agraf@suse.de>, Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH v2] spapr: Enable use of huge pages
Date: Thu, 10 Jul 2014 20:45:28 +1000	[thread overview]
Message-ID: <53BE6EC8.8020103@ozlabs.ru> (raw)
In-Reply-To: <53BE6AF1.1020905@suse.de>

On 07/10/2014 08:29 PM, Alexander Graf wrote:
> 
> On 09.07.14 15:59, Alexey Kardashevskiy wrote:
>> On 07/09/2014 05:46 PM, Paolo Bonzini wrote:> Il 09/07/2014 07:57, Alexey
>> Kardashevskiy ha scritto:
>>>> 0b183fc87 "memory: move mem_path handling to
>>>> memory_region_allocate_system_memory" disabled -mempath use for all
>>>> machines that do not use memory_region_allocate_system_memory() to
>>>> register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
>>>> support was disabled for it.
>>>>
>>>> This replaces memory_region_init_ram()+vmstate_register_ram_global() with
>>>> memory_region_allocate_system_memory() to get huge pages back.
>>>>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Hu Tao <hutao@cn.fujitsu.com>
>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>> ---
>>>>   hw/ppc/spapr.c | 4 ++--
>>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index a23c0f0..8fa9f7e 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -1337,8 +1337,8 @@ static void ppc_spapr_init(MachineState *machine)
>>>>           ram_addr_t nonrma_base = rma_alloc_size;
>>>>           ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>>>>
>>>> -        memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>>>> -        vmstate_register_ram_global(ram);
>>>> +        memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>>>> +                                             nonrma_size);
>>> The reason why I didn't do this in the simple way is that depending on the
>>> value of nonrma_base you may get smaller hugepages than you wanted.
>>>
>>> For example, if the hugepage size is 1G but nonrma_base is 32M, you will
>>> not be able to get a page size larger than 32M.
>>>
>>> Depending on the value of nonrma_base, it may be better to allocate the
>>> whole spapr->ram_limit to ppc_spapr.ram, and just ignore the first part
>>> of it.
>>>
>>> I see in target-ppc/kvm.c that rma_alloc_size is capped to 256M, and  in
>>> practice it is 128M (arch/powerpc/kvm/book3s_hv_builtin.c.  Considering
>>> that Linux overcommits so the memory isn't lost in the non-hugepage case, I
>>> think it's better to just waste the 128M of address space.
>>>
>>> Paolo
>>>
>>>>           memory_region_add_subregion(sysmem, nonrma_base, ram);
>>>>       }
>> Did you mean something like below? If so, I have to change MR tree and
>> place RMA under RAM, I guess.
>> I'll try to give it a try tomorrow on bare PPC970.
>>
>>
>>
>>
>> ---
>>   hw/ppc/spapr.c       | 19 ++++++++++++-------
>>   target-ppc/kvm.c     |  9 +--------
>>   target-ppc/kvm_ppc.h |  2 +-
>>   3 files changed, 14 insertions(+), 16 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index a23c0f0..47ae6c1 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1223,6 +1223,7 @@ static void ppc_spapr_init(MachineState *machine)
>>       int i;
>>       MemoryRegion *sysmem = get_system_memory();
>>       MemoryRegion *ram = g_new(MemoryRegion, 1);
>> +    MemoryRegion *rma_region;
>>       hwaddr rma_alloc_size;
>>       hwaddr node0_size = (nb_numa_nodes > 1) ? numa_info[0].node_mem :
>> ram_size;
>>       uint32_t initrd_base = 0;
>> @@ -1230,6 +1231,7 @@ static void ppc_spapr_init(MachineState *machine)
>>       long load_limit, rtas_limit, fw_size;
>>       bool kernel_le = false;
>>       char *filename;
>> +    void *rma = NULL;
>>         msi_supported = true;
>>   @@ -1239,7 +1241,7 @@ static void ppc_spapr_init(MachineState *machine)
>>       cpu_ppc_hypercall = emulate_spapr_hypercall;
>>         /* Allocate RMA if necessary */
>> -    rma_alloc_size = kvmppc_alloc_rma("ppc_spapr.rma", sysmem);
>> +    rma_alloc_size = kvmppc_alloc_rma(&rma);
>>         if (rma_alloc_size == -1) {
>>           hw_error("qemu: Unable to create RMA\n");
>> @@ -1333,13 +1335,16 @@ static void ppc_spapr_init(MachineState *machine)
>>         /* allocate RAM */
>>       spapr->ram_limit = ram_size;
>> -    if (spapr->ram_limit > rma_alloc_size) {
>> -        ram_addr_t nonrma_base = rma_alloc_size;
>> -        ram_addr_t nonrma_size = spapr->ram_limit - rma_alloc_size;
>> +    memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>> +                                         spapr->ram_limit);
>> +    memory_region_add_subregion(sysmem, 0, ram);
>>   -        memory_region_init_ram(ram, NULL, "ppc_spapr.ram", nonrma_size);
>> -        vmstate_register_ram_global(ram);
>> -        memory_region_add_subregion(sysmem, nonrma_base, ram);
>> +    if (rma_alloc_size && rma) {
>> +        rma_region = g_new(MemoryRegion, 1);
>> +        memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
>> +                                   rma_alloc_size, rma);
>> +        vmstate_register_ram_global(rma_region);
>> +        memory_region_add_subregion(sysmem, 0, rma_region);
>>       }
>>         filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 995706a..9ca14d2 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1582,13 +1582,11 @@ int kvmppc_smt_threads(void)
>>   }
>>     #ifdef TARGET_PPC64
>> -off_t kvmppc_alloc_rma(const char *name, MemoryRegion *sysmem)
>> +off_t kvmppc_alloc_rma(void **rma)
>>   {
>> -    void *rma;
>>       off_t size;
>>       int fd;
>>       struct kvm_allocate_rma ret;
>> -    MemoryRegion *rma_region;
>>         /* If cap_ppc_rma == 0, contiguous RMA allocation is not supported
>>        * if cap_ppc_rma == 1, contiguous RMA allocation is supported, but
>> @@ -1617,11 +1615,6 @@ off_t kvmppc_alloc_rma(const char *name,
>> MemoryRegion *sysmem)
>>           return -1;
>>       };
>>   -    rma_region = g_new(MemoryRegion, 1);
>> -    memory_region_init_ram_ptr(rma_region, NULL, name, size, rma);
>> -    vmstate_register_ram_global(rma_region);
>> -    memory_region_add_subregion(sysmem, 0, rma_region);
>> -
> 
> I don't see where you set *rma here.

That patch was an RFC, this is why :)

> 
> Apart from that while I think that with hugetlbfs we might actually waste a
> few MB of RAM, I don't think it's a real problem for systems that require
> an RMA. So semantically the change works well for me. Please verify it
> works though :).

I managed to verify that it still works on PPC970 and will report the
patch(es) soon.



-- 
Alexey

      reply	other threads:[~2014-07-10 10:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-09  5:57 [Qemu-devel] [PATCH] spapr: Enable use of huge pages Alexey Kardashevskiy
2014-07-09  7:38 ` Hu Tao
2014-07-09  7:46 ` Paolo Bonzini
2014-07-09 13:59   ` [Qemu-devel] [RFC PATCH v2] " Alexey Kardashevskiy
2014-07-09 14:02     ` Paolo Bonzini
2014-07-10 10:29     ` Alexander Graf
2014-07-10 10:45       ` Alexey Kardashevskiy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53BE6EC8.8020103@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=agraf@suse.de \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.