Re: [PATCH v3 1/1] oslib-posix: initialize backend memory objects in parallel

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Mark Kanda <mark.kanda@oracle.com>
To: David Hildenbrand <david@redhat.com>, qemu-devel@nongnu.org
Cc: pbonzini@redhat.com, berrange@redhat.com
Subject: Re: [PATCH v3 1/1] oslib-posix: initialize backend memory objects in parallel
Date: Wed, 31 Jan 2024 08:27:24 -0600	[thread overview]
Message-ID: <578395fd-9538-4d3d-85ea-f5e9b95b8bd7@oracle.com> (raw)
In-Reply-To: <46fc0732-8735-4440-b14a-13e2389d7d6c@redhat.com>

On 1/31/24 8:04 AM, David Hildenbrand wrote:
> On 31.01.24 14:48, Mark Kanda wrote:
>> QEMU initializes preallocated backend memory as the objects are 
>> parsed from
>> the command line. This is not optimal in some cases (e.g. memory 
>> spanning
>> multiple NUMA nodes) because the memory objects are initialized in 
>> series.
>>
>> Allow the initialization to occur in parallel (asynchronously). In 
>> order to
>> ensure optimal thread placement, asynchronous initialization requires 
>> prealloc
>> context threads to be in use.
>>
>> Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   backends/hostmem.c     |   8 ++-
>>   hw/virtio/virtio-mem.c |   4 +-
>>   include/qemu/osdep.h   |  18 +++++-
>>   system/vl.c            |   8 +++
>>   util/oslib-posix.c     | 131 +++++++++++++++++++++++++++++++----------
>>   util/oslib-win32.c     |   8 ++-
>>   6 files changed, 140 insertions(+), 37 deletions(-)
>>
>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>> index 30f69b2cb5..8f602dc86f 100644
>> --- a/backends/hostmem.c
>> +++ b/backends/hostmem.c
>> @@ -20,6 +20,7 @@
>>   #include "qom/object_interfaces.h"
>>   #include "qemu/mmap-alloc.h"
>>   #include "qemu/madvise.h"
>> +#include "hw/qdev-core.h"
>>     #ifdef CONFIG_NUMA
>>   #include <numaif.h>
>> @@ -235,9 +236,10 @@ static void 
>> host_memory_backend_set_prealloc(Object *obj, bool value,
>>           int fd = memory_region_get_fd(&backend->mr);
>>           void *ptr = memory_region_get_ram_ptr(&backend->mr);
>>           uint64_t sz = memory_region_size(&backend->mr);
>> +        bool async = !phase_check(PHASE_MACHINE_INITIALIZED);
>>             if (!qemu_prealloc_mem(fd, ptr, sz, 
>> backend->prealloc_threads,
>> -                               backend->prealloc_context, errp)) {
>> +                               backend->prealloc_context, async, 
>> errp)) {
>>               return;
>>           }
>
> I think we will never trigger that case: we would have to set the 
> propertly after the device was already initialized, which shouldn't 
> happen.
>
> So I guess we can simplify and drop that.
>

Will fix.

>>           backend->prealloc = true;
>
>
> [...]
>
>> +++ b/include/qemu/osdep.h
>> @@ -680,6 +680,8 @@ typedef struct ThreadContext ThreadContext;
>>    * @area: start address of the are to preallocate
>>    * @sz: the size of the area to preallocate
>>    * @max_threads: maximum number of threads to use
>> + * @tc: prealloc context threads pointer, NULL if not in use
>> + * @async: request asynchronous preallocation, requires @tc
>>    * @errp: returns an error if this function fails
>>    *
>>    * Preallocate memory (populate/prefault page tables writable) for 
>> the virtual
>> @@ -687,10 +689,24 @@ typedef struct ThreadContext ThreadContext;
>>    * each page in the area was faulted in writable at least once, for 
>> example,
>>    * after allocating file blocks for mapped files.
>>    *
>> + * When setting @async, allocation might be performed asynchronously.
>> + * qemu_finish_async_mem_prealloc() must be called to finish any 
>> asynchronous
>> + * preallocation.
>> + *
>>    * Return: true on success, else false setting @errp with error.
>>    */
>>   bool qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
>> -                       ThreadContext *tc, Error **errp);
>> +                       ThreadContext *tc, bool async, Error **errp);
>> +
>> +/**
>> + * qemu_finish_async_mem_prealloc:
>> + * @errp: returns an error if this function fails
>> + *
>> + * Finish all outstanding asynchronous memory preallocation.
>> + *
>> + * Return: true on success, else false setting @errp with error.
>> + */
>> +bool qemu_finish_async_mem_prealloc(Error **errp);
>
> Suboptimal suggestion from my side, guess it woud be better to call this
>
> "qemu_finish_async_prealloc_mem" to match "qemu_prealloc_mem"
>

Will fix.

>>     /**
>>    * qemu_get_pid_name:
>> diff --git a/system/vl.c b/system/vl.c
>> index 788d88ea03..290bb3232b 100644
>> --- a/system/vl.c
>> +++ b/system/vl.c
>> @@ -2009,6 +2009,14 @@ static void qemu_create_late_backends(void)
>>         object_option_foreach_add(object_create_late);
>>   +    /*
>> +     * Wait for any outstanding memory prealloc from created memory
>> +     * backends to complete.
>> +     */
>> +    if (!qemu_finish_async_mem_prealloc(&error_fatal)) {
>> +        exit(1);
>> +    }
>> +
>
> I'm wondering if we should have a new phase instead, like
>
> PHASE_LATE_OBJECTS_CREATED.
>
> and do here
>
> phase_advance(PHASE_LATE_OBJECTS_CREATED);
>
> and use that instead. Currently, there is a "gap" between both things. 
> I don't think anything is actually broken right now (because any 
> internal memory abckend wouldn't have a thread context), but it might 
> be much cleaner and obvious that way.
>

OK. I'll call it 'PHASE_LATE_BACKENDS_CREATED' (to make it consistent 
with code comments/function name).

> Apart from that LGTM!
>

Thanks/regards,
-Mark

next prev parent reply	other threads:[~2024-01-31 14:28 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31 13:48 [PATCH v3 0/1] Initialize backend memory objects in parallel Mark Kanda
2024-01-31 13:48 ` [PATCH v3 1/1] oslib-posix: initialize " Mark Kanda
2024-01-31 14:04   ` David Hildenbrand
2024-01-31 14:27     ` Mark Kanda [this message]
2024-01-31 14:30       ` David Hildenbrand
2024-01-31 14:48         ` Mark Kanda
2024-01-31 14:57           ` David Hildenbrand
2024-01-31 15:02             ` Mark Kanda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=578395fd-9538-4d3d-85ea-f5e9b95b8bd7@oracle.com \
    --to=mark.kanda@oracle.com \
    --cc=berrange@redhat.com \
    --cc=david@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).