From: David Hildenbrand <david@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
Eduardo Habkost <ehabkost@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Marek Kedzierski <mkedzier@redhat.com>
Subject: Re: [PATCH v1 0/3] util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()
Date: Wed, 21 Jul 2021 10:23:55 +0200
Message-ID: <028e93aa-c25c-93db-4f3a-40e5c1eaabb2@redhat.com>
In-Reply-To: <YPbhhj1mbwFtdc4z@redhat.com>
On 20.07.21 16:45, Daniel P. Berrangé wrote:
> On Wed, Jul 14, 2021 at 01:23:03PM +0200, David Hildenbrand wrote:
>> #1 adds support for MADV_POPULATE_WRITE, #2 cleans up the code to avoid
>> global variables and prepare for concurrency and #3 makes os_mem_prealloc()
>> safe to be called from multiple threads concurrently.
>>
>> Details regarding MADV_POPULATE_WRITE can be found in introducing upstream
>> Linux commit 4ca9b3859dac ("mm/madvise: introduce
>> MADV_POPULATE_(READ|WRITE) to prefault page tables") and in the latest man
>> page patch [1].
>
> Looking at that commit message, I see your caveat about POPULATE_WRITE
> used together with shared file mappings, causing an undesirable glut
> of dirty pages that needs to be flushed back to the underlying storage.
>
> Is this something we need to be concerned with for the hostmem-file.c
> implementation ? While it is mostly used to point to files on tmpfs
> or hugetlbfs, I think users do sometimes point it to a plain file
> on a normal filesystem. So will we need to optimize to use the
> fallocate+POPULATE_READ combination at some point ?
In the future, it might make sense to use only fallocate() for shared
file mappings.
AFAIKS os_mem_prealloc() currently serves the following purposes:
1) Preallocate anonymous memory or backend storage (file, hugetlbfs, ...)
2) Apply mbind() policy, preallocating it from the right node when
applicable.
3) Prefault page tables
For shared mappings, it's a little bit difficult, though: mbind() does
not seem to work on shared mappings (which to some degree makes
logical sense, but I don't think QEMU users are aware of that). The
mbind(2) man page says: "The specified policy will be ignored for any MAP_SHARED
mappings in the specified memory range. Rather the pages will be
allocated according to the memory policy of the thread that caused the
page to be allocated. Again, this may not be the thread that called
mbind()."
So 2) does not apply. A simple fallocate() can get 1) done more efficiently.
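To illustrate how 1) could be done with fallocate() alone, here is a minimal sketch. prealloc_backing_file() is a hypothetical helper, not an existing QEMU function, and the choice of a path-based interface is an assumption made to keep the example self-contained.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): reserve backing storage for
 * the first `size` bytes of the file at `path` without mapping the file
 * or writing through a mapping.
 * Returns 0 on success or a negative errno value.
 */
static int prealloc_backing_file(const char *path, off_t size)
{
    int fd, ret, err = 0;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -errno;
    }

    /* mode 0: allocate the range and extend the file size if needed. */
    do {
        ret = fallocate(fd, 0, 0, size);
    } while (ret < 0 && errno == EINTR);
    if (ret < 0) {
        err = errno;    /* e.g. EOPNOTSUPP if the filesystem lacks support */
    }

    close(fd);
    return -err;
}
```

Because nothing is written through a mapping here, this avoids the pile of dirty pages that MADV_POPULATE_WRITE would create on a shared file mapping.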
So whether we want to use MADV_POPULATE_READ depends entirely on whether we
want 3). It can make sense to prefault page tables for RT workloads,
however, there is usually nothing stopping the OS from clearing the page
cache and requiring a refault later -- except with mlock.
So whether we want fallocate() or fallocate()+MADV_POPULATE_READ for
shared file mappings really depends on the use case, and on the system
setup. If the system won't immediately free up the page cache and undo
what MADV_POPULATE_READ did, it might make sense to use it.
Long story short: it's complicated :)
--
Thanks,
David / dhildenb