From: Anthony Liguori <anthony@codemonkey.ws>
To: Jan Kiszka <jan.kiszka@siemens.com>,
Satoru Moriya <satoru.moriya@hds.com>
Cc: "dle-develop@lists.sourceforge.net"
<dle-develop@lists.sourceforge.net>,
Seiji Aguchi <seiji.aguchi@hds.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"avi@redhat.com" <avi@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] Add option to mlock guest and qemu memory
Date: Fri, 28 Sep 2012 07:33:44 -0500
Message-ID: <87a9was4wn.fsf@codemonkey.ws>
In-Reply-To: <50655A35.9080505@siemens.com>
Jan Kiszka <jan.kiszka@siemens.com> writes:
> On 2012-09-28 01:21, Satoru Moriya wrote:
>> This is my first time posting a patch to qemu-devel.
>> If anything is missing or wrong, please let me know.
>>
>> We plan to migrate old enterprise systems that require low latency
>> (on the order of milliseconds) to a kvm virtualized environment.
>> Usually, we use mlock to preallocate and pin down process memory in
>> order to avoid page allocation in the latency-critical path. In a
>> kvm environment, however, mlocking inside the guest is not effective
>> because it cannot prevent page reclaim on the host. To avoid guest
>> memory reclaim, qemu has the "mem-path" option, which is actually
>> meant for using hugepages. But qemu's own memory regions are not
>> allocated on hugepages, so they may still be reclaimed. That may
>> cause a latency problem.
>>
>> To avoid guest and qemu memory reclaim, this patch introduces
>> a new "mlock" option. With this option, we can preallocate and
>> pin down guest and qemu memory before booting the guest OS.
>
> I guess this reduces the likelihood of multi-millisecond latencies for
> you but does not eliminate them. Of course, mlockall is part of our local
> changes for real-time QEMU/KVM, but it is just one of the many pieces
> required. I'm wondering how the situation is on your side.
>
> I think mlockall should eventually be enabled automatically as soon as
> you ask for real-time support for QEMU guests. How that should be
> controlled is another question. I'm currently carrying a top-level
> switch "-rt maxprio=x[,policy=y]" here, likely not the final solution.
> I'm not really convinced we need to control memory locking separately.
> And as we are very reluctant to add new top-level switches, this is
> even more important.

I think you're right here, although I'd suggest not abbreviating the
switch name.

Regards,

Anthony Liguori

>
>>
>> Tested on Linux, x86_64 (fedora 17).
>>
>> Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
>> ---
>> cpu-all.h | 1 +
>> exec.c | 3 +++
>> qemu-options.hx | 8 ++++++++
>> vl.c | 4 ++++
>> 4 files changed, 16 insertions(+)
>>
>> diff --git a/cpu-all.h b/cpu-all.h
>> index 74d3681..e12e5d5 100644
>> --- a/cpu-all.h
>> +++ b/cpu-all.h
>> @@ -503,6 +503,7 @@ extern RAMList ram_list;
>>
>> extern const char *mem_path;
>> extern int mem_prealloc;
>> +extern int mem_lock;
>>
>> /* Flags stored in the low bits of the TLB virtual address. These are
>> defined so that fast path ram access is all zeros. */
>> diff --git a/exec.c b/exec.c
>> index bb6aa4a..de13bc9 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2572,6 +2572,9 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>> }
>> memory_try_enable_merging(new_block->host, size);
>> }
>> + if (mem_lock && mlockall(MCL_CURRENT | MCL_FUTURE)) {
>> + perror("mlockall");
>> + }
>
> This belongs in the OS abstraction layer (mlockall is POSIX). And you
> only need to call it once per process lifetime, not on every RAM
> block allocation.
>
>> }
>> new_block->length = size;
>>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 7d97f96..9d82f15 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -427,6 +427,14 @@ Preallocate memory when using -mem-path.
>> ETEXI
>> #endif
>>
>> +DEF("mlock", 0, QEMU_OPTION_mlock,
>> + "-mlock mlock guest and qemu memory\n",
>> + QEMU_ARCH_ALL)
>> +STEXI
>> +@item -mlock
>> +mlock guest and qemu memory.
>> +ETEXI
>> +
>> DEF("k", HAS_ARG, QEMU_OPTION_k,
>> "-k language use keyboard layout (for example 'fr' for French)\n",
>> QEMU_ARCH_ALL)
>> diff --git a/vl.c b/vl.c
>> index 8d305ca..c902084 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -187,6 +187,7 @@ const char *mem_path = NULL;
>> #ifdef MAP_POPULATE
>> int mem_prealloc = 0; /* force preallocation of physical target memory */
>> #endif
>> +int mem_lock;
>> int nb_nics;
>> NICInfo nd_table[MAX_NICS];
>> int autostart;
>> @@ -2770,6 +2771,9 @@ int main(int argc, char **argv, char **envp)
>> mem_prealloc = 1;
>> break;
>> #endif
>> + case QEMU_OPTION_mlock:
>> + mem_lock = 1;
>> + break;
>> case QEMU_OPTION_d:
>> log_mask = optarg;
>> break;
>>
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
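
For illustration, Jan's two review points on the exec.c hunk (keep the POSIX call in the OS abstraction layer, and call it only once per process) could look roughly like the sketch below. This is not code from the patch or from the QEMU tree: the helper name lock_all_memory_once() and the lock_attempts counter are made up for the example. Only mlockall() with MCL_CURRENT | MCL_FUTURE is taken from the original diff.

```c
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical counter, only here so the "once" behaviour is observable. */
static int lock_attempts;

/*
 * Hypothetical helper (not a QEMU function): lock all current and future
 * pages of the calling process. mlockall() is invoked on the first call
 * only; later calls return the cached result.
 *
 * Returns 0 on success or a negative errno value on failure.
 * Not thread-safe; assumed to be called from startup code.
 */
static int lock_all_memory_once(void)
{
    static int attempted;
    static int result;

    if (attempted) {
        return result;
    }
    attempted = 1;
    lock_attempts++;

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        result = -errno;
    }
    return result;
}
```

A caller would presumably invoke this once after command-line parsing rather than from the RAM allocation path, and report strerror(-ret) on failure. For an unprivileged process the call typically fails with ENOMEM or EPERM when the RLIMIT_MEMLOCK limit is too low.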