* [PATCH] misc: introduce strim-memory qapi to support free memory trimming
@ 2024-06-28 10:22 Guoyi Tu
2024-07-07 3:48 ` Guoyi Tu
2024-07-25 11:35 ` Markus Armbruster
0 siblings, 2 replies; 9+ messages in thread
From: Guoyi Tu @ 2024-06-28 10:22 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Markus Armbruster, Eric Blake
Cc: tugy, qemu-devel, dengpc12, zhangl161
In the test environment, we conducted IO stress tests on all storage disks
within a virtual machine that had five storage devices mounted. During
testing, we found that the qemu process allocated a large amount of memory
(~800MB) to handle these IO operations.
When the test ended, although qemu called free() to release the allocated
memory, the memory was not actually returned to the operating system, as
observed via the top command.
Upon researching glibc's memory management, we found that when small
chunks of memory are allocated in user space and then released with
free(), glibc does not necessarily return this memory to the operating
system. Instead, it retains the memory until certain conditions are met
for release.
Virtual machines that only run business workloads during specific
periods remain idle most of the time, yet the qemu process still
occupies a large amount of memory, leading to significant waste of
memory resources.
To address this issue, this patch introduces an API to actively reclaim
idle memory within the qemu process. The API simply calls malloc_trim()
to ask glibc to trim free memory. With this API, the management tool
can monitor the virtual machine's state and call it during idle times
to free up the memory occupied by the virtual machine, thereby allowing
more virtual machines to be provisioned.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
---
hmp-commands.hx | 13 +++++++++++++
include/monitor/hmp.h | 1 +
monitor/hmp-cmds.c | 14 ++++++++++++++
monitor/qmp-cmds.c | 18 ++++++++++++++++++
qapi/misc.json | 13 +++++++++++++
5 files changed, 59 insertions(+)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 06746f0afc..0fde22fc71 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1858,4 +1858,17 @@ SRST
``xen-event-list``
List event channels in the guest
ERST
+
+ {
+ .name = "trim-memory",
+ .args_type = "reserved:l?",
+ .params = "[reserved]",
+ .help = "trim memory",
+ .cmd = hmp_trim_memory,
+ },
+
+SRST
+``trim-memory`` *reserved*
+ try to release free memory and keep reserved bytes of free memory untrimmed
+ERST
#endif
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 954f3c83ad..547cde0056 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -181,5 +181,6 @@ void hmp_boot_set(Monitor *mon, const QDict *qdict);
void hmp_info_mtree(Monitor *mon, const QDict *qdict);
void hmp_info_cryptodev(Monitor *mon, const QDict *qdict);
void hmp_dumpdtb(Monitor *mon, const QDict *qdict);
+void hmp_trim_memory(Monitor *mon, const QDict *qdict);
#endif
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index ea79148ee8..f842e43315 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -460,3 +460,17 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "dtb dumped to %s", filename);
}
#endif
+
+void hmp_trim_memory(Monitor *mon, const QDict *qdict)
+{
+ int64_t reserved;
+ bool has_reserved = qdict_haskey(qdict, "reserved");
+ Error *err = NULL;
+
+ if (has_reserved) {
+ reserved = qdict_get_int(qdict, "reserved");
+ }
+
+ qmp_trim_memory(has_reserved, reserved, &err);
+ hmp_handle_error(mon, err);
+}
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index f84a0dc523..878a7a646a 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -31,6 +31,7 @@
#include "qapi/type-helpers.h"
#include "hw/mem/memory-device.h"
#include "hw/intc/intc.h"
+#include <malloc.h>
NameInfo *qmp_query_name(Error **errp)
{
@@ -161,6 +162,23 @@ void qmp_add_client(const char *protocol, const char *fdname,
}
}
+void qmp_trim_memory(bool has_reserved, int64_t reserved, Error **errp)
+{
+#if defined(CONFIG_MALLOC_TRIM)
+ if (!has_reserved) {
+ reserved = 1024 * 1024;
+ }
+ if (reserved < 0) {
+ error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+ "reserved", "a non-negative integer");
+ return;
+ }
+ malloc_trim(reserved);
+#else
+ error_setg(errp, "malloc_trim feature not configured");
+#endif
+}
+
char *qmp_human_monitor_command(const char *command_line, bool has_cpu_index,
int64_t cpu_index, Error **errp)
{
diff --git a/qapi/misc.json b/qapi/misc.json
index ec30e5c570..00e6f2f650 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -605,3 +605,16 @@
{ 'event': 'VFU_CLIENT_HANGUP',
'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str',
'dev-id': 'str', 'dev-qom-path': 'str' } }
+
+##
+# @trim-memory:
+#
+# try to release free memory
+#
+# @reserved: specifies the amount of free space to leave untrimmed.
+# Defaults to 1MB if not specified.
+#
+# Since: 9.0
+##
+{'command': 'trim-memory',
+ 'data': {'*reserved': 'int'} }
--
2.17.1
----
Guoyi
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-06-28 10:22 [PATCH] misc: introduce strim-memory qapi to support free memory trimming Guoyi Tu
@ 2024-07-07 3:48 ` Guoyi Tu
2024-07-25 11:35 ` Markus Armbruster
1 sibling, 0 replies; 9+ messages in thread
From: Guoyi Tu @ 2024-07-07 3:48 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Markus Armbruster, Eric Blake
Cc: tugy, qemu-devel, dengpc12, zhangl161
Hi there,
please review this patch; any comments are welcome.
On 2024/6/28 18:22, Guoyi Tu wrote:
> In the test environment, we conducted IO stress tests on all storage disks
> within a virtual machine that had five storage devices mounted.During
> testing,
> we found that the qemu process allocated a large amount of memory (~800MB)
> to handle these IO operations.
>
> When the test ended, although qemu called free() to release the allocated
> memory, the memory was not actually returned to the operating system, as
> observed via the top command.
>
> Upon researching the glibc memory management mechanism, we found that when
> small chunks of memory are allocated in user space and then released with
> free(), the glibc memory management mechanism does not necessarily return
> this memory to the operating system. Instead, it retains the memory until
> certain conditions are met for release.
>
> For virtual machines that only have business operations during specific
> periods, they remain idle most of the time. However, the qemu process
> still occupies a large amount of memory resources, leading to significant
> memory resource waste.
>
> To address this issue, this patch introduces an API to actively reclaim
> idle memory within the qemu process. This API effectively calls
> malloc_trim()
> to notify glibc to trim free memory. With this api, the management tool
> can monitor the virtual machine's state and call this API during idle times
> to free up the memory occupied by the virtual machine, thereby allowing
> more
> virtual machines to be provisioned.
>
> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
> ---
> hmp-commands.hx | 13 +++++++++++++
> include/monitor/hmp.h | 1 +
> monitor/hmp-cmds.c | 14 ++++++++++++++
> monitor/qmp-cmds.c | 18 ++++++++++++++++++
> qapi/misc.json | 13 +++++++++++++
> 5 files changed, 59 insertions(+)
>
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 06746f0afc..0fde22fc71 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1858,4 +1858,17 @@ SRST
> ``xen-event-list``
> List event channels in the guest
> ERST
> +
> + {
> + .name = "trim-memory",
> + .args_type = "reserved:l?",
> + .params = "[reserved]",
> + .help = "trim momory",
> + .cmd = hmp_trim_memory,
> + },
> +
> +SRST
> +``trim-memory`` *reserved*
> + try to release free memory and keep reserved bytes of free memory
> untrimmed
> +ERST
> #endif
> diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
> index 954f3c83ad..547cde0056 100644
> --- a/include/monitor/hmp.h
> +++ b/include/monitor/hmp.h
> @@ -181,5 +181,6 @@ void hmp_boot_set(Monitor *mon, const QDict *qdict);
> void hmp_info_mtree(Monitor *mon, const QDict *qdict);
> void hmp_info_cryptodev(Monitor *mon, const QDict *qdict);
> void hmp_dumpdtb(Monitor *mon, const QDict *qdict);
> +void hmp_trim_memory(Monitor *mon, const QDict *qdict);
>
> #endif
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index ea79148ee8..f842e43315 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -460,3 +460,17 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
> monitor_printf(mon, "dtb dumped to %s", filename);
> }
> #endif
> +
> +void hmp_trim_memory(Monitor *mon, const QDict *qdict)
> +{
> + int64_t reserved;
> + bool has_reserved = qdict_haskey(qdict, "reserved");
> + Error *err = NULL;
> +
> + if (has_reserved) {
> + reserved = qdict_get_int(qdict, "reserved");
> + }
> +
> + qmp_trim_memory(has_reserved, reserved, &err);
> + hmp_handle_error(mon, err);
> +}
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index f84a0dc523..878a7a646a 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -31,6 +31,7 @@
> #include "qapi/type-helpers.h"
> #include "hw/mem/memory-device.h"
> #include "hw/intc/intc.h"
> +#include <malloc.h>
>
> NameInfo *qmp_query_name(Error **errp)
> {
> @@ -161,6 +162,23 @@ void qmp_add_client(const char *protocol, const
> char *fdname,
> }
> }
>
> +void qmp_trim_memory(bool has_reserved, int64_t reserved, Error **errp)
> +{
> +#if defined(CONFIG_MALLOC_TRIM)
> + if (!has_reserved) {
> + reserved = 1024 * 1024;
> + }
> + if (reserved < 0) {
> + error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> + "reserved", "a >0 reserved");
> + return;
> + }
> + malloc_trim(reserved);
> +#else
> + error_setg(errp, "malloc_trim feature not configured");
> +#endif
> +}
> +
> char *qmp_human_monitor_command(const char *command_line, bool
> has_cpu_index,
> int64_t cpu_index, Error **errp)
> {
> diff --git a/qapi/misc.json b/qapi/misc.json
> index ec30e5c570..00e6f2f650 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -605,3 +605,16 @@
> { 'event': 'VFU_CLIENT_HANGUP',
> 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str',
> 'dev-id': 'str', 'dev-qom-path': 'str' } }
> +
> +##
> +# @trim-memory:
> +#
> +# try to release free memory
> +#
> +# @reserved: specifies the amount of free space to leave untrimmed.
> +# default to 1MB if not specified.
> +#
> +# Since: 9.0
> +##
> +{'command': 'trim-memory',
> + 'data': {'*reserved': 'int'} }
--
Guoyi Tu
Cloud-Network Product Division - Elastic Computing - Group G1 - Virtualization
18030537745
China Telecom Western Information Center, 1666 Yizhou Avenue, Wuhou District, Chengdu
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-06-28 10:22 [PATCH] misc: introduce strim-memory qapi to support free memory trimming Guoyi Tu
2024-07-07 3:48 ` Guoyi Tu
@ 2024-07-25 11:35 ` Markus Armbruster
2024-07-25 11:57 ` Daniel P. Berrangé
2024-07-27 4:09 ` Guoyi Tu
1 sibling, 2 replies; 9+ messages in thread
From: Markus Armbruster @ 2024-07-25 11:35 UTC (permalink / raw)
To: Guoyi Tu
Cc: Dr. David Alan Gilbert, Eric Blake, qemu-devel, dengpc12,
zhangl161, Paolo Bonzini, Yang Zhong
Guoyi Tu <tugy@chinatelecom.cn> writes:
> In the test environment, we conducted IO stress tests on all storage disks
> within a virtual machine that had five storage devices mounted.During
> testing,
> we found that the qemu process allocated a large amount of memory (~800MB)
> to handle these IO operations.
>
> When the test ended, although qemu called free() to release the allocated
> memory, the memory was not actually returned to the operating system, as
> observed via the top command.
>
> Upon researching the glibc memory management mechanism, we found that when
> small chunks of memory are allocated in user space and then released with
> free(), the glibc memory management mechanism does not necessarily return
> this memory to the operating system. Instead, it retains the memory until
> certain conditions are met for release.
Yes.
> For virtual machines that only have business operations during specific
> periods, they remain idle most of the time. However, the qemu process
> still occupies a large amount of memory resources, leading to significant
> memory resource waste.
Mitigation: the memory free()'s but not returned to the OS can be paged
out.
> To address this issue, this patch introduces an API to actively reclaim
> idle memory within the qemu process. This API effectively calls
> malloc_trim()
> to notify glibc to trim free memory. With this api, the management tool
> can monitor the virtual machine's state and call this API during idle times
> to free up the memory occupied by the virtual machine, thereby allowing more
> virtual machines to be provisioned.
How does this affect the test case you described above?
There's an existing use of malloc_trim() in util/rcu.c's
call_rcu_thread(). It's from commit 5a22ab71623:
rcu: reduce more than 7MB heap memory by malloc_trim()
Since there are some issues in memory alloc/free machenism
in glibc for little chunk memory, if Qemu frequently
alloc/free little chunk memory, the glibc doesn't alloc
little chunk memory from free list of glibc and still
allocate from OS, which make the heap size bigger and bigger.
This patch introduce malloc_trim(), which will free heap
memory when there is no rcu call during rcu thread loop.
malloc_trim() can be enabled/disabled by --enable-malloc-trim/
--disable-malloc-trim in the Qemu configure command. The
default malloc_trim() is enabled for libc.
Below are test results from smaps file.
(1)without patch
55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
Size: 21796 kB
Rss: 14260 kB
Pss: 14260 kB
(2)with patch
55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
Size: 21668 kB
Rss: 6940 kB
Pss: 6940 kB
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
How would the malloc_trim() you propose interact with this one?
> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
> ---
> hmp-commands.hx | 13 +++++++++++++
> include/monitor/hmp.h | 1 +
> monitor/hmp-cmds.c | 14 ++++++++++++++
> monitor/qmp-cmds.c | 18 ++++++++++++++++++
> qapi/misc.json | 13 +++++++++++++
> 5 files changed, 59 insertions(+)
>
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 06746f0afc..0fde22fc71 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1858,4 +1858,17 @@ SRST
> ``xen-event-list``
> List event channels in the guest
> ERST
> +
> + {
> + .name = "trim-memory",
> + .args_type = "reserved:l?",
> + .params = "[reserved]",
> + .help = "trim momory",
> + .cmd = hmp_trim_memory,
> + },
> +
> +SRST
> +``trim-memory`` *reserved*
> + try to release free memory and keep reserved bytes of free memory
> untrimmed
> +ERST
> #endif
> diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
> index 954f3c83ad..547cde0056 100644
> --- a/include/monitor/hmp.h
> +++ b/include/monitor/hmp.h
> @@ -181,5 +181,6 @@ void hmp_boot_set(Monitor *mon, const QDict *qdict);
> void hmp_info_mtree(Monitor *mon, const QDict *qdict);
> void hmp_info_cryptodev(Monitor *mon, const QDict *qdict);
> void hmp_dumpdtb(Monitor *mon, const QDict *qdict);
> +void hmp_trim_memory(Monitor *mon, const QDict *qdict);
>
> #endif
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index ea79148ee8..f842e43315 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -460,3 +460,17 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
> monitor_printf(mon, "dtb dumped to %s", filename);
> }
> #endif
> +
> +void hmp_trim_memory(Monitor *mon, const QDict *qdict)
> +{
> + int64_t reserved;
> + bool has_reserved = qdict_haskey(qdict, "reserved");
> + Error *err = NULL;
> +
> + if (has_reserved) {
> + reserved = qdict_get_int(qdict, "reserved");
> + }
> +
> + qmp_trim_memory(has_reserved, reserved, &err);
> + hmp_handle_error(mon, err);
> +}
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index f84a0dc523..878a7a646a 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -31,6 +31,7 @@
> #include "qapi/type-helpers.h"
> #include "hw/mem/memory-device.h"
> #include "hw/intc/intc.h"
> +#include <malloc.h>
>
> NameInfo *qmp_query_name(Error **errp)
> {
> @@ -161,6 +162,23 @@ void qmp_add_client(const char *protocol, const
> char *fdname,
> }
> }
>
> +void qmp_trim_memory(bool has_reserved, int64_t reserved, Error **errp)
> +{
> +#if defined(CONFIG_MALLOC_TRIM)
> + if (!has_reserved) {
> + reserved = 1024 * 1024;
> + }
> + if (reserved < 0) {
> + error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> + "reserved", "a >0 reserved");
> + return;
> + }
> + malloc_trim(reserved);
> +#else
> + error_setg(errp, "malloc_trim feature not configured");
Have you tried making the entire command conditional instead? Like...
> +#endif
> +}
> +
> char *qmp_human_monitor_command(const char *command_line, bool
> has_cpu_index,
> int64_t cpu_index, Error **errp)
> {
> diff --git a/qapi/misc.json b/qapi/misc.json
> index ec30e5c570..00e6f2f650 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -605,3 +605,16 @@
> { 'event': 'VFU_CLIENT_HANGUP',
> 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str',
> 'dev-id': 'str', 'dev-qom-path': 'str' } }
> +
> +##
> +# @trim-memory:
> +#
> +# try to release free memory
> +#
> +# @reserved: specifies the amount of free space to leave untrimmed.
> +# default to 1MB if not specified.
> +#
> +# Since: 9.0
> +##
> +{'command': 'trim-memory',
> + 'data': {'*reserved': 'int'} }
... so:
{ 'command': 'trim-memory',
'data': {'*reserved': 'int'},
'if': 'CONFIG_MALLOC_TRIM' }
Could we do without the argument?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-07-25 11:35 ` Markus Armbruster
@ 2024-07-25 11:57 ` Daniel P. Berrangé
2024-07-25 12:50 ` Dr. David Alan Gilbert
` (2 more replies)
2024-07-27 4:09 ` Guoyi Tu
1 sibling, 3 replies; 9+ messages in thread
From: Daniel P. Berrangé @ 2024-07-25 11:57 UTC (permalink / raw)
To: Markus Armbruster
Cc: Guoyi Tu, Dr. David Alan Gilbert, Eric Blake, qemu-devel,
dengpc12, zhangl161, Paolo Bonzini, Yang Zhong
On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
> Guoyi Tu <tugy@chinatelecom.cn> writes:
>
> > In the test environment, we conducted IO stress tests on all storage disks
> > within a virtual machine that had five storage devices mounted.During
> > testing,
> > we found that the qemu process allocated a large amount of memory (~800MB)
> > to handle these IO operations.
> >
> > When the test ended, although qemu called free() to release the allocated
> > memory, the memory was not actually returned to the operating system, as
> > observed via the top command.
> >
> > Upon researching the glibc memory management mechanism, we found that when
> > small chunks of memory are allocated in user space and then released with
> > free(), the glibc memory management mechanism does not necessarily return
> > this memory to the operating system. Instead, it retains the memory until
> > certain conditions are met for release.
>
> Yes.
Looking at mallopt(3) man page, the M_TRIM_THRESHOLD is said to control
when glibc releases the top of the heap back to the OS. It is said to
default to 128 kb.
I'm curious how we get from that default, to 800 MB of unused memory ?
Is it related to the number of distinct malloc arenas that are in use ?
I'm curious what malloc_stats() would report before & after malloc_trim
when QEMU is in this situation with lots of wasted memory.
>
> > For virtual machines that only have business operations during specific
> > periods, they remain idle most of the time. However, the qemu process
> > still occupies a large amount of memory resources, leading to significant
> > memory resource waste.
>
> Mitigation: the memory free()'s but not returned to the OS can be paged
> out.
>
> > To address this issue, this patch introduces an API to actively reclaim
> > idle memory within the qemu process. This API effectively calls
> > malloc_trim()
> > to notify glibc to trim free memory. With this api, the management tool
> > can monitor the virtual machine's state and call this API during idle times
> > to free up the memory occupied by the virtual machine, thereby allowing more
> > virtual machines to be provisioned.
>
> How does this affect the test case you described above?
>
> There's an existing use of malloc_trim() in util/rcu.c's
> call_rcu_thread(). It's from commit 5a22ab71623:
>
> rcu: reduce more than 7MB heap memory by malloc_trim()
>
> Since there are some issues in memory alloc/free machenism
> in glibc for little chunk memory, if Qemu frequently
> alloc/free little chunk memory, the glibc doesn't alloc
> little chunk memory from free list of glibc and still
> allocate from OS, which make the heap size bigger and bigger.
>
> This patch introduce malloc_trim(), which will free heap
> memory when there is no rcu call during rcu thread loop.
> malloc_trim() can be enabled/disabled by --enable-malloc-trim/
> --disable-malloc-trim in the Qemu configure command. The
> default malloc_trim() is enabled for libc.
>
> Below are test results from smaps file.
> (1)without patch
> 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
> Size: 21796 kB
> Rss: 14260 kB
> Pss: 14260 kB
>
> (2)with patch
> 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
> Size: 21668 kB
> Rss: 6940 kB
> Pss: 6940 kB
>
> Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> How would the malloc_trim() you propose interact with this one?
The above usage is automatic, while this proposal requires that
an external mgmt app monitor QEMU and tell it to free memory.
I'm wondering if the latter is really desirable, or whether QEMU
can call this itself when reasonable ?
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-07-25 11:57 ` Daniel P. Berrangé
@ 2024-07-25 12:50 ` Dr. David Alan Gilbert
2024-07-27 5:18 ` Guoyi Tu
[not found] ` <1020253492.3796.1721956050910.JavaMail.root@jt-retransmission-dep-7c968f646d-qxbl2>
2 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert @ 2024-07-25 12:50 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Markus Armbruster, Guoyi Tu, Eric Blake, qemu-devel, dengpc12,
zhangl161, Paolo Bonzini, Yang Zhong
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
> > Guoyi Tu <tugy@chinatelecom.cn> writes:
> >
> > > In the test environment, we conducted IO stress tests on all storage disks
> > > within a virtual machine that had five storage devices mounted.During
> > > testing,
> > > we found that the qemu process allocated a large amount of memory (~800MB)
> > > to handle these IO operations.
> > >
> > > When the test ended, although qemu called free() to release the allocated
> > > memory, the memory was not actually returned to the operating system, as
> > > observed via the top command.
> > >
> > > Upon researching the glibc memory management mechanism, we found that when
> > > small chunks of memory are allocated in user space and then released with
> > > free(), the glibc memory management mechanism does not necessarily return
> > > this memory to the operating system. Instead, it retains the memory until
> > > certain conditions are met for release.
> >
> > Yes.
>
> Looking at mallopt(3) man page, the M_TRIM_THRESHOLD is said to control
> when glibc releases the top of the heap back to the OS. It is said to
> default to 128 kb.
>
> I'm curious how we get from that default, to 800 MB of unused memory ?
> Is it related to the number of distinct malloc arenas that are in use ?
I wonder which IO mechanism was being used - the 'iothreads' used to sometimes
blow up and start 100s of threads; is that the case here?
> I'm curious what malloc_stats() would report before & after malloc_trim
> when QEMU is in this situation with lots of wasted memory.
Yes; maybe also trying valgrind's massif:
https://valgrind.org/docs/manual/ms-manual.html
(if it works on Qemu!)
might help say where it's going?
Dave
> >
> > > For virtual machines that only have business operations during specific
> > > periods, they remain idle most of the time. However, the qemu process
> > > still occupies a large amount of memory resources, leading to significant
> > > memory resource waste.
> >
> > Mitigation: the memory free()'s but not returned to the OS can be paged
> > out.
> >
> > > To address this issue, this patch introduces an API to actively reclaim
> > > idle memory within the qemu process. This API effectively calls
> > > malloc_trim()
> > > to notify glibc to trim free memory. With this api, the management tool
> > > can monitor the virtual machine's state and call this API during idle times
> > > to free up the memory occupied by the virtual machine, thereby allowing more
> > > virtual machines to be provisioned.
> >
> > How does this affect the test case you described above?
> >
> > There's an existing use of malloc_trim() in util/rcu.c's
> > call_rcu_thread(). It's from commit 5a22ab71623:
> >
> > rcu: reduce more than 7MB heap memory by malloc_trim()
> >
> > Since there are some issues in memory alloc/free machenism
> > in glibc for little chunk memory, if Qemu frequently
> > alloc/free little chunk memory, the glibc doesn't alloc
> > little chunk memory from free list of glibc and still
> > allocate from OS, which make the heap size bigger and bigger.
> >
> > This patch introduce malloc_trim(), which will free heap
> > memory when there is no rcu call during rcu thread loop.
> > malloc_trim() can be enabled/disabled by --enable-malloc-trim/
> > --disable-malloc-trim in the Qemu configure command. The
> > default malloc_trim() is enabled for libc.
> >
> > Below are test results from smaps file.
> > (1)without patch
> > 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
> > Size: 21796 kB
> > Rss: 14260 kB
> > Pss: 14260 kB
> >
> > (2)with patch
> > 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
> > Size: 21668 kB
> > Rss: 6940 kB
> > Pss: 6940 kB
> >
> > Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> > Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >
> > How would the malloc_trim() you propose interact with this one?
>
> The above usage is automatic, while this proposal requires that
> an external mgmt app monitor QEMU and tell it to free memory.
> I'm wondering if the latter is really desirable, or whether QEMU
> can call this itself when reasonable ?
>
>
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux | Happy \
\ dave @ treblig.org | | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-07-25 11:35 ` Markus Armbruster
2024-07-25 11:57 ` Daniel P. Berrangé
@ 2024-07-27 4:09 ` Guoyi Tu
1 sibling, 0 replies; 9+ messages in thread
From: Guoyi Tu @ 2024-07-27 4:09 UTC (permalink / raw)
To: Markus Armbruster
Cc: tugy, Dr. David Alan Gilbert, Eric Blake, qemu-devel, dengpc12,
zhangl161, Paolo Bonzini, Yang Zhong
On 2024/7/25 19:35, Markus Armbruster wrote:
> Guoyi Tu <tugy@chinatelecom.cn> writes:
>
>> In the test environment, we conducted IO stress tests on all storage disks
>> within a virtual machine that had five storage devices mounted.During
>> testing,
>> we found that the qemu process allocated a large amount of memory (~800MB)
>> to handle these IO operations.
>>
>> When the test ended, although qemu called free() to release the allocated
>> memory, the memory was not actually returned to the operating system, as
>> observed via the top command.
>>
>> Upon researching the glibc memory management mechanism, we found that when
>> small chunks of memory are allocated in user space and then released with
>> free(), the glibc memory management mechanism does not necessarily return
>> this memory to the operating system. Instead, it retains the memory until
>> certain conditions are met for release.
>
> Yes.
>
>> For virtual machines that only have business operations during specific
>> periods, they remain idle most of the time. However, the qemu process
>> still occupies a large amount of memory resources, leading to significant
>> memory resource waste.
>
> Mitigation: the memory free()'s but not returned to the OS can be paged
> out.
Yes, swap can alleviate the issue of insufficient system memory, but it
can also affect performance. Additionally, some systems may disable swap
for various reasons, making it unavailable.
>> To address this issue, this patch introduces an API to actively reclaim
>> idle memory within the qemu process. This API effectively calls
>> malloc_trim()
>> to notify glibc to trim free memory. With this api, the management tool
>> can monitor the virtual machine's state and call this API during idle times
>> to free up the memory occupied by the virtual machine, thereby allowing more
>> virtual machines to be provisioned.
>
> How does this affect the test case you described above?
Based on the test results, QEMU generally allocates a lot of memory
when handling backend I/O, and this memory can be released when the
virtual machine’s storage I/O is idle.
Based on this observation, management tools can monitor the storage I/O
load of virtual machines. If the storage I/O load of a virtual machine
remains low for a certain period, the tool can use this interface to
return most of the allocatable memory to the operating system.
According to our statistics, this approach can save at least 150MB of
system memory per virtual machine on average.
> There's an existing use of malloc_trim() in util/rcu.c's
> call_rcu_thread(). It's from commit 5a22ab71623:
>
> rcu: reduce more than 7MB heap memory by malloc_trim()
>
> Since there are some issues in memory alloc/free machenism
> in glibc for little chunk memory, if Qemu frequently
> alloc/free little chunk memory, the glibc doesn't alloc
> little chunk memory from free list of glibc and still
> allocate from OS, which make the heap size bigger and bigger.
>
> This patch introduce malloc_trim(), which will free heap
> memory when there is no rcu call during rcu thread loop.
> malloc_trim() can be enabled/disabled by --enable-malloc-trim/
> --disable-malloc-trim in the Qemu configure command. The
> default malloc_trim() is enabled for libc.
>
> Below are test results from smaps file.
> (1)without patch
> 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
> Size: 21796 kB
> Rss: 14260 kB
> Pss: 14260 kB
>
> (2)with patch
> 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
> Size: 21668 kB
> Rss: 6940 kB
> Pss: 6940 kB
>
> Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> How would the malloc_trim() you propose interact with this one?
That malloc_trim() call runs in the RCU thread only when there are no
pending RCU callbacks, so it may never be triggered during the lifecycle
of the virtual machine; that is why we plan to add a new, explicitly
invoked interface.
>> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
>> Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
>> ---
>> hmp-commands.hx | 13 +++++++++++++
>> include/monitor/hmp.h | 1 +
>> monitor/hmp-cmds.c | 14 ++++++++++++++
>> monitor/qmp-cmds.c | 18 ++++++++++++++++++
>> qapi/misc.json | 13 +++++++++++++
>> 5 files changed, 59 insertions(+)
>>
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index 06746f0afc..0fde22fc71 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -1858,4 +1858,17 @@ SRST
>> ``xen-event-list``
>> List event channels in the guest
>> ERST
>> +
>> + {
>> + .name = "trim-memory",
>> + .args_type = "reserved:l?",
>> + .params = "[reserved]",
>> + .help = "trim memory",
>> + .cmd = hmp_trim_memory,
>> + },
>> +
>> +SRST
>> +``trim-memory`` *reserved*
>> + try to release free memory and keep reserved bytes of free memory untrimmed
>> +ERST
>> #endif
>> diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
>> index 954f3c83ad..547cde0056 100644
>> --- a/include/monitor/hmp.h
>> +++ b/include/monitor/hmp.h
>> @@ -181,5 +181,6 @@ void hmp_boot_set(Monitor *mon, const QDict *qdict);
>> void hmp_info_mtree(Monitor *mon, const QDict *qdict);
>> void hmp_info_cryptodev(Monitor *mon, const QDict *qdict);
>> void hmp_dumpdtb(Monitor *mon, const QDict *qdict);
>> +void hmp_trim_memory(Monitor *mon, const QDict *qdict);
>>
>> #endif
>> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
>> index ea79148ee8..f842e43315 100644
>> --- a/monitor/hmp-cmds.c
>> +++ b/monitor/hmp-cmds.c
>> @@ -460,3 +460,17 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
>> monitor_printf(mon, "dtb dumped to %s", filename);
>> }
>> #endif
>> +
>> +void hmp_trim_memory(Monitor *mon, const QDict *qdict)
>> +{
>> + int64_t reserved;
>> + bool has_reserved = qdict_haskey(qdict, "reserved");
>> + Error *err = NULL;
>> +
>> + if (has_reserved) {
>> + reserved = qdict_get_int(qdict, "reserved");
>> + }
>> +
>> + qmp_trim_memory(has_reserved, reserved, &err);
>> + hmp_handle_error(mon, err);
>> +}
>> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
>> index f84a0dc523..878a7a646a 100644
>> --- a/monitor/qmp-cmds.c
>> +++ b/monitor/qmp-cmds.c
>> @@ -31,6 +31,7 @@
>> #include "qapi/type-helpers.h"
>> #include "hw/mem/memory-device.h"
>> #include "hw/intc/intc.h"
>> +#include <malloc.h>
>>
>> NameInfo *qmp_query_name(Error **errp)
>> {
>> @@ -161,6 +162,23 @@ void qmp_add_client(const char *protocol, const char *fdname,
>> }
>> }
>>
>> +void qmp_trim_memory(bool has_reserved, int64_t reserved, Error **errp)
>> +{
>> +#if defined(CONFIG_MALLOC_TRIM)
>> + if (!has_reserved) {
>> + reserved = 1024 * 1024;
>> + }
>> + if (reserved < 0) {
>> + error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
>> + "reserved", "a non-negative integer");
>> + return;
>> + }
>> + malloc_trim(reserved);
>> +#else
>> + error_setg(errp, "malloc_trim feature not configured");
>
> Have you tried making the entire command conditional instead? Like...
>
>> +#endif
>> +}
>> +
>> char *qmp_human_monitor_command(const char *command_line, bool has_cpu_index,
>> int64_t cpu_index, Error **errp)
>> {
>> diff --git a/qapi/misc.json b/qapi/misc.json
>> index ec30e5c570..00e6f2f650 100644
>> --- a/qapi/misc.json
>> +++ b/qapi/misc.json
>> @@ -605,3 +605,16 @@
>> { 'event': 'VFU_CLIENT_HANGUP',
>> 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str',
>> 'dev-id': 'str', 'dev-qom-path': 'str' } }
>> +
>> +##
>> +# @trim-memory:
>> +#
>> +# Try to release free memory back to the operating system.
>> +#
>> +# @reserved: specifies the amount of free space to leave untrimmed.
>> +# Defaults to 1 MiB (1048576 bytes) if not specified.
>> +#
>> +# Since: 9.0
>> +##
>> +{'command': 'trim-memory',
>> + 'data': {'*reserved': 'int'} }
>
> ... so:
>
> { 'command': 'trim-memory',
> 'data': {'*reserved': 'int'},
> 'if': 'CONFIG_MALLOC_TRIM' }
>
> Could we do without the argument?
>
>
--
Guoyi
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-07-25 11:57 ` Daniel P. Berrangé
2024-07-25 12:50 ` Dr. David Alan Gilbert
@ 2024-07-27 5:18 ` Guoyi Tu
2024-08-01 13:12 ` Daniel P. Berrangé
[not found] ` <1020253492.3796.1721956050910.JavaMail.root@jt-retransmission-dep-7c968f646d-qxbl2>
2 siblings, 1 reply; 9+ messages in thread
From: Guoyi Tu @ 2024-07-27 5:18 UTC (permalink / raw)
To: Daniel P. Berrangé, Markus Armbruster
Cc: tugy, Dr. David Alan Gilbert, Eric Blake, qemu-devel, dengpc12,
zhangl161, Paolo Bonzini, Yang Zhong
On 2024/7/25 19:57, Daniel P. Berrangé wrote:
> On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
>> Guoyi Tu <tugy@chinatelecom.cn> writes:
>>
>>> In the test environment, we conducted IO stress tests on all storage disks
>>> within a virtual machine that had five storage devices mounted. During testing,
>>> we found that the qemu process allocated a large amount of memory (~800MB)
>>> to handle these IO operations.
>>>
>>> When the test ended, although qemu called free() to release the allocated
>>> memory, the memory was not actually returned to the operating system, as
>>> observed via the top command.
>>>
>>> Upon researching the glibc memory management mechanism, we found that when
>>> small chunks of memory are allocated in user space and then released with
>>> free(), the glibc memory management mechanism does not necessarily return
>>> this memory to the operating system. Instead, it retains the memory until
>>> certain conditions are met for release.
>>
>> Yes.
>
> Looking at mallopt(3) man page, the M_TRIM_THRESHOLD is said to control
> when glibc releases the top of the heap back to the OS. It is said to
> default to 128 kb.
Yes, the M_TRIM_THRESHOLD option controls when glibc releases free
memory at the top of the heap back to the OS, but glibc will not release
free memory in the middle of the heap.
> I'm curious how we get from that default, to 800 MB of unused memory ?
> Is it related to the number of distinct malloc arenas that are in use ?
At least 600MB of memory is free, and this memory might be in the middle
of the heap and cannot be automatically released.
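The distinction can be sketched with glibc's two knobs (glibc-specific
APIs, behavior as documented in mallopt(3)/malloc_trim(3); since glibc
2.8, malloc_trim() can also release whole free pages in the interior of
the heap via madvise(MADV_DONTNEED), which the trim threshold alone never
does):

```c
#include <malloc.h>   /* mallopt, malloc_trim: glibc-specific */

/* Lower the threshold at which free memory at the TOP of the heap is
 * given back via sbrk() (mallopt(3) says the default is 128 KiB).
 * Returns 1 on success, 0 on error. */
int set_trim_threshold(int bytes)
{
    return mallopt(M_TRIM_THRESHOLD, bytes);
}

/* Explicit trim, keeping `pad` bytes of slack at the heap top.  Since
 * glibc 2.8 this also releases whole free pages in the MIDDLE of the
 * heap with madvise(MADV_DONTNEED), which the threshold alone never
 * does.  Returns 1 if any memory was released, 0 otherwise. */
int trim_now(size_t pad)
{
    return malloc_trim(pad);
}
```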
> I'm curious what malloc_stats() would report before & after malloc_trim
> when QEMU is in this situation with lots of wasted memory.
Here is the test case:
1. start the test process
Rss: 1504 kB
malloc_stats:
Arena 0:
system bytes = 135168
in use bytes = 5808
Total (incl. mmap):
system bytes = 135168
in use bytes = 5808
max mmap regions = 0
max mmap bytes = 0
2. Call malloc to allocate memory 320 times, each memory chunk is 64KiB
and total allocated memory is 20MiB
Rss: 21992 kB
malloc_stats:
Arena 0:
system bytes = 21049344
in use bytes = 20982448
Total (incl. mmap):
system bytes = 21049344
in use bytes = 20982448
max mmap regions = 0
max mmap bytes = 0
3. free the first 319 chunks , total size: 20416 KiB
Rss: 21992 kB
malloc_stats:
Arena 0:
system bytes = 21049344
in use bytes = 71360
Total (incl. mmap):
system bytes = 21049344
in use bytes = 71360
max mmap regions = 0
max mmap bytes = 0
4. free the last one
Rss: 1636 kB
malloc_stats:
Arena 0:
system bytes = 139264
in use bytes = 5808
Total (incl. mmap):
system bytes = 139264
in use bytes = 5808
max mmap regions = 0
max mmap bytes = 0
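The four steps above can be reproduced with a small standalone program
along these lines (chunk count and size taken from the test case;
malloc_stats() prints the arena statistics to stderr):

```c
#include <malloc.h>   /* malloc_stats: glibc-specific */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { NCHUNKS = 320, CHUNK_SIZE = 64 * 1024 };   /* 320 x 64 KiB = 20 MiB */

/* Step 2: allocate and touch every chunk so it shows up in RSS.
 * 64 KiB is below glibc's default M_MMAP_THRESHOLD (128 KiB), so the
 * chunks come from the main heap via brk(), matching the test case. */
size_t alloc_chunks(void *ptrs[NCHUNKS])
{
    size_t total = 0;
    for (int i = 0; i < NCHUNKS; i++) {
        ptrs[i] = malloc(CHUNK_SIZE);
        if (!ptrs[i]) {
            perror("malloc");
            exit(EXIT_FAILURE);
        }
        memset(ptrs[i], 0, CHUNK_SIZE);
        total += CHUNK_SIZE;
    }
    return total;
}

/* Step 3: free the first n chunks.  The surviving last chunk pins the
 * top of the heap, so "in use bytes" collapses while "system bytes"
 * stays at ~21 MB, exactly as in the statistics above. */
void free_first(void *ptrs[NCHUNKS], int n)
{
    for (int i = 0; i < n; i++) {
        free(ptrs[i]);
        ptrs[i] = NULL;
    }
    malloc_stats();   /* arena statistics go to stderr */
}
```

Freeing the last chunk (step 4) lets glibc trim the heap top
automatically once the free block there exceeds M_TRIM_THRESHOLD;
calling malloc_trim(0) instead would release the freed middle pages
even while the last chunk is still live.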
>>
>>> For virtual machines that only have business operations during specific
>>> periods, they remain idle most of the time. However, the qemu process
>>> still occupies a large amount of memory resources, leading to significant
>>> memory resource waste.
>>
>> Mitigation: the memory free()'s but not returned to the OS can be paged
>> out.
>>
>>> To address this issue, this patch introduces an API to actively reclaim
>>> idle memory within the qemu process. This API effectively calls
>>> malloc_trim()
>>> to notify glibc to trim free memory. With this api, the management tool
>>> can monitor the virtual machine's state and call this API during idle times
>>> to free up the memory occupied by the virtual machine, thereby allowing more
>>> virtual machines to be provisioned.
>>
>> How does this affect the test case you described above?
>>
>> There's an existing use of malloc_trim() in util/rcu.c's
>> call_rcu_thread(). It's from commit 5a22ab71623:
>>
>> rcu: reduce more than 7MB heap memory by malloc_trim()
>>
>> Since there are some issues in memory alloc/free machenism
>> in glibc for little chunk memory, if Qemu frequently
>> alloc/free little chunk memory, the glibc doesn't alloc
>> little chunk memory from free list of glibc and still
>> allocate from OS, which make the heap size bigger and bigger.
>>
>> This patch introduce malloc_trim(), which will free heap
>> memory when there is no rcu call during rcu thread loop.
>> malloc_trim() can be enabled/disabled by --enable-malloc-trim/
>> --disable-malloc-trim in the Qemu configure command. The
>> default malloc_trim() is enabled for libc.
>>
>> Below are test results from smaps file.
>> (1)without patch
>> 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
>> Size: 21796 kB
>> Rss: 14260 kB
>> Pss: 14260 kB
>>
>> (2)with patch
>> 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
>> Size: 21668 kB
>> Rss: 6940 kB
>> Pss: 6940 kB
>>
>> Signed-off-by: Yang Zhong <yang.zhong@intel.com>
>> Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> How would the malloc_trim() you propose interact with this one?
>
> The above usage is automatic, while this proposal requires that
> an external mgmt app monitor QEMU and tell it to free memory.
> I'm wondering if the latter is really desirable, or whether QEMU
> can call this itself when reasonable ?
Yes, I have also considered implementing an automatic memory release
function within qemu. This approach would require qemu to periodically
monitor the IO load of all backend storage, and if the IO load is very
low over a period of time, it would proactively release memory.
This patch is a preliminary implementation, and I also want to discuss
with you which implementation approach is more reasonable.
So, which approach do you prefer?
>
> With regards,
> Daniel
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
[not found] ` <1020253492.3796.1721956050910.JavaMail.root@jt-retransmission-dep-7c968f646d-qxbl2>
@ 2024-07-27 5:25 ` Guoyi Tu
0 siblings, 0 replies; 9+ messages in thread
From: Guoyi Tu @ 2024-07-27 5:25 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Daniel P. Berrangé
Cc: tugy, Markus Armbruster, Eric Blake, qemu-devel, dengpc12,
zhangl161, Paolo Bonzini, Yang Zhong
On 2024/7/25 20:50, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
>>> Guoyi Tu <tugy@chinatelecom.cn> writes:
>>>
>>>> In the test environment, we conducted IO stress tests on all storage disks
>>>> within a virtual machine that had five storage devices mounted. During testing,
>>>> we found that the qemu process allocated a large amount of memory (~800MB)
>>>> to handle these IO operations.
>>>>
>>>> When the test ended, although qemu called free() to release the allocated
>>>> memory, the memory was not actually returned to the operating system, as
>>>> observed via the top command.
>>>>
>>>> Upon researching the glibc memory management mechanism, we found that when
>>>> small chunks of memory are allocated in user space and then released with
>>>> free(), the glibc memory management mechanism does not necessarily return
>>>> this memory to the operating system. Instead, it retains the memory until
>>>> certain conditions are met for release.
>>>
>>> Yes.
>>
>> Looking at mallopt(3) man page, the M_TRIM_THRESHOLD is said to control
>> when glibc releases the top of the heap back to the OS. It is said to
>> default to 128 kb.
>>
>> I'm curious how we get from that default, to 800 MB of unused memory ?
>> Is it related to the number of distinct malloc arenas that are in use ?
>
> I wonder which IO mechanism was being used - the 'iothreads' used to sometimes
> blow up and start 100s of threads; is that the case here?
No, qemu is not configured with iothread. If what you said is correct,
then in the case where iothread is used, the qemu process might occupy
more memory when storage IO is idle.
>> I'm curious what malloc_stats() would report before & after malloc_trim
>> when QEMU is in this situation with lots of wasted memory.
>
> Yes; maybe also trying valgrind's massif:
> https://valgrind.org/docs/manual/ms-manual.html
>
> (if it works on Qemu!)
> might help say where it's going?
Thank you for your suggestion. I will use this tool to analyze which
part is allocating more memory.
--
Guoyi
> Dave
>
>>>
>>>> For virtual machines that only have business operations during specific
>>>> periods, they remain idle most of the time. However, the qemu process
>>>> still occupies a large amount of memory resources, leading to significant
>>>> memory resource waste.
>>>
>>> Mitigation: the memory free()'s but not returned to the OS can be paged
>>> out.
>>>
>>>> To address this issue, this patch introduces an API to actively reclaim
>>>> idle memory within the qemu process. This API effectively calls
>>>> malloc_trim()
>>>> to notify glibc to trim free memory. With this api, the management tool
>>>> can monitor the virtual machine's state and call this API during idle times
>>>> to free up the memory occupied by the virtual machine, thereby allowing more
>>>> virtual machines to be provisioned.
>>>
>>> How does this affect the test case you described above?
>>>
>>> There's an existing use of malloc_trim() in util/rcu.c's
>>> call_rcu_thread(). It's from commit 5a22ab71623:
>>>
>>> rcu: reduce more than 7MB heap memory by malloc_trim()
>>>
>>> Since there are some issues in memory alloc/free machenism
>>> in glibc for little chunk memory, if Qemu frequently
>>> alloc/free little chunk memory, the glibc doesn't alloc
>>> little chunk memory from free list of glibc and still
>>> allocate from OS, which make the heap size bigger and bigger.
>>>
>>> This patch introduce malloc_trim(), which will free heap
>>> memory when there is no rcu call during rcu thread loop.
>>> malloc_trim() can be enabled/disabled by --enable-malloc-trim/
>>> --disable-malloc-trim in the Qemu configure command. The
>>> default malloc_trim() is enabled for libc.
>>>
>>> Below are test results from smaps file.
>>> (1)without patch
>>> 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
>>> Size: 21796 kB
>>> Rss: 14260 kB
>>> Pss: 14260 kB
>>>
>>> (2)with patch
>>> 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
>>> Size: 21668 kB
>>> Rss: 6940 kB
>>> Pss: 6940 kB
>>>
>>> Signed-off-by: Yang Zhong <yang.zhong@intel.com>
>>> Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> How would the malloc_trim() you propose interact with this one?
>>
>> The above usage is automatic, while this proposal requires that
>> an external mgmt app monitor QEMU and tell it to free memory.
>> I'm wondering if the latter is really desirable, or whether QEMU
>> can call this itself when reasonable ?
>>
>>
>> With regards,
>> Daniel
>> --
>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org -o- https://fstop138.berrange.com :|
>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>>
* Re: [PATCH] misc: introduce strim-memory qapi to support free memory trimming
2024-07-27 5:18 ` Guoyi Tu
@ 2024-08-01 13:12 ` Daniel P. Berrangé
0 siblings, 0 replies; 9+ messages in thread
From: Daniel P. Berrangé @ 2024-08-01 13:12 UTC (permalink / raw)
To: Guoyi Tu
Cc: Markus Armbruster, Dr. David Alan Gilbert, Eric Blake, qemu-devel,
dengpc12, zhangl161, Paolo Bonzini, Yang Zhong
On Sat, Jul 27, 2024 at 01:18:32PM +0800, Guoyi Tu wrote:
> On 2024/7/25 19:57, Daniel P. Berrangé wrote:
> > On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
> > > Guoyi Tu <tugy@chinatelecom.cn> writes:
> > >
> > > > In the test environment, we conducted IO stress tests on all storage disks
> > > > within a virtual machine that had five storage devices mounted. During testing,
> > > > we found that the qemu process allocated a large amount of memory (~800MB)
> > > > to handle these IO operations.
> > > >
> > > > When the test ended, although qemu called free() to release the allocated
> > > > memory, the memory was not actually returned to the operating system, as
> > > > observed via the top command.
> > > >
> > > > Upon researching the glibc memory management mechanism, we found that when
> > > > small chunks of memory are allocated in user space and then released with
> > > > free(), the glibc memory management mechanism does not necessarily return
> > > > this memory to the operating system. Instead, it retains the memory until
> > > > certain conditions are met for release.
> > >
> > > Yes.
> >
> > Looking at mallopt(3) man page, the M_TRIM_THRESHOLD is said to control
> > when glibc releases the top of the heap back to the OS. It is said to
> > default to 128 kb.
> Yes, the M_TRIM_THRESHOLD option can control glibc to release the free
> memory at the top of the heap, but glibc will not release the free
> memory in the middle of the heap.
>
> > I'm curious how we get from that default, to 800 MB of unused memory ?
> > Is it related to the number of distinct malloc arenas that are in use ?
>
> At least 600MB of memory is free, and this memory might be in the middle of
> the heap and cannot be automatically released.
>
> > I'm curious what malloc_stats() would report before & after malloc_trim
> > when QEMU is in this situation with lots of wasted memory.
> Here is the test case:
snip
That looks like an artificial reproducer, rather than the real-world
QEMU scenario.
What's the actual I/O stress test scenario you use to reproduce the
problem in QEMU, and how have you configured QEMU (ie what CLI args) ?
I'm fairly inclined to suggest that having such a huge amount of
freed-but-unreturned memory is a glibc bug, but escalating this to glibc
requires us to provide them with better real-world examples of the problem.
> > The above usage is automatic, while this proposal requires that
> > an external mgmt app monitor QEMU and tell it to free memory.
> > I'm wondering if the latter is really desirable, or whether QEMU
> > can call this itself when reasonable ?
>
> Yes, I have also considered implementing an automatic memory release
> function within qemu. This approach would require qemu to periodically
> monitor the IO load of all backend storage, and if the IO load is very
> low over a period of time, it would proactively release memory.
I would note that in systemd they have logic which is monitoring
either /proc/pressure/memory or $CGROUP/memory.pressure, and in
response to events on that, it will call malloc_trim
https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
https://docs.kernel.org/accounting/psi.html
Something like that might be better, as it lets us hide the specific
design & impl choices inside QEMU, letting us change/evolve them at
will without impacting public API designs.
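A minimal sketch of that approach (PSI trigger format per the kernel PSI
documentation; the 150 ms / 1 s numbers are illustrative, not tuned
values, and /proc/pressure/memory needs a kernel with CONFIG_PSI and
write permission on the file; psi_avg10 is a hypothetical helper for
the polling alternative, not part of any existing API):

```c
#include <fcntl.h>
#include <malloc.h>   /* malloc_trim: glibc-specific */
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Helper for the polling alternative: pull the avg10 percentage out of
 * a PSI line such as "some avg10=0.00 avg60=0.00 avg300=0.00 total=0".
 * Returns -1.0 if the field is missing. */
double psi_avg10(const char *line)
{
    const char *p = strstr(line, "avg10=");
    double v;

    if (!p || sscanf(p, "avg10=%lf", &v) != 1) {
        return -1.0;
    }
    return v;
}

/* Event-driven variant: register a PSI trigger and trim whenever it
 * fires.  "some 150000 1000000" requests a notification when tasks are
 * stalled on memory for more than 150 ms within any 1 s window. */
void trim_on_memory_pressure(void)
{
    struct pollfd pfd;

    pfd.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
    if (pfd.fd < 0) {
        return;   /* kernel without CONFIG_PSI, or no permission */
    }
    dprintf(pfd.fd, "some 150000 1000000");
    pfd.events = POLLPRI;

    while (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLPRI)) {
        malloc_trim(0);   /* hand freed pages back to the OS */
    }
    close(pfd.fd);
}
```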
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Thread overview: 9+ messages
2024-06-28 10:22 [PATCH] misc: introduce strim-memory qapi to support free memory trimming Guoyi Tu
2024-07-07 3:48 ` Guoyi Tu
2024-07-25 11:35 ` Markus Armbruster
2024-07-25 11:57 ` Daniel P. Berrangé
2024-07-25 12:50 ` Dr. David Alan Gilbert
2024-07-27 5:18 ` Guoyi Tu
2024-08-01 13:12 ` Daniel P. Berrangé
[not found] ` <1020253492.3796.1721956050910.JavaMail.root@jt-retransmission-dep-7c968f646d-qxbl2>
2024-07-27 5:25 ` Guoyi Tu
2024-07-27 4:09 ` Guoyi Tu