* [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
@ 2025-06-06 7:03 Longlong Xia
2025-06-06 10:08 ` David Hildenbrand
0 siblings, 1 reply; 6+ messages in thread
From: Longlong Xia @ 2025-06-06 7:03 UTC (permalink / raw)
To: akpm, david, xu.xin16
Cc: chengming.zhou, linux-mm, linux-kernel, Longlong Xia
The general_profit_show() function only considers ksm_pages_sharing,
while ksm_process_profit() includes both ksm_pages_sharing and
ksm_pages_shared for each process. This leads to a mismatch between
the total profits from ksm_process_profit() and general_profit_show().
Based on my tests, the sum of ksm_process_profit() for all processes
can be up to 20% higher than general_profit_show(), depending on
the size of page_shared. For individual processes, the ratio of
ksm_pages_sharing to ksm_merging_pages is usually not equal to 1.
To resolve this, we suggest introducing ksm_pages_sharing for each
process to accurately calculate its pages_sharing, ensuring
ksm_process_profit() reflects shared memory benefits more accurately.
Add a new proc file named as ksm_pages_sharing both under /proc/<pid>/
and /proc/self/ksm_stat/ to indicate the saved pages of this process.
(not including ksm_zero_pages)
Suggested-by: Xu Xin <xu.xin16@zte.com.cn>
Signed-off-by: Longlong Xia <xialonglong@kylinos.cn>
---
Documentation/admin-guide/mm/ksm.rst | 5 +++--
Documentation/filesystems/proc.rst | 8 ++++++++
fs/proc/base.c | 18 ++++++++++++++++++
include/linux/mm_types.h | 5 +++++
mm/ksm.c | 12 ++++++++----
5 files changed, 42 insertions(+), 6 deletions(-)
diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index ad8e7a41f3b5..0b33ef98930f 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -256,9 +256,10 @@ several times, which are unprofitable memory consumed.
process_profit =~ ksm_saved_pages * sizeof(page) -
ksm_rmap_items * sizeof(rmap_item).
- where ksm_saved_pages equals to the sum of ``ksm_merging_pages`` and
+ where ksm_saved_pages equals to the sum of ``ksm_pages_sharing`` and
``ksm_zero_pages``, both of which are shown under the directory
- ``/proc/<pid>/ksm_stat``, and ksm_rmap_items is also shown in
+ ``/proc/<pid>/ksm_stat``, ksm_merging_pages and ksm_rmap_items are
+ also shown in
``/proc/<pid>/ksm_stat``. The process profit is also shown in
``/proc/<pid>/ksm_stat`` as ksm_process_profit.
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 2a17865dfe39..e14ea8389500 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -2290,6 +2290,7 @@ Example
/ # cat /proc/self/ksm_stat
ksm_rmap_items 0
ksm_zero_pages 0
+ ksm_pages_sharing 0
ksm_merging_pages 0
ksm_process_profit 0
ksm_merge_any: no
@@ -2312,6 +2313,13 @@ ksm_zero_pages
When /sys/kernel/mm/ksm/use_zero_pages is enabled, it represent how many
empty pages are merged with kernel zero pages by KSM.
+ksm_pages_sharing
+^^^^^^^^^^^^^^^^^
+
+It represents how many pages saved of this process.
+(not including ksm_zero_pages). It is the same with what
+/proc/<pid>/ksm_pages_sharing shows.
+
ksm_merging_pages
^^^^^^^^^^^^^^^^^
diff --git a/fs/proc/base.c b/fs/proc/base.c
index c667702dc69b..327bf82acf54 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3262,6 +3262,21 @@ static int proc_pid_ksm_merging_pages(struct seq_file *m, struct pid_namespace *
return 0;
}
+
+static int proc_pid_ksm_pages_sharing(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
+{
+ struct mm_struct *mm;
+
+ mm = get_task_mm(task);
+ if (mm) {
+ seq_printf(m, "%lu\n", mm->ksm_pages_sharing);
+ mmput(mm);
+ }
+
+ return 0;
+}
+
static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
@@ -3272,6 +3287,7 @@ static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
if (mm) {
seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
seq_printf(m, "ksm_zero_pages %ld\n", mm_ksm_zero_pages(mm));
+ seq_printf(m, "ksm_pages_sharing %lu\n", mm->ksm_pages_sharing);
seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages);
seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm));
seq_printf(m, "ksm_merge_any: %s\n",
@@ -3421,6 +3437,7 @@ static const struct pid_entry tgid_base_stuff[] = {
ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
#endif
#ifdef CONFIG_KSM
+ ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
#endif
@@ -3758,6 +3775,7 @@ static const struct pid_entry tid_base_stuff[] = {
ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
#endif
#ifdef CONFIG_KSM
+ ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
#endif
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d6b91e8a66d6..d260cb09c10a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1176,6 +1176,11 @@ struct mm_struct {
* merging (not including ksm_zero_pages).
*/
unsigned long ksm_merging_pages;
+ /*
+ * Represents how many pages saved of this process.
+ * (not including ksm_zero_pages).
+ */
+ unsigned long ksm_pages_sharing;
/*
* Represent how many pages are checked for ksm merging
* including merged and not merged.
diff --git a/mm/ksm.c b/mm/ksm.c
index 8583fb91ef13..c2d85ea07b1c 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -824,6 +824,7 @@ static void remove_node_from_stable_tree(struct ksm_stable_node *stable_node)
hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
if (rmap_item->hlist.next) {
ksm_pages_sharing--;
+ rmap_item->mm->ksm_pages_sharing--;
trace_ksm_remove_rmap_item(stable_node->kpfn, rmap_item, rmap_item->mm);
} else {
ksm_pages_shared--;
@@ -976,8 +977,10 @@ static void remove_rmap_item_from_tree(struct ksm_rmap_item *rmap_item)
folio_unlock(folio);
folio_put(folio);
- if (!hlist_empty(&stable_node->hlist))
+ if (!hlist_empty(&stable_node->hlist)) {
ksm_pages_sharing--;
+ rmap_item->mm->ksm_pages_sharing--;
+ }
else
ksm_pages_shared--;
@@ -2202,9 +2205,10 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
rmap_item->address |= STABLE_FLAG;
hlist_add_head(&rmap_item->hlist, &stable_node->hlist);
- if (rmap_item->hlist.next)
+ if (rmap_item->hlist.next) {
ksm_pages_sharing++;
- else
+ rmap_item->mm->ksm_pages_sharing++;
+ } else
ksm_pages_shared++;
rmap_item->mm->ksm_merging_pages++;
@@ -3290,7 +3294,7 @@ bool ksm_process_mergeable(struct mm_struct *mm)
long ksm_process_profit(struct mm_struct *mm)
{
- return (long)(mm->ksm_merging_pages + mm_ksm_zero_pages(mm)) * PAGE_SIZE -
+ return (long)(mm->ksm_pages_sharing + mm_ksm_zero_pages(mm)) * PAGE_SIZE -
mm->ksm_rmap_items * sizeof(struct ksm_rmap_item);
}
#endif /* CONFIG_PROC_FS */
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
2025-06-06 7:03 [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately Longlong Xia
@ 2025-06-06 10:08 ` David Hildenbrand
2025-06-18 7:13 ` Longlong Xia
0 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2025-06-06 10:08 UTC (permalink / raw)
To: Longlong Xia, akpm, xu.xin16
Cc: chengming.zhou, linux-mm, linux-kernel, Stefan Roesch
On 06.06.25 09:03, Longlong Xia wrote:
CCing Stefan.
> The general_profit_show() function only considers ksm_pages_sharing,
> while ksm_process_profit() includes both ksm_pages_sharing and
> ksm_pages_shared for each process. This leads to a mismatch between
> the total profits from ksm_process_profit() and general_profit_show().
>
> Based on my tests, the sum of ksm_process_profit() for all processes
> can be up to 20% higher than general_profit_show(), depending on
> the size of page_shared. For individual processes, the ratio of
> ksm_pages_sharing to ksm_merging_pages is usually not equal to 1.
>
> To resolve this, we suggest introducing ksm_pages_sharing for each
> process to accurately calculate its pages_sharing, ensuring
> ksm_process_profit() reflects shared memory benefits more accurately.
>
> Add a new proc file named as ksm_pages_sharing both under /proc/<pid>/
It's an entry in the file, not a new file.
> and /proc/self/ksm_stat/ to indicate the saved pages of this process.
> (not including ksm_zero_pages)
Curious, why is updating ksm_process_profit() insufficient and we also
have to expose ksm_pages_sharing?
>
> Suggested-by: Xu Xin <xu.xin16@zte.com.cn>
> Signed-off-by: Longlong Xia <xialonglong@kylinos.cn>
> ---
> Documentation/admin-guide/mm/ksm.rst | 5 +++--
> Documentation/filesystems/proc.rst | 8 ++++++++
> fs/proc/base.c | 18 ++++++++++++++++++
> include/linux/mm_types.h | 5 +++++
> mm/ksm.c | 12 ++++++++----
> 5 files changed, 42 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
> index ad8e7a41f3b5..0b33ef98930f 100644
> --- a/Documentation/admin-guide/mm/ksm.rst
> +++ b/Documentation/admin-guide/mm/ksm.rst
> @@ -256,9 +256,10 @@ several times, which are unprofitable memory consumed.
> process_profit =~ ksm_saved_pages * sizeof(page) -
> ksm_rmap_items * sizeof(rmap_item).
>
> - where ksm_saved_pages equals to the sum of ``ksm_merging_pages`` and
> + where ksm_saved_pages equals to the sum of ``ksm_pages_sharing`` and
> ``ksm_zero_pages``, both of which are shown under the directory
> - ``/proc/<pid>/ksm_stat``, and ksm_rmap_items is also shown in
> + ``/proc/<pid>/ksm_stat``, ksm_merging_pages and ksm_rmap_items are
> + also shown in
> ``/proc/<pid>/ksm_stat``. The process profit is also shown in
> ``/proc/<pid>/ksm_stat`` as ksm_process_profit.
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 2a17865dfe39..e14ea8389500 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -2290,6 +2290,7 @@ Example
> / # cat /proc/self/ksm_stat
> ksm_rmap_items 0
> ksm_zero_pages 0
> + ksm_pages_sharing 0
> ksm_merging_pages 0
> ksm_process_profit 0
> ksm_merge_any: no
> @@ -2312,6 +2313,13 @@ ksm_zero_pages
> When /sys/kernel/mm/ksm/use_zero_pages is enabled, it represent how many
> empty pages are merged with kernel zero pages by KSM.
>
> +ksm_pages_sharing
> +^^^^^^^^^^^^^^^^^
> +
> +It represents how many pages saved of this process.
> +(not including ksm_zero_pages). It is the same with what
> +/proc/<pid>/ksm_pages_sharing shows.
> +> ksm_merging_pages
> ^^^^^^^^^^^^^^^^^
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index c667702dc69b..327bf82acf54 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -3262,6 +3262,21 @@ static int proc_pid_ksm_merging_pages(struct seq_file *m, struct pid_namespace *
>
> return 0;
> }
> +
> +static int proc_pid_ksm_pages_sharing(struct seq_file *m, struct pid_namespace *ns,
> + struct pid *pid, struct task_struct *task)
> +{
> + struct mm_struct *mm;
> +
> + mm = get_task_mm(task);
> + if (mm) {
> + seq_printf(m, "%lu\n", mm->ksm_pages_sharing);
> + mmput(mm);
> + }
> +
> + return 0;
> +}
> +
> static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> @@ -3272,6 +3287,7 @@ static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
> if (mm) {
> seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
> seq_printf(m, "ksm_zero_pages %ld\n", mm_ksm_zero_pages(mm));
> + seq_printf(m, "ksm_pages_sharing %lu\n", mm->ksm_pages_sharing);
> seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages);
> seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm));
> seq_printf(m, "ksm_merge_any: %s\n",
> @@ -3421,6 +3437,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
> #endif
> #ifdef CONFIG_KSM
> + ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
> ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
> ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
> #endif
> @@ -3758,6 +3775,7 @@ static const struct pid_entry tid_base_stuff[] = {
> ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
> #endif
> #ifdef CONFIG_KSM
> + ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
> ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
> ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
> #endif
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index d6b91e8a66d6..d260cb09c10a 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1176,6 +1176,11 @@ struct mm_struct {
> * merging (not including ksm_zero_pages).
> */
> unsigned long ksm_merging_pages;
> + /*
> + * Represents how many pages saved of this process.
> + * (not including ksm_zero_pages).
> + */
> + unsigned long ksm_pages_sharing;
> /*
> * Represent how many pages are checked for ksm merging
> * including merged and not merged.
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 8583fb91ef13..c2d85ea07b1c 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -824,6 +824,7 @@ static void remove_node_from_stable_tree(struct ksm_stable_node *stable_node)
> hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
> if (rmap_item->hlist.next) {
> ksm_pages_sharing--;
> + rmap_item->mm->ksm_pages_sharing--;
> trace_ksm_remove_rmap_item(stable_node->kpfn, rmap_item, rmap_item->mm);
> } else {
> ksm_pages_shared--;
> @@ -976,8 +977,10 @@ static void remove_rmap_item_from_tree(struct ksm_rmap_item *rmap_item)
> folio_unlock(folio);
> folio_put(folio);
>
> - if (!hlist_empty(&stable_node->hlist))
> + if (!hlist_empty(&stable_node->hlist)) {
> ksm_pages_sharing--;
> + rmap_item->mm->ksm_pages_sharing--;
> + }
Hm, I am wondering if that works. Stable nodes are not per MM, so can't
we create an accounting imbalance for one MM somehow?
(did not look into all the details, just something that came to mind)
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
2025-06-06 10:08 ` David Hildenbrand
@ 2025-06-18 7:13 ` Longlong Xia
2025-06-18 9:14 ` xu.xin16
0 siblings, 1 reply; 6+ messages in thread
From: Longlong Xia @ 2025-06-18 7:13 UTC (permalink / raw)
To: David Hildenbrand, akpm, xu.xin16
Cc: chengming.zhou, linux-mm, linux-kernel, Stefan Roesch
Sorry for the late reply. I was thinking about how to respond to your
question.
在 2025/6/6 18:08, David Hildenbrand 写道:
> On 06.06.25 09:03, Longlong Xia wrote:
>
> CCing Stefan.
>
>> The general_profit_show() function only considers ksm_pages_sharing,
>> while ksm_process_profit() includes both ksm_pages_sharing and
>> ksm_pages_shared for each process. This leads to a mismatch between
>> the total profits from ksm_process_profit() and general_profit_show().
>>
>> Based on my tests, the sum of ksm_process_profit() for all processes
>> can be up to 20% higher than general_profit_show(), depending on
>> the size of page_shared. For individual processes, the ratio of
>> ksm_pages_sharing to ksm_merging_pages is usually not equal to 1.
>>
>> To resolve this, we suggest introducing ksm_pages_sharing for each
>> process to accurately calculate its pages_sharing, ensuring
>> ksm_process_profit() reflects shared memory benefits more accurately.
>>
>> Add a new proc file named as ksm_pages_sharing both under /proc/<pid>/
>
> It's an entry in the file, not a new file.
Thank you for pointing that out. I will fix it。
>
>> and /proc/self/ksm_stat/ to indicate the saved pages of this process.
>> (not including ksm_zero_pages)
>
> Curious, why is updating ksm_process_profit() insufficient and we also
> have to expose ksm_pages_sharing?
>
Since ksm_process_profit() uses ksm_merging_pages(pages_sharing +
pages_shared) to calculate the profit for individual processes,
while general_profit uses pages_sharing for profit calculation, this can
lead to the total profit calculated for each process being greater than
that of general_profit.
Additionally, exposing ksm_pages_sharing under /proc/self/ksm_stat/ may
be sufficient.
>>
>> Suggested-by: Xu Xin <xu.xin16@zte.com.cn>
>> Signed-off-by: Longlong Xia <xialonglong@kylinos.cn>
>> ---
>> Documentation/admin-guide/mm/ksm.rst | 5 +++--
>> Documentation/filesystems/proc.rst | 8 ++++++++
>> fs/proc/base.c | 18 ++++++++++++++++++
>> include/linux/mm_types.h | 5 +++++
>> mm/ksm.c | 12 ++++++++----
>> 5 files changed, 42 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/mm/ksm.rst
>> b/Documentation/admin-guide/mm/ksm.rst
>> index ad8e7a41f3b5..0b33ef98930f 100644
>> --- a/Documentation/admin-guide/mm/ksm.rst
>> +++ b/Documentation/admin-guide/mm/ksm.rst
>> @@ -256,9 +256,10 @@ several times, which are unprofitable memory
>> consumed.
>> process_profit =~ ksm_saved_pages * sizeof(page) -
>> ksm_rmap_items * sizeof(rmap_item).
>> - where ksm_saved_pages equals to the sum of
>> ``ksm_merging_pages`` and
>> + where ksm_saved_pages equals to the sum of ``ksm_pages_sharing`` and
>> ``ksm_zero_pages``, both of which are shown under the directory
>> - ``/proc/<pid>/ksm_stat``, and ksm_rmap_items is also shown in
>> + ``/proc/<pid>/ksm_stat``, ksm_merging_pages and ksm_rmap_items are
>> + also shown in
>> ``/proc/<pid>/ksm_stat``. The process profit is also shown in
>> ``/proc/<pid>/ksm_stat`` as ksm_process_profit.
>> diff --git a/Documentation/filesystems/proc.rst
>> b/Documentation/filesystems/proc.rst
>> index 2a17865dfe39..e14ea8389500 100644
>> --- a/Documentation/filesystems/proc.rst
>> +++ b/Documentation/filesystems/proc.rst
>> @@ -2290,6 +2290,7 @@ Example
>> / # cat /proc/self/ksm_stat
>> ksm_rmap_items 0
>> ksm_zero_pages 0
>> + ksm_pages_sharing 0
>> ksm_merging_pages 0
>> ksm_process_profit 0
>> ksm_merge_any: no
>> @@ -2312,6 +2313,13 @@ ksm_zero_pages
>> When /sys/kernel/mm/ksm/use_zero_pages is enabled, it represent how
>> many
>> empty pages are merged with kernel zero pages by KSM.
>> +ksm_pages_sharing
>> +^^^^^^^^^^^^^^^^^
>> +
>> +It represents how many pages saved of this process.
>> +(not including ksm_zero_pages). It is the same with what
>> +/proc/<pid>/ksm_pages_sharing shows.
> > +> ksm_merging_pages
>> ^^^^^^^^^^^^^^^^^
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index c667702dc69b..327bf82acf54 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -3262,6 +3262,21 @@ static int proc_pid_ksm_merging_pages(struct
>> seq_file *m, struct pid_namespace *
>> return 0;
>> }
>> +
>> +static int proc_pid_ksm_pages_sharing(struct seq_file *m, struct
>> pid_namespace *ns,
>> + struct pid *pid, struct task_struct *task)
>> +{
>> + struct mm_struct *mm;
>> +
>> + mm = get_task_mm(task);
>> + if (mm) {
>> + seq_printf(m, "%lu\n", mm->ksm_pages_sharing);
>> + mmput(mm);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static int proc_pid_ksm_stat(struct seq_file *m, struct
>> pid_namespace *ns,
>> struct pid *pid, struct task_struct *task)
>> {
>> @@ -3272,6 +3287,7 @@ static int proc_pid_ksm_stat(struct seq_file
>> *m, struct pid_namespace *ns,
>> if (mm) {
>> seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
>> seq_printf(m, "ksm_zero_pages %ld\n", mm_ksm_zero_pages(mm));
>> + seq_printf(m, "ksm_pages_sharing %lu\n",
>> mm->ksm_pages_sharing);
>> seq_printf(m, "ksm_merging_pages %lu\n",
>> mm->ksm_merging_pages);
>> seq_printf(m, "ksm_process_profit %ld\n",
>> ksm_process_profit(mm));
>> seq_printf(m, "ksm_merge_any: %s\n",
>> @@ -3421,6 +3437,7 @@ static const struct pid_entry tgid_base_stuff[]
>> = {
>> ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
>> #endif
>> #ifdef CONFIG_KSM
>> + ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
>> ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
>> ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
>> #endif
>> @@ -3758,6 +3775,7 @@ static const struct pid_entry tid_base_stuff[] = {
>> ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache),
>> #endif
>> #ifdef CONFIG_KSM
>> + ONE("ksm_pages_sharing", S_IRUSR, proc_pid_ksm_pages_sharing),
>> ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages),
>> ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat),
>> #endif
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index d6b91e8a66d6..d260cb09c10a 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -1176,6 +1176,11 @@ struct mm_struct {
>> * merging (not including ksm_zero_pages).
>> */
>> unsigned long ksm_merging_pages;
>> + /*
>> + * Represents how many pages saved of this process.
>> + * (not including ksm_zero_pages).
>> + */
>> + unsigned long ksm_pages_sharing;
>> /*
>> * Represent how many pages are checked for ksm merging
>> * including merged and not merged.
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index 8583fb91ef13..c2d85ea07b1c 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -824,6 +824,7 @@ static void remove_node_from_stable_tree(struct
>> ksm_stable_node *stable_node)
>> hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
>> if (rmap_item->hlist.next) {
>> ksm_pages_sharing--;
>> + rmap_item->mm->ksm_pages_sharing--;
>> trace_ksm_remove_rmap_item(stable_node->kpfn,
>> rmap_item, rmap_item->mm);
>> } else {
>> ksm_pages_shared--;
>> @@ -976,8 +977,10 @@ static void remove_rmap_item_from_tree(struct
>> ksm_rmap_item *rmap_item)
>> folio_unlock(folio);
>> folio_put(folio);
>> - if (!hlist_empty(&stable_node->hlist))
>> + if (!hlist_empty(&stable_node->hlist)) {
>> ksm_pages_sharing--;
>> + rmap_item->mm->ksm_pages_sharing--;
>> + }
>
>
> Hm, I am wondering if that works. Stable nodes are not per MM, so
> can't we create an accounting imbalance for one MM somehow?
>
> (did not look into all the details, just something that came to mind)
>
Indeed, using the method in this patch to calculate ksm_pages_sharing
for each process to determine ksm_pages_shared
can sometimes result in negative values for ksm_pages_shared.
example for calculate mm->ksm_pages_shared:
if (rmap_item->hlist.next) {
ksm_pages_sharing--;
rmap_item->mm->ksm_pages_sharing--;
} else {
ksm_pages_shared--;
rmap_item->mm->ksm_pages_shared--; // can be negative
}
rmap_item->mm->ksm_merging_pages--;
Would it be possible to compare the ratio of each process's rmap_item to
the total rmap_item and the ratio of the process's page_shared to the
total page_shared
to assess this imbalance? For now, I don't have any better ideas.
Thank you for your time.
Best regards,
Longlong Xia
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
2025-06-18 7:13 ` Longlong Xia
@ 2025-06-18 9:14 ` xu.xin16
2025-06-24 8:35 ` Longlong Xia
0 siblings, 1 reply; 6+ messages in thread
From: xu.xin16 @ 2025-06-18 9:14 UTC (permalink / raw)
To: xialonglong, david
Cc: david, akpm, chengming.zhou, linux-mm, linux-kernel, shr
> >
> >> and /proc/self/ksm_stat/ to indicate the saved pages of this process.
> >> (not including ksm_zero_pages)
> >
> > Curious, why is updating ksm_process_profit() insufficient and we also
> > have to expose ksm_pages_sharing?
> >
> Since ksm_process_profit() uses ksm_merging_pages(pages_sharing +
> pages_shared) to calculate the profit for individual processes,
>
> while general_profit uses pages_sharing for profit calculation, this can
> lead to the total profit calculated for each process being greater than
> that of general_profit.
>
> Additionally, exposing ksm_pages_sharing under /proc/self/ksm_stat/ may
> be sufficient.
>
Hi,
Althorugh it's true, however, this patch maybe not okay. It can only ensure
that the sum of each process's profit roughly equals the system's general_profit
, but gives totally wrong profit result for some one process. For example, when
two pages from two different processes are merged, one process's page_shared
increments by +1, while the other's pages_sharing increments by +1, which
resulting in different calculated profits for the two processes, even though
their actual profits are identical. If in more extreme cases, this could even
render a process's profit entirely unreadable.
Lastly, do we really need each process’s profit sum to perfectly match the general
profit, or we just want a rough estimate of the process’s profit from KSM ?
>
>
> > Hm, I am wondering if that works. Stable nodes are not per MM, so
> > can't we create an accounting imbalance for one MM somehow?
> >
> > (did not look into all the details, just something that came to mind)
> >
> Indeed, using the method in this patch to calculate ksm_pages_sharing
> for each process to determine ksm_pages_shared
>
> can sometimes result in negative values for ksm_pages_shared.
>
> example for calculate mm->ksm_pages_shared:
>
> if (rmap_item->hlist.next) {
> ksm_pages_sharing--;
> rmap_item->mm->ksm_pages_sharing--;
>
> } else {
> ksm_pages_shared--;
> rmap_item->mm->ksm_pages_shared--; // can be negative
> }
>
> rmap_item->mm->ksm_merging_pages--;
>
>
> Would it be possible to compare the ratio of each process's rmap_item to
> the total rmap_item and the ratio of the process's page_shared to the
> total page_shared
>
> to assess this imbalance? For now, I don't have any better ideas.
Although stable_node is not per-mm, if you really add ksm_shared to mm,
it won't cause negative ksm_pages_shared, because the count of ksm_shared
will only be attributed to the process of the first rmap_item.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
2025-06-18 9:14 ` xu.xin16
@ 2025-06-24 8:35 ` Longlong Xia
2025-06-24 9:47 ` 答复: " xu.xin16
0 siblings, 1 reply; 6+ messages in thread
From: Longlong Xia @ 2025-06-24 8:35 UTC (permalink / raw)
To: xu.xin16, david
Cc: akpm, chengming.zhou, linux-mm, linux-kernel, shr, corbet,
lorenzo.stoakes, Liam.Howlett, vbabka
在 2025/6/18 17:14, xu.xin16@zte.com.cn 写道:
>>>> and /proc/self/ksm_stat/ to indicate the saved pages of this process.
>>>> (not including ksm_zero_pages)
>>> Curious, why is updating ksm_process_profit() insufficient and we also
>>> have to expose ksm_pages_sharing?
>>>
>> Since ksm_process_profit() uses ksm_merging_pages(pages_sharing +
>> pages_shared) to calculate the profit for individual processes,
>>
>> while general_profit uses pages_sharing for profit calculation, this can
>> lead to the total profit calculated for each process being greater than
>> that of general_profit.
>>
>> Additionally, exposing ksm_pages_sharing under /proc/self/ksm_stat/ may
>> be sufficient.
>>
> Hi,
>
> Althorugh it's true, however, this patch maybe not okay. It can only ensure
> that the sum of each process's profit roughly equals the system's general_profit
> , but gives totally wrong profit result for some one process. For example, when
> two pages from two different processes are merged, one process's page_shared
> increments by +1, while the other's pages_sharing increments by +1, which
> resulting in different calculated profits for the two processes, even though
> their actual profits are identical. If in more extreme cases, this could even
> render a process's profit entirely unreadable.
>
> Lastly, do we really need each process’s profit sum to perfectly match the general
> profit, or we just want a rough estimate of the process’s profit from KSM ?
>
Hi,
In extreme cases, stable nodes may be distributed quite unevenly, which
is due to stable nodes not being per mm, of course.
There are also situations where there are 1000 pairs of pages, with the
pages within each pair being identical, while each pair is different
from all other pages.
This results in the number of page_sharing and page_shared being the
same. This way, using ksm_merging_pages(page_sharing + page_shared)
averages a 50% error.
In practical testing, we may only need to enable KSM for specific
applications and calculate the total benefits of these processes.
Since page_shared is also included in the statistics, this may lead to
the calculated benefits being higher than the actual ones.
In practical testing, the error may reach 20%. For example, in one test,
the total benefits of all processes were estimated to be around 528MB,
while the profit calculated through general_profit was only around 428MB.
The theoretical error may be around 50%.
If we expose the ksm_pages_sharing for each process, we can not only
calculate the actual benefits
but also determine how many ksm_pages_shared there are by the difference
between ksm_merging_pages and ksm_pages_sharing of each process.
>>
>>> Hm, I am wondering if that works. Stable nodes are not per MM, so
>>> can't we create an accounting imbalance for one MM somehow?
>>>
>>> (did not look into all the details, just something that came to mind)
>>>
>> Indeed, using the method in this patch to calculate ksm_pages_sharing
>> for each process to determine ksm_pages_shared
>>
>> can sometimes result in negative values for ksm_pages_shared.
>>
>> example for calculate mm->ksm_pages_shared:
>>
>> if (rmap_item->hlist.next) {
>> ksm_pages_sharing--;
>> rmap_item->mm->ksm_pages_sharing--;
>>
>> } else {
>> ksm_pages_shared--;
>> rmap_item->mm->ksm_pages_shared--; // can be negative
>> }
>>
>> rmap_item->mm->ksm_merging_pages--;
>>
>>
>> Would it be possible to compare the ratio of each process's rmap_item to
>> the total rmap_item and the ratio of the process's page_shared to the
>> total page_shared
>>
>> to assess this imbalance? For now, I don't have any better ideas.
> Although stable_node is not per-mm, if you really add ksm_shared to mm,
> it won't cause negative ksm_pages_shared, because the count of ksm_shared
> will only be attributed to the process of the first rmap_item.
Yes, it was the incorrect method I used during testing that led to the
negative values.
After the improvement, it has not occurred again.
Thank you for your time.
Best regards,
Longlong Xia
^ permalink raw reply [flat|nested] 6+ messages in thread
* 答复: [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately
2025-06-24 8:35 ` Longlong Xia
@ 2025-06-24 9:47 ` xu.xin16
0 siblings, 0 replies; 6+ messages in thread
From: xu.xin16 @ 2025-06-24 9:47 UTC (permalink / raw)
To: xialonglong
Cc: david, akpm, chengming.zhou, linux-mm, linux-kernel, shr, corbet,
lorenzo.stoakes, Liam.Howlett, vbabka
> >>>> and /proc/self/ksm_stat/ to indicate the saved pages of this process.
> >>>> (not including ksm_zero_pages)
> >>> Curious, why is updating ksm_process_profit() insufficient and we also
> >>> have to expose ksm_pages_sharing?
> >>>
> >> Since ksm_process_profit() uses ksm_merging_pages(pages_sharing +
> >> pages_shared) to calculate the profit for individual processes,
> >>
> >> while general_profit uses pages_sharing for profit calculation, this can
> >> lead to the total profit calculated for each process being greater than
> >> that of general_profit.
> >>
> >> Additionally, exposing ksm_pages_sharing under /proc/self/ksm_stat/ may
> >> be sufficient.
> >>
> > Hi,
> >
> > Althorugh it's true, however, this patch maybe not okay. It can only ensure
> > that the sum of each process's profit roughly equals the system's general_profit
> > , but gives totally wrong profit result for some one process. For example, when
> > two pages from two different processes are merged, one process's page_shared
> > increments by +1, while the other's pages_sharing increments by +1, which
> > resulting in different calculated profits for the two processes, even though
> > their actual profits are identical. If in more extreme cases, this could even
> > render a process's profit entirely unreadable.
> >
> > Lastly, do we really need each process’s profit sum to perfectly match the general
> > profit, or we just want a rough estimate of the process’s profit from KSM ?
> >
> Hi,
>
> In extreme cases, stable nodes may be distributed quite unevenly, which
> is due to stable nodes not being per mm, of course.
> There are also situations where there are 1000 pairs of pages, with the
> pages within each pair being identical, while each pair is different
> from all other pages.
> This results in the number of page_sharing and page_shared being the
> same. This way, using ksm_merging_pages(page_sharing + page_shared)
> averages a 50% error.
In your tests, I don't agree 50% error because your
assumption that process benefit equals pages_sharing is fundamentally flawed.
The issue lies in what is the most accurate definition of process KSM profit.
Since stable_node isn't per-mm, we cannot calculate a process's
benefit solely based on pages_sharing. The cost of stable_node should be split
fairly among every process sharing this stable_node, rather than being assigned
to a single individual.
It's inaccurate to claim that when two processes' pages merge into a
single KSM page, one process gains 4k - sizeof(rmap_item) while
the other gains 0 ? This is unfair to the second process, as it actively
participated in the KSM merge.
The most accurate and fair profit caculation should be:
profit = (ksm_merging_pages - united_stable_nodes)*PAGE_SIZE - sizeof(rmap_items)*ksm_rmap_items
where 'united_stable_nodes' is (stable_node)/shared_process. This is too complex.
For example: process A with one page is merged with process B with one page
process A process B
page page
\ /
\ /
\ /
\ /
Ksm Page
A: pages_sharing(1), pages_shared(0)
B: pages_sharing(0), pages_shared(1)
then
profit(A) = (pages_sharing + pages_shared - united_stable_nodes)*4K- sizeof(rmap_items)*ksm_rmap_items
= (1 + 0 - 1/2)*4k-sizeof(rmap_items)*ksm_rmap_items
= 0.5*4k - sizeof(rmap_items)*ksm_rmap_items
profit(A) = (pages_sharing + pages_shared - united_stable_nodes)*4K- sizeof(rmap_items)*ksm_rmap_items
= (0 + 1 - 1/2)*4k-sizeof(rmap_items)*ksm_rmap_items
= 0.5*4k - sizeof(rmap_items)*ksm_rmap_items
> In practical testing, we may only need to enable KSM for specific
> applications and calculate the total benefits of these processes.
> Since page_shared is also included in the statistics, this may lead to
> the calculated benefits being higher than the actual ones.
> In practical testing, the error may reach 20%. For example, in one test,
> the total benefits of all processes were estimated to be around 528MB,
> while the profit calculated through general_profit was only around 428MB.
> The theoretical error may be around 50%.
>
> If we expose the ksm_pages_sharing for each process, we can not only
> calculate the actual benefits
>
> but also determine how many ksm_pages_shared there are by the difference
> between ksm_merging_pages and ksm_pages_sharing of each process.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-06-24 9:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-06 7:03 [PATCH 1/1] mm/ksm: add ksm_pages_sharing for each process to calculate profit more accurately Longlong Xia
2025-06-06 10:08 ` David Hildenbrand
2025-06-18 7:13 ` Longlong Xia
2025-06-18 9:14 ` xu.xin16
2025-06-24 8:35 ` Longlong Xia
2025-06-24 9:47 ` 答复: " xu.xin16
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).