* [PATCH v2] mm: fix the inaccurate memory statistics issue for users
@ 2025-06-05 12:58 Baolin Wang
2025-06-05 13:34 ` Vlastimil Babka
2025-06-09 5:27 ` Ritesh Harjani
From: Baolin Wang @ 2025-06-05 12:58 UTC (permalink / raw)
To: akpm, david, shakeel.butt
Cc: lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
donettom, aboorvad, sj, baolin.wang, linux-mm, linux-fsdevel,
linux-kernel
On some large machines with a high number of CPUs running a 64K pagesize
kernel, we found that the 'RES' field displayed by the top command is
always 0 for some processes, which causes a lot of confusion for users.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
875525 root 20 0 12480 0 0 R 0.3 0.0 0:00.08 top
1 root 20 0 172800 0 0 S 0.0 0.0 0:04.52 systemd
The main reason is that the batch size of the percpu counter is quite large
on these machines, caching a significant percpu value, since commit
f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter") converted
mm's rss stats into percpu_counter. Intuitively, the batch number should be
optimized, but on some paths performance may take precedence over
statistical accuracy. Therefore, introduce a new interface to sum up the
percpu statistical counts and display the accurate values to users, which
removes the confusion. In addition, this change is not expected to be on a
performance-critical path, so the modification should be acceptable.
In addition, 'mm->rss_stat' is updated via add_mm_counter() and
dec/inc_mm_counter(), which are all wrappers around
percpu_counter_add_batch(). In percpu_counter_add_batch(), there is percpu
batch caching to avoid 'fbc->lock' contention. This patch changes task_mem()
and task_statm() to get the accurate mm counters under the 'fbc->lock', but
this should not exacerbate kernel 'mm->rss_stat' lock contention thanks to
the percpu batch caching of the mm counters. The following test also
confirms the theoretical analysis.
I ran stress-ng, stressing anon page faults in 32 threads on my 32-core
machine, while simultaneously running a script that starts 32 threads to
busy-loop pread() each stress-ng thread's /proc/pid/status interface. From
the following data, I did not observe any obvious impact of this patch on
the stress-ng tests.
w/o patch:
stress-ng: info: [6848] 4,399,219,085,152 CPU Cycles 67.327 B/sec
stress-ng: info: [6848] 1,616,524,844,832 Instructions 24.740 B/sec (0.367 instr. per cycle)
stress-ng: info: [6848] 39,529,792 Page Faults Total 0.605 M/sec
stress-ng: info: [6848] 39,529,792 Page Faults Minor 0.605 M/sec
w/patch:
stress-ng: info: [2485] 4,462,440,381,856 CPU Cycles 68.382 B/sec
stress-ng: info: [2485] 1,615,101,503,296 Instructions 24.750 B/sec (0.362 instr. per cycle)
stress-ng: info: [2485] 39,439,232 Page Faults Total 0.604 M/sec
stress-ng: info: [2485] 39,439,232 Page Faults Minor 0.604 M/sec
Tested-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Changes from v1:
- Update the commit message to add some measurements.
- Add acked tag from Michal. Thanks.
- Drop the Fixes tag.
Changes from RFC:
- Collect reviewed and tested tags. Thanks.
- Add Fixes tag.
---
fs/proc/task_mmu.c | 14 +++++++-------
include/linux/mm.h | 5 +++++
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index b9e4fbbdf6e6..f629e6526935 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -36,9 +36,9 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
unsigned long text, lib, swap, anon, file, shmem;
unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;
- anon = get_mm_counter(mm, MM_ANONPAGES);
- file = get_mm_counter(mm, MM_FILEPAGES);
- shmem = get_mm_counter(mm, MM_SHMEMPAGES);
+ anon = get_mm_counter_sum(mm, MM_ANONPAGES);
+ file = get_mm_counter_sum(mm, MM_FILEPAGES);
+ shmem = get_mm_counter_sum(mm, MM_SHMEMPAGES);
/*
* Note: to minimize their overhead, mm maintains hiwater_vm and
@@ -59,7 +59,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
text = min(text, mm->exec_vm << PAGE_SHIFT);
lib = (mm->exec_vm << PAGE_SHIFT) - text;
- swap = get_mm_counter(mm, MM_SWAPENTS);
+ swap = get_mm_counter_sum(mm, MM_SWAPENTS);
SEQ_PUT_DEC("VmPeak:\t", hiwater_vm);
SEQ_PUT_DEC(" kB\nVmSize:\t", total_vm);
SEQ_PUT_DEC(" kB\nVmLck:\t", mm->locked_vm);
@@ -92,12 +92,12 @@ unsigned long task_statm(struct mm_struct *mm,
unsigned long *shared, unsigned long *text,
unsigned long *data, unsigned long *resident)
{
- *shared = get_mm_counter(mm, MM_FILEPAGES) +
- get_mm_counter(mm, MM_SHMEMPAGES);
+ *shared = get_mm_counter_sum(mm, MM_FILEPAGES) +
+ get_mm_counter_sum(mm, MM_SHMEMPAGES);
*text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
>> PAGE_SHIFT;
*data = mm->data_vm + mm->stack_vm;
- *resident = *shared + get_mm_counter(mm, MM_ANONPAGES);
+ *resident = *shared + get_mm_counter_sum(mm, MM_ANONPAGES);
return mm->total_vm;
}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 185424858f23..15ec5cfe9515 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2568,6 +2568,11 @@ static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
return percpu_counter_read_positive(&mm->rss_stat[member]);
}
+static inline unsigned long get_mm_counter_sum(struct mm_struct *mm, int member)
+{
+ return percpu_counter_sum_positive(&mm->rss_stat[member]);
+}
+
void mm_trace_rss_stat(struct mm_struct *mm, int member);
static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
--
2.43.5
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Vlastimil Babka @ 2025-06-05 13:34 UTC (permalink / raw)
To: Baolin Wang, akpm, david, shakeel.butt
Cc: lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko, donettom,
aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On 6/5/25 14:58, Baolin Wang wrote:
> [...]
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Ritesh Harjani @ 2025-06-09 5:27 UTC (permalink / raw)
To: Baolin Wang, akpm, david, shakeel.butt
Cc: lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
donettom, aboorvad, sj, baolin.wang, linux-mm, linux-fsdevel,
linux-kernel
Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> [...]
> Changes from v1:
> - Update the commit message to add some measurements.
> - Add acked tag from Michal. Thanks.
> - Drop the Fixes tag.
Any reason why we dropped the Fixes tag? I see there was a series of
discussions on v1 and it was concluded that the fix was correct, so why
drop the Fixes tag?
Background: Recently a few folks internally reported this issue on Power
too, e.g.:
$ ps -o rss $$
RSS
0
So it would be nice if we had a Fixes tag so that it gets backported
to all stable releases. Does anybody see any concern with that?
-ritesh
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Michal Hocko @ 2025-06-09 7:35 UTC (permalink / raw)
To: Ritesh Harjani
Cc: Baolin Wang, akpm, david, shakeel.butt, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, donettom, aboorvad, sj,
linux-mm, linux-fsdevel, linux-kernel
On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>
> > [...]
> > Changes from v1:
> > - Update the commit message to add some measurements.
> > - Add acked tag from Michal. Thanks.
> > - Drop the Fixes tag.
>
> Any reason why we dropped the Fixes tag? I see there were a series of
> discussion on v1 and it got concluded that the fix was correct, then why
> drop the fixes tag?
This seems more like an improvement than a bug fix.
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Baolin Wang @ 2025-06-09 8:04 UTC (permalink / raw)
To: Michal Hocko, Ritesh Harjani
Cc: akpm, david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, donettom, aboorvad, sj, linux-mm, linux-fsdevel,
linux-kernel
On 2025/6/9 15:35, Michal Hocko wrote:
> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>
>>> [...]
>>> Changes from v1:
>>> - Update the commit message to add some measurements.
>>> - Add acked tag from Michal. Thanks.
>>> - Drop the Fixes tag.
>>
>> Any reason why we dropped the Fixes tag? I see there were a series of
>> discussion on v1 and it got concluded that the fix was correct, then why
>> drop the fixes tag?
>
> This seems more like an improvement than a bug fix.
Yes. I don't have a strong opinion on this, but we (Alibaba) will
backport it manually, because some user-space monitoring tools depend
on these statistics.
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Ritesh Harjani @ 2025-06-09 8:31 UTC (permalink / raw)
To: Baolin Wang, Michal Hocko
Cc: akpm, david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, donettom, aboorvad, sj, linux-mm, linux-fsdevel,
linux-kernel
Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> On 2025/6/9 15:35, Michal Hocko wrote:
>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
>>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>>
>>>> [...]
>>>> Changes from v1:
>>>> - Update the commit message to add some measurements.
>>>> - Add acked tag from Michal. Thanks.
>>>> - Drop the Fixes tag.
>>>
>>> Any reason why we dropped the Fixes tag? I see there were a series of
>>> discussion on v1 and it got concluded that the fix was correct, then why
>>> drop the fixes tag?
>>
>> This seems more like an improvement than a bug fix.
>
> Yes. I don't have a strong opinion on this, but we (Alibaba) will
> backport it manually,
>
> because some of user-space monitoring tools depend
> on these statistics.
That sounds like a regression then, isn't it?
-ritesh
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Vlastimil Babka @ 2025-06-09 8:52 UTC (permalink / raw)
To: Ritesh Harjani (IBM), Baolin Wang, Michal Hocko
Cc: akpm, david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, rppt,
surenb, donettom, aboorvad, sj, linux-mm, linux-fsdevel,
linux-kernel
On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>
>> On 2025/6/9 15:35, Michal Hocko wrote:
>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
>>>>
>>>> Any reason why we dropped the Fixes tag? I see there were a series of
>>>> discussion on v1 and it got concluded that the fix was correct, then why
>>>> drop the fixes tag?
>>>
>>> This seems more like an improvement than a bug fix.
>>
>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
>> backport it manually,
>>
>> because some of user-space monitoring tools depend
>> on these statistics.
>
> That sounds like a regression then, isn't it?
Hm, if the counters were accurate before f1a7941243c1 and not afterwards,
and this patch is making them accurate again, and some userspace depends
on it, then Fixes: and stable are probably warranted. If this were just a
perf improvement, then not. But AFAIU f1a7941243c1 was the perf
improvement...
> -ritesh
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Vlastimil Babka @ 2025-06-09 8:56 UTC (permalink / raw)
To: Ritesh Harjani (IBM), Baolin Wang, Michal Hocko
Cc: akpm, david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, rppt,
surenb, donettom, aboorvad, sj, linux-mm, linux-fsdevel,
linux-kernel
On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>
>>> On 2025/6/9 15:35, Michal Hocko wrote:
>>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
>>>>>
>>>>> Any reason why we dropped the Fixes tag? I see there were a series of
>>>>> discussion on v1 and it got concluded that the fix was correct, then why
>>>>> drop the fixes tag?
>>>>
>>>> This seems more like an improvement than a bug fix.
>>>
>>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
>>> backport it manually,
>>>
>>> because some of user-space monitoring tools depend
>>> on these statistics.
>>
>> That sounds like a regression then, isn't it?
>
> Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> this is making them accurate again, and some userspace depends on it,
> then Fixes: and stable is probably warranted then. If this was just a
> perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> improvement...
Dang, I should have re-read the commit log of f1a7941243c1 first. It seems
the error margin due to batching also existed before f1a7941243c1:
"This patch converts the rss_stats into percpu_counter to convert the
error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
So if on some systems this means a worse margin than before, the above
"if" chain of thought might still hold.
>
>> -ritesh
>
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
From: Andrew Morton @ 2025-06-10 0:17 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Ritesh Harjani (IBM), Baolin Wang, Michal Hocko, david,
shakeel.butt, lorenzo.stoakes, Liam.Howlett, rppt, surenb,
donettom, aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
> On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> > On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> >> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> >>
> >>> On 2025/6/9 15:35, Michal Hocko wrote:
> >>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> >>>>>
> >>>>> Any reason why we dropped the Fixes tag? I see there were a series of
> >>>>> discussion on v1 and it got concluded that the fix was correct, then why
> >>>>> drop the fixes tag?
> >>>>
> >>>> This seems more like an improvement than a bug fix.
> >>>
> >>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
> >>> backport it manually,
> >>>
> >>> because some of user-space monitoring tools depend
> >>> on these statistics.
> >>
> >> That sounds like a regression then, isn't it?
> >
> > Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> > this is making them accurate again, and some userspace depends on it,
> > then Fixes: and stable is probably warranted then. If this was just a
> > perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> > improvement...
>
> Dang, should have re-read the commit log of f1a7941243c1 first. It seems
> like the error margin due to batching existed also before f1a7941243c1.
>
> " This patch converts the rss_stats into percpu_counter to convert the
> error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
>
> so if on some systems this means worse margin than before, the above
> "if" chain of thought might still hold.
f1a7941243c1 seems like a good enough place to tell -stable
maintainers where to insert the patch (why does this sound rude).
The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
and, as the problem has been there for years, I'll leave the patch in
mm-unstable so it will eventually get into LTS, in a well tested state.
^ permalink raw reply [flat|nested] 14+ messages in thread
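[Editorial note: the error margins quoted above from the f1a7941243c1 commit
log can be sketched numerically. This is a rough back-of-the-envelope model,
not kernel code; it assumes the pre-f1a7941243c1 scheme cached up to 64 pages
per thread and that percpu_counter uses a per-CPU batch of max(32, 2 * nr_cpus)
pages, as that commit log describes.]

```python
# Worst-case uncounted RSS pages under the two caching schemes
# discussed above. All figures are upper bounds, not typical values.

def per_thread_error_pages(nr_threads, cache=64):
    """Old scheme: each thread caches up to `cache` pages."""
    return nr_threads * cache

def percpu_error_pages(nr_cpus):
    """percpu_counter: each CPU caches up to one batch of pages."""
    batch = max(32, 2 * nr_cpus)
    return nr_cpus * batch  # ~2 * nr_cpus^2 on large machines

def error_bytes(pages, page_size=64 * 1024):
    """Convert a page count to bytes on a 64K-pagesize kernel."""
    return pages * page_size

# A 32-thread process on an 8-CPU box: the old scheme is worse.
print(per_thread_error_pages(32), percpu_error_pages(8))  # 2048 256

# The same workload on a 128-CPU box: the percpu scheme's bound is
# far larger, and with 64K pages the hidden RSS reaches gigabytes.
print(error_bytes(percpu_error_pages(128)) // (1 << 20), "MiB")  # 2048 MiB
```

This is why the "worse margin than before" point above holds on large
machines: the percpu bound grows with the square of the CPU count.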
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
2025-06-10 0:17 ` Andrew Morton
@ 2025-06-10 0:45 ` Shakeel Butt
2025-06-10 9:59 ` Michal Hocko
2025-07-04 18:22 ` Luiz Capitulino
1 sibling, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2025-06-10 0:45 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Ritesh Harjani (IBM), Baolin Wang, Michal Hocko,
david, lorenzo.stoakes, Liam.Howlett, rppt, surenb, donettom,
aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On Mon, Jun 09, 2025 at 05:17:58PM -0700, Andrew Morton wrote:
> On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
>
> > On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> > > On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> > >> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> > >>
> > >>> On 2025/6/9 15:35, Michal Hocko wrote:
> > >>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> > >>>>>
> > >>>>> Any reason why we dropped the Fixes tag? I see there were a series of
> > >>>>> discussion on v1 and it got concluded that the fix was correct, then why
> > >>>>> drop the fixes tag?
> > >>>>
> > >>>> This seems more like an improvement than a bug fix.
> > >>>
> > >>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
> > >>> backport it manually,
> > >>>
> > >>> because some of user-space monitoring tools depend
> > >>> on these statistics.
> > >>
> > >> That sounds like a regression then, isn't it?
> > >
> > > Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> > > this is making them accurate again, and some userspace depends on it,
> > > then Fixes: and stable is probably warranted then. If this was just a
> > > perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> > > improvement...
> >
> > Dang, should have re-read the commit log of f1a7941243c1 first. It seems
> > like the error margin due to batching existed also before f1a7941243c1.
> >
> > " This patch converts the rss_stats into percpu_counter to convert the
> > error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
> >
> > so if on some systems this means worse margin than before, the above
> > "if" chain of thought might still hold.
>
> f1a7941243c1 seems like a good enough place to tell -stable
> maintainers where to insert the patch (why does this sound rude).
>
> The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
> and, as the problem has been there for years, I'll leave the patch in
> mm-unstable so it will eventually get into LTS, in a well tested state.
One thing f1a7941243c1 noted was that the percpu counter conversion
enabled us to get more accurate stats with some cpu cost and in this
patch Baolin has shown that the cpu cost of accurate stats is
reasonable, so seems safe for stable backport. Also it seems like
multiple users are impacted by this issue, so I am fine with stable
backport.
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
2025-06-10 0:45 ` Shakeel Butt
@ 2025-06-10 9:59 ` Michal Hocko
0 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2025-06-10 9:59 UTC (permalink / raw)
To: Shakeel Butt
Cc: Andrew Morton, Vlastimil Babka, Ritesh Harjani (IBM), Baolin Wang,
david, lorenzo.stoakes, Liam.Howlett, rppt, surenb, donettom,
aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On Mon 09-06-25 17:45:05, Shakeel Butt wrote:
> On Mon, Jun 09, 2025 at 05:17:58PM -0700, Andrew Morton wrote:
> > On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > > On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> > > > On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> > > >> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> > > >>
> > > >>> On 2025/6/9 15:35, Michal Hocko wrote:
> > > >>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> > > >>>>>
> > > >>>>> Any reason why we dropped the Fixes tag? I see there were a series of
> > > >>>>> discussion on v1 and it got concluded that the fix was correct, then why
> > > >>>>> drop the fixes tag?
> > > >>>>
> > > >>>> This seems more like an improvement than a bug fix.
> > > >>>
> > > >>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
> > > >>> backport it manually,
> > > >>>
> > > >>> because some of user-space monitoring tools depend
> > > >>> on these statistics.
> > > >>
> > > >> That sounds like a regression then, isn't it?
> > > >
> > > > Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> > > > this is making them accurate again, and some userspace depends on it,
> > > > then Fixes: and stable is probably warranted then. If this was just a
> > > > perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> > > > improvement...
> > >
> > > Dang, should have re-read the commit log of f1a7941243c1 first. It seems
> > > like the error margin due to batching existed also before f1a7941243c1.
> > >
> > > " This patch converts the rss_stats into percpu_counter to convert the
> > > error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
> > >
> > > so if on some systems this means worse margin than before, the above
> > > "if" chain of thought might still hold.
> >
> > f1a7941243c1 seems like a good enough place to tell -stable
> > maintainers where to insert the patch (why does this sound rude).
> >
> > The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
> > and, as the problem has been there for years, I'll leave the patch in
> > mm-unstable so it will eventually get into LTS, in a well tested state.
>
> One thing f1a7941243c1 noted was that the percpu counter conversion
> enabled us to get more accurate stats with some cpu cost and in this
> patch Baolin has shown that the cpu cost of accurate stats is
> reasonable, so seems safe for stable backport. Also it seems like
> multiple users are impacted by this issue, so I am fine with stable
> backport.
Fair point.
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
2025-06-10 0:17 ` Andrew Morton
2025-06-10 0:45 ` Shakeel Butt
@ 2025-07-04 18:22 ` Luiz Capitulino
2025-07-04 20:11 ` Andrew Morton
1 sibling, 1 reply; 14+ messages in thread
From: Luiz Capitulino @ 2025-07-04 18:22 UTC (permalink / raw)
To: Andrew Morton, Vlastimil Babka, Baolin Wang
Cc: Ritesh Harjani (IBM), Michal Hocko, david, shakeel.butt,
lorenzo.stoakes, Liam.Howlett, rppt, surenb, donettom, aboorvad,
sj, linux-mm, linux-fsdevel, linux-kernel
On 2025-06-09 20:17, Andrew Morton wrote:
> On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
>
>> On 6/9/25 10:52 AM, Vlastimil Babka wrote:
>>> On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
>>>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>>>
>>>>> On 2025/6/9 15:35, Michal Hocko wrote:
>>>>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
>>>>>>>
>>>>>>> Any reason why we dropped the Fixes tag? I see there were a series of
>>>>>>> discussion on v1 and it got concluded that the fix was correct, then why
>>>>>>> drop the fixes tag?
>>>>>>
>>>>>> This seems more like an improvement than a bug fix.
>>>>>
>>>>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
>>>>> backport it manually,
>>>>>
>>>>> because some of user-space monitoring tools depend
>>>>> on these statistics.
>>>>
>>>> That sounds like a regression then, isn't it?
>>>
>>> Hm if counters were accurate before f1a7941243c1 and not afterwards, and
>>> this is making them accurate again, and some userspace depends on it,
>>> then Fixes: and stable is probably warranted then. If this was just a
>>> perf improvement, then not. But AFAIU f1a7941243c1 was the perf
>>> improvement...
>>
>> Dang, should have re-read the commit log of f1a7941243c1 first. It seems
>> like the error margin due to batching existed also before f1a7941243c1.
>>
>> " This patch converts the rss_stats into percpu_counter to convert the
>> error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
>>
>> so if on some systems this means worse margin than before, the above
>> "if" chain of thought might still hold.
>
> f1a7941243c1 seems like a good enough place to tell -stable
> maintainers where to insert the patch (why does this sound rude).
>
> The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
> and, as the problem has been there for years, I'll leave the patch in
> mm-unstable so it will eventually get into LTS, in a well tested state.
Andrew, are you considering submitting this patch for 6.16? I think
we should, it does look like a regression for larger systems built
with 64k base page size.
On comparing a very simple app which just allocates & touches some
memory against v6.1 (which doesn't have f1a7941243c1) and latest
Linus tree (4c06e63b9203) I can see that on latest Linus tree the
values for VmRSS, RssAnon and RssFile from /proc/self/status are
all zeroes while they do report values on v6.1 and a Linus tree
with this patch.
My test setup is an arm64 VM with 80 CPUs running a kernel with 64k
pagesize. The kernel only reports the RSS values starting at 10MB
(which makes sense, since the per-CPU counters will cache up to two
times the number of CPUs in pages, and the kernel accounts in pages).
The situation will be worse on larger systems, of course.
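[Editorial note: the 10MB threshold reported above can be reproduced
arithmetically. This sketch assumes the per-CPU batch is max(32, 2 * nr_cpus)
pages, as described in the f1a7941243c1 commit log; it is an illustration of
the observation, not kernel code.]

```python
# Why an 80-CPU, 64K-pagesize VM only starts reporting RSS near 10MB:
# a single-threaded app that stays on one CPU can hide up to one full
# per-CPU batch of pages before anything is flushed to the global counter.

nr_cpus = 80
page_size = 64 * 1024                     # 64K pagesize kernel
batch_pages = max(32, 2 * nr_cpus)        # 160 pages per CPU
hidden_per_cpu = batch_pages * page_size  # bytes invisible in RES/VmRSS

print(hidden_per_cpu // (1 << 20), "MiB")  # 10 MiB
```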
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
2025-07-04 18:22 ` Luiz Capitulino
@ 2025-07-04 20:11 ` Andrew Morton
2025-07-04 20:14 ` Luiz Capitulino
0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2025-07-04 20:11 UTC (permalink / raw)
To: Luiz Capitulino
Cc: Vlastimil Babka, Baolin Wang, Ritesh Harjani (IBM), Michal Hocko,
david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, rppt, surenb,
donettom, aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On Fri, 4 Jul 2025 14:22:11 -0400 Luiz Capitulino <luizcap@redhat.com> wrote:
> > The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
> > and, as the problem has been there for years, I'll leave the patch in
> > mm-unstable so it will eventually get into LTS, in a well tested state.
>
> Andrew, are you considering submitting this patch for 6.16? I think
> we should, it does look like a regression for larger systems built
> with 64k base page size.
I wasn't planning on 6.16-rcX because it's been there for years but
sure, I moved it into the mm-hotfixes pile so it'll go Linuswards next
week.
> On comparing a very simple app which just allocates & touches some
> memory against v6.1 (which doesn't have f1a7941243c1) and latest
> Linus tree (4c06e63b9203) I can see that on latest Linus tree the
> values for VmRSS, RssAnon and RssFile from /proc/self/status are
> all zeroes while they do report values on v6.1 and a Linus tree
> with this patch.
Cool, I'll paste this para into the changelog to help people link this
patch with wrong behavior which they are observing.
* Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
2025-07-04 20:11 ` Andrew Morton
@ 2025-07-04 20:14 ` Luiz Capitulino
0 siblings, 0 replies; 14+ messages in thread
From: Luiz Capitulino @ 2025-07-04 20:14 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Baolin Wang, Ritesh Harjani (IBM), Michal Hocko,
david, shakeel.butt, lorenzo.stoakes, Liam.Howlett, rppt, surenb,
donettom, aboorvad, sj, linux-mm, linux-fsdevel, linux-kernel
On 2025-07-04 16:11, Andrew Morton wrote:
> On Fri, 4 Jul 2025 14:22:11 -0400 Luiz Capitulino <luizcap@redhat.com> wrote:
>
>>> The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
>>> and, as the problem has been there for years, I'll leave the patch in
>>> mm-unstable so it will eventually get into LTS, in a well tested state.
>>
>> Andrew, are you considering submitting this patch for 6.16? I think
>> we should, it does look like a regression for larger systems built
>> with 64k base page size.
>
> I wasn't planning on 6.16-rcX because it's been there for years but
> sure, I moved it into the mm-hotfixes pile so it'll go Linuswards next
> week.
Wonderful, thank you!
>
>> On comparing a very simple app which just allocates & touches some
>> memory against v6.1 (which doesn't have f1a7941243c1) and latest
>> Linus tree (4c06e63b9203) I can see that on latest Linus tree the
>> values for VmRSS, RssAnon and RssFile from /proc/self/status are
>> all zeroes while they do report values on v6.1 and a Linus tree
>> with this patch.
>
> Cool, I'll paste this para into the changelog to help people link this
> patch with wrong behavior which they are observing.
OK.
2025-06-05 12:58 [PATCH v2] mm: fix the inaccurate memory statistics issue for users Baolin Wang
2025-06-05 13:34 ` Vlastimil Babka
2025-06-09 5:27 ` Ritesh Harjani
2025-06-09 7:35 ` Michal Hocko
2025-06-09 8:04 ` Baolin Wang
2025-06-09 8:31 ` Ritesh Harjani
2025-06-09 8:52 ` Vlastimil Babka
2025-06-09 8:56 ` Vlastimil Babka
2025-06-10 0:17 ` Andrew Morton
2025-06-10 0:45 ` Shakeel Butt
2025-06-10 9:59 ` Michal Hocko
2025-07-04 18:22 ` Luiz Capitulino
2025-07-04 20:11 ` Andrew Morton
2025-07-04 20:14 ` Luiz Capitulino