From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>,
	akpm@linux-foundation.org, david@redhat.com,
	shakeel.butt@linux.dev
Cc: lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, donettom@linux.ibm.com, aboorvad@linux.ibm.com,
	sj@kernel.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
Date: Mon, 09 Jun 2025 10:57:41 +0530
Message-ID: <87bjqx4h82.fsf@gmail.com>
In-Reply-To: <f4586b17f66f97c174f7fd1f8647374fdb53de1c.1749119050.git.baolin.wang@linux.alibaba.com>

Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> On some large machines with a high number of CPUs running a 64K pagesize
> kernel, we found that the 'RES' field displayed by the top command is always
> 0 for some processes, which causes a lot of confusion for users.
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
>       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
>
> The main reason is that the batch size of the percpu counter is quite large
> on these machines, so a significant value can stay cached in the percpu
> counters. This has been the case since commit f1a7941243c1 ("mm: convert mm's
> rss stats into percpu_counter"). Intuitively, the batch number should be
> optimized, but on some paths, performance may take precedence over statistical
> accuracy. Therefore, introduce a new interface that adds up the percpu
> statistical count and displays it to users, which removes the confusion. In
> addition, this change is not expected to be on a performance-critical path,
> so the modification should be acceptable.
>
> In addition, the 'mm->rss_stat' is updated by using add_mm_counter() and
> dec/inc_mm_counter(), which are all wrappers around percpu_counter_add_batch().
> In percpu_counter_add_batch(), there is percpu batch caching to avoid 'fbc->lock'
> contention. This patch changes task_mem() and task_statm() to get the accurate
> mm counters under the 'fbc->lock', but this should not exacerbate kernel
> 'mm->rss_stat' lock contention due to the percpu batch caching of the mm
> counters. The following test also confirms the theoretical analysis.
>
> I ran stress-ng to stress anon page faults in 32 threads on my 32-core
> machine, while simultaneously running a script that starts 32 threads to
> busy-loop pread each stress-ng thread's /proc/pid/status interface. From the
> following data, I did not observe any obvious impact of this patch on the
> stress-ng tests.
>
> w/o patch:
> stress-ng: info:  [6848]          4,399,219,085,152 CPU Cycles          67.327 B/sec
> stress-ng: info:  [6848]          1,616,524,844,832 Instructions          24.740 B/sec (0.367 instr. per cycle)
> stress-ng: info:  [6848]          39,529,792 Page Faults Total           0.605 M/sec
> stress-ng: info:  [6848]          39,529,792 Page Faults Minor           0.605 M/sec
>
> w/patch:
> stress-ng: info:  [2485]          4,462,440,381,856 CPU Cycles          68.382 B/sec
> stress-ng: info:  [2485]          1,615,101,503,296 Instructions          24.750 B/sec (0.362 instr. per cycle)
> stress-ng: info:  [2485]          39,439,232 Page Faults Total           0.604 M/sec
> stress-ng: info:  [2485]          39,439,232 Page Faults Minor           0.604 M/sec
>
> Tested-by: Donet Tom <donettom@linux.ibm.com>
> Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
> Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Acked-by: SeongJae Park <sj@kernel.org>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> Changes from v1:
>  - Update the commit message to add some measurements.
>  - Add acked tag from Michal. Thanks.
>  - Drop the Fixes tag.

Any reason why we dropped the Fixes tag? I see there was a series of
discussions on v1, and it was concluded that the fix was correct, so why
drop the Fixes tag?

Background: Recently a few folks internally reported this issue on Power
too, e.g.:

$ ps -o rss $$
  RSS
    0

So it would be nice to keep the Fixes tag so that the patch gets backported
to all stable releases. Does anybody see any concern with that?

-ritesh

Thread overview: 14+ messages
2025-06-05 12:58 [PATCH v2] mm: fix the inaccurate memory statistics issue for users Baolin Wang
2025-06-05 13:34 ` Vlastimil Babka
2025-06-09  5:27 ` Ritesh Harjani [this message]
2025-06-09  7:35   ` Michal Hocko
2025-06-09  8:04     ` Baolin Wang
2025-06-09  8:31       ` Ritesh Harjani
2025-06-09  8:52         ` Vlastimil Babka
2025-06-09  8:56           ` Vlastimil Babka
2025-06-10  0:17             ` Andrew Morton
2025-06-10  0:45               ` Shakeel Butt
2025-06-10  9:59                 ` Michal Hocko
2025-07-04 18:22               ` Luiz Capitulino
2025-07-04 20:11                 ` Andrew Morton
2025-07-04 20:14                   ` Luiz Capitulino
