From: Michal Hocko <mhocko@suse.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
david@redhat.com, shakeel.butt@linux.dev,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
donettom@linux.ibm.com, aboorvad@linux.ibm.com, sj@kernel.org,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix the inaccurate memory statistics issue for users
Date: Tue, 3 Jun 2025 10:15:27 +0200
Message-ID: <aD6vHzRhwyTxBqcl@tiehlicka>
In-Reply-To: <72f0dc8c-def3-447c-b54e-c390705f8c26@linux.alibaba.com>

On Tue 03-06-25 16:08:21, Baolin Wang wrote:
>
>
> On 2025/5/30 21:39, Michal Hocko wrote:
> > On Thu 29-05-25 20:53:13, Andrew Morton wrote:
> > > On Sat, 24 May 2025 09:59:53 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> > >
> > > > > On some large machines with a high number of CPUs running a 64K pagesize
> > > > > kernel, we found that the top command always displays 0 in the 'RES' field
> > > > > for some processes, which causes a lot of confusion for users.
> > > >
> > > > >     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> > > > >  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
> > > > >       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
> > > >
> > > > > The main reason is that the batch size of the percpu counter is quite large
> > > > > on these machines, so a significant value stays cached in the percpu
> > > > > counters. This has been the case since commit f1a7941243c1 ("mm: convert
> > > > > mm's rss stats into percpu_counter"). Intuitively, the batch number should
> > > > > be optimized, but on some paths, performance may take precedence over
> > > > > statistical accuracy. Therefore, introduce a new interface that adds up the
> > > > > percpu statistical counts and displays the result to users, which removes
> > > > > the confusion. In addition, this change is not expected to be on a
> > > > > performance-critical path, so the modification should be acceptable.
> > > >
> > > > Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
> > >
> > > Three years ago.
> > >
> > > > > Tested-by: Donet Tom <donettom@linux.ibm.com>
> > > > Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
> > > > Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
> > > > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > > Acked-by: SeongJae Park <sj@kernel.org>
> > > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > >
> > > Thanks, I added cc:stable to this.
> >
> > > I have only noticed this new posting now. I do not think this is
> > > stable material. I am also not convinced that the impact of the pcp lock
> > > exposure to the userspace has been properly analyzed and documented in
> > > the changelog. I am not nacking the patch (yet) but I would like to see
> > > a serious analysis showing that this has been properly thought through.
>
> Good point. I did a quick measurement on my 32-core Arm machine. I ran two
> workloads. One is the 'top' command: top -d 1 (updating every second).
> The other is a kernel build (time make -j32).
>
> From the following data, I did not see any significant impact of the patch
> changes on the execution of the kernel building workload.

I do not think this is really representative of an adverse workload. I
believe you need to look at which potentially sensitive kernel code
paths run with the lock held, and how a busy loop over the affected proc
files would influence those in the worst case. Maybe there are no such
kernel code paths to really worry about. This should be part of the
changelog though.
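
To make the "busy loop over affected proc files" concrete, here is a
minimal userspace sketch. It is only an illustration, not part of the
patch: the helper name hammer_proc_file() is hypothetical, and using
/proc/self/statm as one of the affected files is an assumption.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stress helper (not from the patch): open and read a proc
 * file `iterations` times back to back, with no sleep in between, and
 * return the number of reads that returned data.
 */
static long hammer_proc_file(const char *path, long iterations)
{
	char buf[4096];
	long ok = 0;

	for (long i = 0; i < iterations; i++) {
		int fd = open(path, O_RDONLY);

		if (fd < 0)
			continue;
		/* Each read makes the kernel generate the file contents again. */
		if (read(fd, buf, sizeof(buf)) > 0)
			ok++;
		close(fd);
	}
	return ok;
}
```

Something like hammer_proc_file("/proc/self/statm", 1000000) started
once per CPU would probably be closer to a worst case than top -d 1,
which reads each file only once per second.
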
Thanks!
--
Michal Hocko
SUSE Labs