From: Kairui Song <ryncsn@gmail.com>
To: "zhangpeng (AS)" <zhangpeng362@huawei.com>
Cc: Rongwei Wang <rongwei.wrw@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, dennisszhou@gmail.com,
shakeelb@google.com, jack@suse.cz, surenb@google.com,
kent.overstreet@linux.dev, mhocko@suse.cz, vbabka@suse.cz,
yuzhao@google.com, yu.ma@intel.com, wangkefeng.wang@huawei.com,
sunnanyong@huawei.com
Subject: Re: [RFC PATCH v2 2/2] mm: convert mm's rss stats to use atomic mode
Date: Thu, 16 May 2024 19:50:52 +0800
Message-ID: <CAMgjq7DHUgyR0vtkYXH4PuzBHUVZ5cyCzi58TfShL57TUSL+Tg@mail.gmail.com>
In-Reply-To: <c1c79eb5-4d48-40e5-6f17-f8bc42f2d274@huawei.com>
On Fri, Apr 19, 2024 at 11:32 AM zhangpeng (AS) <zhangpeng362@huawei.com> wrote:
> On 2024/4/19 10:30, Rongwei Wang wrote:
> > On 2024/4/18 22:20, Peng Zhang wrote:
> >> From: ZhangPeng <zhangpeng362@huawei.com>
> >>
> >> Since commit f1a7941243c1 ("mm: convert mm's rss stats into
> >> percpu_counter"), the rss_stats have been converted into percpu_counter,
> >> which changes the error margin from (nr_threads * 64) to approximately
> >> (nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a
> >> performance regression on fork/exec/shell. Even after commit 14ef95be6f55
> >> ("kernel/fork: group allocation/free of per-cpu counters for mm struct"),
> >> the performance of fork/exec/shell is still poor compared to previous
> >> kernel versions.
> >>
> >> To mitigate the performance regression, we delay the allocation of percpu
> >> memory for rss_stats. Therefore, we convert mm's rss stats to use
> >> percpu_counter atomic mode. For single-threaded processes, rss_stat is in
> >> atomic mode, which reduces the memory consumption and performance
> >> regression caused by using percpu. For multi-threaded processes,
> >> rss_stat is switched to percpu mode to reduce the error margin.
> >> We convert rss_stats from atomic mode to percpu mode only when the
> >> second thread is created.
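To make the proposed switch easier to picture, here is a minimal,
hypothetical sketch of such a dual-mode counter (the struct, field and
helper names are mine for illustration only, not the code from this RFC;
the real series extends percpu_counter itself and also has to handle the
atomic-to-percpu hand-off race, which is ignored here):

    struct rss_counter {
            bool percpu_mode;               /* false until the 2nd thread */
            atomic_long_t atomic_count;     /* used while single-threaded */
            struct percpu_counter pcpu;     /* used after the switch */
    };

    static inline void rss_counter_add(struct rss_counter *c, long v)
    {
            if (!READ_ONCE(c->percpu_mode))
                    atomic_long_add(v, &c->atomic_count);
            else
                    percpu_counter_add(&c->pcpu, v);
    }

    /* Called when the second thread of the process is created. */
    static int rss_counter_switch_to_percpu(struct rss_counter *c)
    {
            int err;

            /* Seed the percpu counter with the value accumulated so far. */
            err = percpu_counter_init(&c->pcpu,
                                      atomic_long_read(&c->atomic_count),
                                      GFP_KERNEL);
            if (err)
                    return err;
            WRITE_ONCE(c->percpu_mode, true);
            return 0;
    }
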
> > Hi, Zhang Peng
> >
> > We also found this regression in lmbench these days. I have not
> > tested your patch, but it seems it will resolve much of it.
> > I also see that this patch does not fix the regression for
> > multi-threaded processes; is that because rss_stat is switched to
> > percpu mode? (If I'm wrong, please correct me.) It also seems that
> > percpu_counter has a bad effect on exit_mmap().
> >
> > If so, I'm wondering if we can further improve it on the exit_mmap()
> > path in the multi-threaded scenario, e.g. by determining which CPUs the
> > process has run on (mm_cpumask()? I'm not sure).
> >
> Hi, Rongwei,
>
> Yes, this patch only fixes the regression for single-threaded processes.
> How much of a bad effect does percpu_counter have in exit_mmap()? IMHO,
> the addition of the mm counters is already done in batch mode; maybe I
> am missing something?
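For reference, the batching referred to here is the one inside
percpu_counter itself: updates via add_mm_counter() and friends only take
the shared lock once the local per-CPU delta grows past the batch
threshold. A simplified sketch of that path (close to what
percpu_counter_add_batch() does, with irq handling omitted), which is why
most of the exit_mmap() cost would be expected to come from the final
summation/teardown rather than from the individual adds:

    static void pc_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
    {
            s64 count = __this_cpu_read(*fbc->counters) + amount;

            if (abs(count) >= batch) {
                    /* Fold the local delta into the shared count. */
                    raw_spin_lock(&fbc->lock);
                    fbc->count += count;
                    __this_cpu_sub(*fbc->counters, count - amount);
                    raw_spin_unlock(&fbc->lock);
            } else {
                    /* Fast path: stay on this CPU, no lock taken. */
                    this_cpu_add(*fbc->counters, amount);
            }
    }
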
>
Hi, Peng Zhang, Rongwei, and all:

I have a patch series that is earlier than commit f1a7941243c1 ("mm:
convert mm's rss stats into percpu_counter"):
https://lwn.net/ml/linux-kernel/20220728204511.56348-1-ryncsn@gmail.com/

Instead of a per-mm-per-cpu cache, it used only one global per-cpu
cache and flushed it on schedule. Or, if the arch supports it, it
flushed and fetched using an mm bitmap as an optimization (similar to
TLB shootdown).
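In case it helps make the idea concrete, here is a rough, hypothetical
illustration of the scheme (the names and the rss_stat_atomic field are
made up for this example; the actual series differs in detail): one
global per-CPU delta cache instead of per-mm percpu counters, flushed
back to the owning mm when the CPU switches to a different mm.

    struct rss_cache_cpu {
            struct mm_struct *mm;           /* whose deltas this CPU holds */
            long delta[NR_MM_COUNTERS];
    };
    static DEFINE_PER_CPU(struct rss_cache_cpu, rss_cache);

    static void rss_cache_flush(struct rss_cache_cpu *rc)
    {
            int i;

            if (!rc->mm)
                    return;
            for (i = 0; i < NR_MM_COUNTERS; i++) {
                    if (rc->delta[i]) {
                            /* hypothetical atomic backing counter in mm_struct */
                            atomic_long_add(rc->delta[i],
                                            &rc->mm->rss_stat_atomic[i]);
                    }
                    rc->delta[i] = 0;
            }
            rc->mm = NULL;
    }

    /* Hooked into the context-switch path, much like TLB state handling. */
    static void rss_cache_switch_mm(struct mm_struct *next)
    {
            struct rss_cache_cpu *rc = this_cpu_ptr(&rss_cache);

            if (rc->mm != next)
                    rss_cache_flush(rc);
            rc->mm = next;
    }
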
Unfortunately it didn't get much attention and I moved on to other things.

I also noticed the fork regression issue, so I did a local rebase of
my previous patch and reverted f1a7941243c1.

The result looks good: on my 32-core VM, I see a similar improvement to
the one you posted (the alloc/free on fork/exit is gone). I also see a
minor improvement with database tests, memory usage is a little lower
too (no more per-mm cache), and I think the error margin in my patch
should be close to zero.

I hope I can get some attention here for my idea...