public inbox for linux-fsdevel@vger.kernel.org
From: Yibin Liu <liuyibin@hygon.cn>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"brauner@kernel.org" <brauner@kernel.org>,
	Jianyong Wu <wujianyong@hygon.cn>, Huangsj <huangsj@hygon.cn>,
	Yuan Zhong <zhongyuan@hygon.cn>, "jack@suse.cz" <jack@suse.cz>,
	"jlayton@kernel.org" <jlayton@kernel.org>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
	"alex.aring@gmail.com" <alex.aring@gmail.com>,
	"vbabka@kernel.org" <vbabka@kernel.org>,
	"jannh@google.com" <jannh@google.com>,
	"pfalcato@suse.de" <pfalcato@suse.de>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>
Subject: Re: [PATCH] mm: Add RWH_RMAP_EXCLUDE flag to exclude files from rmap sharing
Date: Wed, 22 Apr 2026 13:03:19 +0000
Message-ID: <67a71b489eb3413ab0907351d3c56d31@hygon.cn>
In-Reply-To: <CAGudoHHki3gv-HXXMALePDoC+tmao4oWcYgCo9kXNDkEhW4E4g@mail.gmail.com>

> On Tue, Apr 21, 2026 at 4:11 AM Yibin Liu <liuyibin@hygon.cn> wrote:
> >
> > UnixBench execl/shellscript (dynamically linked binaries) at 64+ cores are
> > bottlenecked on the i_mmap_rwsem semaphore due to heavy vma insert/remove
> > operations on the i_mmap tree, where libc.so.6 is the most frequent,
> > followed by ld-linux-x86-64.so.2 and the test executable itself.
> >
> > This patch marks such files to skip rmap operations, avoiding frequent
> > interval tree insert/remove that cause i_mmap_rwsem lock contention.
> > The downside is these files can no longer be reclaimed (along with compact
> > and ksm), but since they are small and resident anyway, it's acceptable.
> > When all mapping processes exit, files can still be reclaimed normally.
> >
> > Performance testing shows ~80% improvement in UnixBench execl/shellscript
> > scores on Hygon 7490, AMD zen4 9754 and Intel emerald rapids platform.
> >
> 
> The other responders have been a little harsh and despite raising
> valid points I don't think they gave a proper review.
> 
> The bigger picture is that the problematic rwsem is taken several
> times during fork + exec + exit cycle. Normally you end up with 5
> distinct mappings per binary/so, each created with a separate lock
> acquire.
> 
> Some time ago I patched exit to batch processing, leaving 1 acquire in
> that codepath. fork can and should be patched in a similar vein, but I
> don't know if unixbench runs it in this benchmark (i.e., real
> workloads certainly suffer from it, I don't know if this particular
> bench includes that aspect). This is on top of forking itself being
> avoidable should the kernel grow a better interface for executing
> binaries.
> 
Thank you for your opinions and advice. I'll try this approach.

> This leaves us with mapping creation on exec. This problem is
> unfixable without introduction of better APIs for userspace, which
> constitutes quite a challenge.
> 
> The end result is the absolutely horrible case of multiple acquires of
> the same lock per iteration.
> 
> One common idea how to reduce contention boils down to shortening lock
> hold time. This has very limited effect in face of the aforementioned
> multiple acquires and is at best a stop gap -- no matter what, the
> ceiling is dictated by the extra acquires and it is incredibly low.
> 
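The ceiling argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a measurement: the function name, the fixed per-acquire handoff cost, and every number are hypothetical, and the figure of 5 acquires is borrowed from the "5 distinct mappings per binary/so" remark earlier in the quoted text.

```python
def exec_rate_ceiling(acquires_per_iter, hold_ns, handoff_ns):
    """Upper bound on iterations per second through one exclusive lock.

    Toy model (assumption): every acquire serializes all CPUs for
    hold_ns plus a fixed handoff_ns, the cost of passing a contended
    lock from one CPU to the next.
    """
    serialized_ns = acquires_per_iter * (hold_ns + handoff_ns)
    return 1e9 / serialized_ns


# Hypothetical numbers, chosen only to show the shape of the argument.
baseline = exec_rate_ceiling(acquires_per_iter=5, hold_ns=1000, handoff_ns=1000)
shorter_hold = exec_rate_ceiling(acquires_per_iter=5, hold_ns=100, handoff_ns=1000)
single_acquire = exec_rate_ceiling(acquires_per_iter=1, hold_ns=1000, handoff_ns=1000)

print(f"baseline:        {baseline:.0f} iterations/s")
print(f"10x shorter hold: {shorter_hold:.0f} iterations/s")
print(f"single acquire:  {single_acquire:.0f} iterations/s")
```

In this model, cutting the hold time tenfold raises the ceiling by less than 2x, because the per-acquire handoff cost is still paid five times, while going from five acquires to one raises it by 5x.
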
> Your patch keeps the problematic acquire pattern intact and while the
> 80% win might sound encouraging, the end result is still severely
> underperforming even a state where the lock is taken once in total
> during exec.
> 
> Besides that, the internally-visible side effect of non-functional
> rmap (and thus e.g., truncate) is pretty bad in its own
> right, but let's ignore it. The primary problem here is that the patch
> exposes a mechanism for userspace to dictate this in the first place.
> Even ignoring the question of who should be using it and when, the
> real solution to the problem would be confined to the kernel. Suppose
> this patch lands and such a solution is implemented later -- now the
> kernel is stuck having to support a now-useless (if not outright
> harmful) feature.
OK. I understand it now.
> 
> What will fix the problem is sharding the state in some capacity,
> provided no unfixable stopgap shows up.
> 
> Any other approach is putting small bandaids on it and can be a
> consideration only if decentralized locking proves too
> problematic.
> 
> Pedro apparently volunteered to do the work, so I think we can wait to
> see what he is going to end up cooking.
> 
> I hope this helps.
> 

