From: Lorenzo Stoakes <ljs@kernel.org>
To: Huang Shijie <huangsj@hygon.cn>
Cc: Pedro Falcato <pfalcato@suse.de>,
akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
brauner@kernel.org, jack@suse.cz, muchun.song@linux.dev,
osalvador@suse.de, david@kernel.org, surenb@google.com,
mjguzik@gmail.com, liam@infradead.org, vbabka@kernel.org,
shakeel.butt@linux.dev, rppt@kernel.org, mhocko@suse.com,
corbet@lwn.net, skhan@linuxfoundation.org, linux@armlinux.org.uk,
dinguyen@kernel.org, schuster.simon@siemens-energy.com,
James.Bottomley@hansenpartnership.com, deller@gmx.de,
djbw@kernel.org, willy@infradead.org, peterz@infradead.org,
mingo@redhat.com, acme@kernel.org, namhyung@kernel.org,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com,
james.clark@linaro.org, mhiramat@kernel.org, oleg@redhat.com,
ziy@nvidia.com, baolin.wang@linux.alibaba.com,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
baohua@kernel.org, lance.yang@linux.dev, linmiaohe@huawei.com,
nao.horiguchi@gmail.com, jannh@google.com, riel@surriel.com,
harry@kernel.org, will@kernel.org, brian.ruley@gehealthcare.com,
rmk+kernel@armlinux.org.uk, dave.anglin@bell.net,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-parisc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
nvdimm@lists.linux.dev, linux-perf-users@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, zhongyuan@hygon.cn,
fangbaoshun@hygon.cn, yingzhiwei@hygon.cn
Subject: Re: [PATCH v2 3/4] mm/fs: split the file's i_mmap tree
Date: Thu, 11 Jun 2026 16:48:13 +0100
Message-ID: <airVLmVdVRDOlZm4@lucifer>
In-Reply-To: <aiqFgGbIo1Psy3pI@pedro-suse.lan>
On Thu, Jun 11, 2026 at 12:11:27PM +0100, Pedro Falcato wrote:
> Hi,
>
> On Thu, Jun 11, 2026 at 02:18:59PM +0800, Huang Shijie wrote:
> > In the UnixBench tests, there is a test "execl" which tests
> > the execve system call.
> > For example, a Hygon server has 12 NUMA nodes and 384 CPUs.
> > When we test our server with "./Run -c 384 execl",
> > the test result is not good enough. The i_mmap locks are heavily contended on
> > "libc.so" and "ld.so". The i_mmap tree for "libc.so" can hold
> > over 6000 VMAs, and the VMAs can all be on different NUMA nodes. The insert/remove
> > operations do not run quickly enough.
>
> I _really_ would have appreciated some coordination here, because I said I was
> going to take a look at it. I have something that I think is much simpler
Agreed, this is the second (or in fact third?) time in recent weeks that
I'm aware of where publicly discussed work has been duplicated by a
series that came in later.

It's really important, when doing work that impacts core stuff, to have a
look around and see if others are looking at it, as there's nothing more
frustrating than to work on something, discuss it publicly, only to find
somebody sends a competing series.

It can be tricky, as sometimes it's not obvious, or it might not be so
easily found, but I would strongly suggest always making an effort on that
front.

But you didn't even try to send this as an RFC either :)
> in practice. These patches are also way too complex to be dropped just before
> the merge window.
Anything sent this late in the cycle is material for the next cycle, so
you'd have needed to resend it at rc1 in a couple of weeks anyway.
>
> Some comments:
>
> >
> > In order to reduce contention on the i_mmap lock, this patch does
> > the following:
> > 1.) Split the single i_mmap tree into several sibling trees:
> > Each tree has a lock. The CONFIG_SPLIT_I_MMAP is used to
> > turn on/off this feature.
>
> There is no need for a config option. This needs to Just Work.
Yeah, this is just a no-go. We don't add config options for changes to core
rmap code.
>
> > 2.) Introduce a new field "tree_idx" for vm_area_struct to save the
> > sibling tree index for this VMA.
>
> This is possibly contentious, but there are holes in vm_area_struct.
> So I think this is fine.
Yeah no thanks for the extra field, I already have plans for those gaps in
vm_area_struct.
I am in fact writing code right now that uses them...
>
> > 3.) Introduce a new field "vma_count" for address_space.
> > The new mapping_mapped() will use it.
> > 4.) Rewrite the vma_interval_tree_foreach()
I also intend to send a series that does a bunch of changes in the rmap
code that this would conflict with.
So let's all coordinate please.
> > 5.) Rewrite the lock functions.
Yeah looping on file rmap lock/unlock is gross.
> >
> > After this patch, the VMA insert/remove operations will work faster,
> > and we can get over 400% performance improvement with the above test.
> >
> > Signed-off-by: Huang Shijie <huangsj@hygon.cn>
I had a look through and this code is really overwrought and you're putting
a bunch of confusing open-coded logic all over the codebase without comments.

This isn't upstreamable quality and you really should have sent this as an
RFC first so we could discuss the approach.
Thanks, Lorenzo
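
For anybody following along without the patch in front of them, here is a
rough userspace sketch of the shape described in the quoted commit message:
several sibling trees with one lock each, a per-VMA tree_idx, and a
per-mapping vma_count backing mapping_mapped(). It is NOT the code from the
series. Every toy_* name, NR_TREES, the linked lists standing in for interval
trees, and the choice to pick the tree by NUMA node are assumptions made
purely for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TREES 4	/* hypothetical: e.g. one sibling tree per NUMA node */

struct toy_vma {
	unsigned long start, last;	/* inclusive range, like an interval tree node */
	int tree_idx;			/* which sibling tree holds this VMA */
	struct toy_vma *next;
};

struct toy_tree {
	pthread_mutex_t lock;		/* one lock per sibling tree */
	struct toy_vma *head;		/* a plain list stands in for the interval tree */
};

struct toy_mapping {
	struct toy_tree trees[NR_TREES];
	atomic_long vma_count;		/* lets "is anything mapped?" skip the trees */
};

static void toy_mapping_init(struct toy_mapping *m)
{
	for (int i = 0; i < NR_TREES; i++) {
		pthread_mutex_init(&m->trees[i].lock, NULL);
		m->trees[i].head = NULL;
	}
	atomic_init(&m->vma_count, 0);
}

/* Insert takes only the lock of the one tree chosen for this VMA. */
static void toy_insert(struct toy_mapping *m, struct toy_vma *vma, int node)
{
	struct toy_tree *t;

	vma->tree_idx = node % NR_TREES;
	t = &m->trees[vma->tree_idx];
	pthread_mutex_lock(&t->lock);
	vma->next = t->head;
	t->head = vma;
	atomic_fetch_add(&m->vma_count, 1);
	pthread_mutex_unlock(&t->lock);
}

/* Remove uses the saved tree_idx to find the right tree and lock. */
static void toy_remove(struct toy_mapping *m, struct toy_vma *vma)
{
	struct toy_tree *t = &m->trees[vma->tree_idx];

	pthread_mutex_lock(&t->lock);
	for (struct toy_vma **pp = &t->head; *pp; pp = &(*pp)->next) {
		if (*pp == vma) {
			*pp = vma->next;
			atomic_fetch_sub(&m->vma_count, 1);
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
}

static bool toy_mapping_mapped(struct toy_mapping *m)
{
	return atomic_load(&m->vma_count) > 0;
}

/* A reverse-mapping style walk has to visit, and lock, every sibling tree. */
static int toy_count_overlaps(struct toy_mapping *m,
			      unsigned long start, unsigned long last)
{
	int n = 0;

	for (int i = 0; i < NR_TREES; i++) {
		struct toy_tree *t = &m->trees[i];

		pthread_mutex_lock(&t->lock);
		for (struct toy_vma *v = t->head; v; v = v->next)
			if (v->start <= last && start <= v->last)
				n++;
		pthread_mutex_unlock(&t->lock);
	}
	return n;
}
```

The trade-off is visible even in the toy: insert and remove only ever touch
one lock, but anything that needs to see every mapping of a range has to loop
over all the trees and their locks, which is the looping objected to above.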