From: Uladzislau Rezki <urezki@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
Pranjal Arya <pranjal.arya@oss.qualcomm.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <liam@infradead.org>,
Alice Ryhl <aliceryhl@google.com>,
Andrew Ballance <andrewjballance@gmail.com>,
linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
Lorenzo Stoakes <ljs@kernel.org>,
Pranjal Shrivastava <praan@google.com>,
Will Deacon <will@kernel.org>,
Suzuki K Poulose <Suzuki.Poulose@arm.com>,
Neil Armstrong <neil.armstrong@linaro.org>,
Mostafa Saleh <smostafa@google.com>,
Balbir Singh <balbirs@nvidia.com>,
Suren Baghdasaryan <surenb@google.com>,
Marco Elver <elver@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
Alexander Potapenko <glider@google.com>,
Shuah Khan <shuah@kernel.org>, Dev Jain <dev.jain@arm.com>,
Brendan Jackman <jackmanb@google.com>,
Puranjay Mohan <puranjay@kernel.org>,
Santosh Shukla <santosh.shukla@amd.com>,
Wyes Karny <wkarny@gmail.com>,
Sudeep Holla <sudeep.holla@kernel.org>
Subject: Re: [PATCH RFC 00/12] mm/vmalloc: migrate vmap_area indexing from rb-tree to maple-tree
Date: Thu, 18 Jun 2026 12:06:06 +0200
Message-ID: <ajPDDkV7yqWlhPIh@pc636>
In-Reply-To: <ajGQ_WPT3Ra2kPCQ@casper.infradead.org>
On Tue, Jun 16, 2026 at 07:07:57PM +0100, Matthew Wilcox wrote:
> On Mon, Jun 15, 2026 at 11:52:22AM +0200, Uladzislau Rezki wrote:
> > On Sun, Jun 14, 2026 at 12:15:28AM +0100, Matthew Wilcox wrote:
> > > What I don't understand is why you maintain a separate "free" tree.
> > > It should not be necessary any more, but maybe you tried removing it
> > > already and found a performance problem?
> >
> > We maintain it in order to split several entities. That prevents
> > interfering between allocated data and vmap-free-space manager.
> > So in that case one context can easily access allocated data, for
> > example vread iterator, etc., whereas another can do an allocation.
> >
> > So by splitting parts i minimize lock-contention.
>
> Sure, but there are many ways to reduce lock contention. One is to not
> take locks at all; the maple tree is RCU-safe, so you can read the tree
> holding only the RCU read lock, as long as you obey the RCU rules.
>
> Specifically:
> - Write side has to RCU-free the objects that are stored in the tree
> - Read side has to trylock the objects it finds (and retry the walk
> if the trylock fails)
> - Read side can see a mixture of objects if the tree is changed while
> it is reading, but for any given index in the tree it is guaranteed
> to see one of the objects which has been referred to by that index.
> That is, if the write side overwrites an index that referred to
> object A with object B, the reader will see either object A or B.
> It will not see NULL and it will not see any other object.
> - If the write side stores both object C and object D in the tree,
> the read side may see neither, both, only C or only D.
>
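The trylock-and-retry rule above can be sketched in userspace C. This is only a toy model under my own assumptions: struct obj, obj_tryget(), obj_put() and lookup_get() are hypothetical names, a single atomic pointer stands in for one tree index, and RCU-deferred freeing is not modeled at all. In the kernel the "trylock" would more likely be something like refcount_inc_not_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object stored in the tree; refs == 0 means "being freed". */
struct obj {
	atomic_int refs;
	unsigned long start, end;
};

/* "Trylock": take a reference only if the object is still live. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old > 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);	/* dropping a reference we never held is a bug */
	(void)old;
}

/*
 * 'slot' stands in for one tree index. A reader sees either the old or
 * the new pointer. If the object it finds is already dying, it looks
 * again, up to max_retries extra times.
 */
static struct obj *lookup_get(struct obj *_Atomic *slot, int max_retries)
{
	for (int i = 0; i <= max_retries; i++) {
		struct obj *o = atomic_load(slot);

		if (!o)
			return NULL;	/* nothing stored at this index */
		if (obj_tryget(o))
			return o;	/* pinned, safe to use */
		/* Trylock failed: a real reader would restart the walk here. */
	}
	return NULL;
}
```

A caller pairs every successful lookup_get() with obj_put(). The retry bound exists only to keep the toy from spinning forever; a real reader would redo the walk inside the RCU read-side section.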

Some thoughts about it.

Having an RCU-safe tree is good for sure. We can benefit from it at least
in vmallocinfo scanning/dumping, possibly in vread_iter() when /proc/kcore
is accessed, and in other places (which I need to check carefully). But
this is for read-only traversal.

Switching to a gap-based approach requires quite a bit of refactoring, and
it should be a full switch without any hybrid schemes or mixes. I expect
that we remove more code than we add, because some parts, like lookups,
range reservation, erase, etc., become hidden inside the tree, which is
good. The work includes:
- replacing free_vmap_area with the maple-tree gap-based approach;
- rewriting the pcpu-allocator, which lives at the end of vmalloc space;
- refactoring the per-cpu allocator, which is also part of vmalloc space;
- vread iterator;
- vmalloc dump path;
- vmap_node logic (use gap-reserve to minimize contention);
- and more...

To me, such a rewrite makes sense if we end up with something structurally
better, not just because the maple tree exists. The criteria I would go
with are: at least the same performance level, more code removed than
added, and a design that stays in at least as good a shape.

There are some drawbacks I am thinking of. One of them is the maple insert
path, mas_store_gfp(). First we need to find an empty area, then set the
range and call mas_store_gfp(), which uses the gfp flags for its internal
allocation. If we are under a spin-lock, sleeping is not possible, and
using NOWAIT or ATOMIC is not an option, thus we should somehow
pre-allocate outside the lock and store the range without any allocation.

The allocator operation:
- finds an empty range;
- publishes a VA that blocks that range.

Those two steps have to be serialized against other writers. Otherwise two
CPUs can pick the same empty range and both try to reserve it. If we
preallocate outside the lock, the "alloc" side has to validate that the
selected range is still empty and only then store the VA to block the range.
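
To illustrate the find / preallocate / validate / publish ordering, here is a rough userspace sketch. It is not vmalloc or maple-tree code: struct va, struct va_space, the toy spin-lock and all function names are hypothetical, a sorted list stands in for the tree, and malloc() stands in for the allocation that may sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical busy-range record, standing in for a vmap_area. */
struct va {
	unsigned long start, end;	/* half-open range [start, end) */
	struct va *next;		/* busy list, sorted by start */
};

/* Hypothetical address space; the sorted list stands in for the tree. */
struct va_space {
	atomic_flag lock;		/* toy spin-lock, serializes writers */
	unsigned long lo, hi;		/* managed range [lo, hi) */
	struct va *busy;
};

static void toy_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;			/* busy-wait, never sleeps */
}

static void toy_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Caller holds s->lock. True if [start, end) overlaps no busy range. */
static bool range_is_free(const struct va_space *s,
			  unsigned long start, unsigned long end)
{
	if (start < s->lo || end > s->hi || start >= end)
		return false;
	for (const struct va *v = s->busy; v; v = v->next)
		if (v->start < end && start < v->end)
			return false;
	return true;
}

/* Step 1: find the lowest gap. The answer can go stale after unlock. */
static bool va_find_gap(struct va_space *s, unsigned long size,
			unsigned long *start)
{
	unsigned long addr;
	bool found;

	toy_spin_lock(&s->lock);
	addr = s->lo;
	for (const struct va *v = s->busy; v; v = v->next) {
		if (v->start > addr && v->start - addr >= size)
			break;		/* the gap before v is big enough */
		if (v->end > addr)
			addr = v->end;
	}
	found = addr < s->hi && s->hi - addr >= size;
	toy_spin_unlock(&s->lock);
	if (found)
		*start = addr;
	return found;
}

/* Step 3: validate and publish under the lock; nothing is allocated here. */
static bool va_publish(struct va_space *s, struct va *v)
{
	struct va **pos;
	bool ok;

	toy_spin_lock(&s->lock);
	ok = range_is_free(s, v->start, v->end);
	if (ok) {
		for (pos = &s->busy; *pos && (*pos)->start < v->start; )
			pos = &(*pos)->next;
		v->next = *pos;
		*pos = v;
	}
	toy_spin_unlock(&s->lock);
	return ok;
}

/* Find, preallocate outside the lock, validate and store; retry on a race. */
static struct va *va_alloc(struct va_space *s, unsigned long size)
{
	unsigned long start;

	assert(size > 0);
	while (va_find_gap(s, size, &start)) {
		struct va *v = malloc(sizeof(*v));	/* step 2: lock not held */

		if (!v)
			return NULL;
		v->start = start;
		v->end = start + size;
		v->next = NULL;
		if (va_publish(s, v))
			return v;
		free(v);	/* another writer took the range, search again */
	}
	return NULL;
}
```

The point of the split is that va_publish() never allocates, so it is safe under a spin-lock, and a lost race costs only one extra search. Whether the same split maps cleanly onto the maple tree's own preallocation is exactly what a prototype would have to show.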

I think it is worth prototyping something to see how it would go. I may be
missing something for sure.

Thank you for your input!

--
Uladzislau Rezki