From: Uladzislau Rezki <urezki@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
Thomas Gleixner <tglx@linutronix.de>,
"Theodore Y . Ts'o" <tytso@mit.edu>,
Joel Fernandes <joel@joelfernandes.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
Mel Gorman <mgorman@suse.de>
Subject: Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.
Date: Tue, 22 Sep 2020 15:12:57 +0200 [thread overview]
Message-ID: <20200922131257.GA29241@pc636> (raw)
In-Reply-To: <20200922075002.GU12990@dhcp22.suse.cz>
> > > > Yes, I do well remember that you are unhappy with this approach.
> > > > Unfortunately, thus far, there is no solution that makes all developers
> > > > happy. You might be glad to hear that we are also looking into other
> > > > solutions, each of which makes some other developers unhappy. So we
> > > > are at least not picking on you alone. :-/
> > >
> > > No worries I do not feel like a whipping boy here. But do expect me to
> > > argue against the approach. I would also appreciate it if there was some
> > > more information on other attempts, why they have failed. E.g. why
> > > pre-allocation is not an option that works well enough in most
> > > reasonable workloads.
> > Pre-allocating has some drawbacks:
> >
> > a) It is impossible to predict how many pages will be required to
> > cover a demand that is controlled by different workloads on
> > various systems.
>
> Yes, this is not trivial but not a rocket science either. Remember that
> you are relying on a very dumb watermark based pcp pool from the
> allocator.
>
We rely on it, indeed. If the pcp-cache is depleted our special work is
triggered to charge our local cache(few pages) such way will also initiate
the process of pre-featching pages from the buddy allocator populating
the depleted pcp-cache. I do not have any concern here.
>
> Mimicing a similar implementation shouldn't be all that hard
> and you will get your own pool which doesn't affect other page allocator
> users as much as a bonus.
>
I see your point Michal. As i mentioned before, it is important to avoid of
having such own pools, because the aim is not to waste memory resources. A
page will be returned back to "page allocator" as soon as a scheduler place
our reclaim thread on a CPU and grace period is passed. So, the resource
can be used for other needs. What is important.
Otherwise a memory footprint is increased what is bad for low memory
conditions when OOM is involved. Just in case, it is a big issue for
mobile devices.
> > b) Memory overhead since we do not know how much pages should be
> > preloaded: 100, 200 or 300
>
> Does anybody who really needs this optimization actually cares about 300
> pages?
>
It might be an issue for embedded devices when such devices run into a
low memory condition resulting in OOM or slow allocations due to mentioned
condition. For servers and big system it will not be visible.
> > As for memory overhead, it is important to reduce it because of
> > embedded devices like phones, where a low memory condition is a
> > big issue. In that sense pre-allocating is something that we strongly
> > would like to avoid.
>
> How big "machines" are we talking about here? I would expect that really
> tiny machines would have hard times to really fill up thousands of pages
> with pointers to free...
>
I mentioned above. We can not rely on static model. We would like to
have a mechanism that gives back ASAP used pages to page allocator
for other needs.
>
> Would a similar scaling as the page allocator feasible. Really I mostly
> do care about shared nature of the pcp allocator list that one user can
> easily monopolize with this API.
>
I see your concern. pcplist can be monopolized by already existing API:
while (i < 100)
__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> > > I would also appreciate some more thoughts why we
> > > need to optimize for heavy abusers of RCU (like close(open) extremes).
> > >
> > I think here is a small misunderstanding. Please note, that is not only
> > about performance and corner cases. There is a single argument support
> > of the kvfree_rcu(ptr), where maintaining an array in time is needed.
> > The fallback of the single argument case is extrimely slow.
>
> This should be part of the changelog.
>
Hmm.. I think it is. Sorry if i missed that but i hope i mentioned about it.
> >
> > Single-argument details is here: https://lkml.org/lkml/2020/4/28/1626
>
> Error 501
>
Could you please elaborate? Do not want to speculate :)
> > > > > I strongly agree with Thomas http://lkml.kernel.org/r/87tux4kefm.fsf@nanos.tec.linutronix.de
> > > > > that this optimization is not aiming at reasonable workloads. Really, go
> > > > > with pre-allocated buffer and fallback to whatever slow path you have
> > > > > already. Exposing more internals of the allocator is not going to do any
> > > > > good for long term maintainability.
> > > >
> > > > I suggest that you carefully re-read the thread following that email.
> > >
> > > I clearly remember Thomas not being particularly happy that you optimize
> > > for a corner case. I do not remember there being a consensus that this
> > > is the right approach. There was some consensus that this is better than
> > > a gfp flag. Still quite bad though if you ask me.
> > >
> > > > Given a choice between making users unhappy and making developers
> > > > unhappy, I will side with the users each and every time.
> > >
> > > Well, let me rephrase. It is not only about me (as a developer) being
> > > unhappy but also all the side effects this would have for users when
> > > performance of their favorite workload declines for no apparent reason
> > > just because pcp caches are depleted by an unrelated process.
> > >
> > If depleted, we have a special worker that charge it. From the other hand,
> > the pcplist can be depleted by its nature, what _is_ not wrong. But just
> > in case we secure it since you had a concern about it.
>
> pcp free lists should ever get empty when we run out of memory and need
> to reclaim. Otherwise they are constantly refilled/rebalanced on demand.
> The fact that you are refilling them from outside just suggest that you
> are operating on a wrong layer. Really, create your own pool of pages
> and rebalance them based on the workload.
>
I covered it above.
> > Could you please specify a real test case or workload you are talking about?
>
> I am not a performance expert but essentially any memory allocator heavy
> workload might notice. I am pretty sure Mel would tell you more.
>
OK.
Thank you for your comments, Michal!
--
Vlad Rezki
next prev parent reply other threads:[~2020-09-22 13:13 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-18 19:48 [PATCH 0/4] kvfree_rcu() and _LOCK_NESTING/_PREEMPT_RT Uladzislau Rezki (Sony)
2020-09-18 19:48 ` [PATCH 1/4] rcu/tree: Add a work to allocate pages from regular context Uladzislau Rezki (Sony)
2020-09-18 19:48 ` [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func Uladzislau Rezki (Sony)
2020-09-21 7:47 ` Michal Hocko
2020-09-21 15:45 ` Paul E. McKenney
2020-09-21 16:03 ` Michal Hocko
2020-09-21 19:48 ` Uladzislau Rezki
2020-09-22 7:50 ` Michal Hocko
2020-09-22 13:12 ` Uladzislau Rezki [this message]
2020-09-22 15:35 ` Michal Hocko
2020-09-23 10:37 ` Mel Gorman
2020-09-23 15:41 ` Paul E. McKenney
2020-09-23 23:22 ` Mel Gorman
2020-09-24 8:16 ` Uladzislau Rezki
2020-09-24 11:16 ` Peter Zijlstra
2020-09-24 15:16 ` Uladzislau Rezki
2020-09-24 11:19 ` Peter Zijlstra
2020-09-24 15:21 ` Uladzislau Rezki
2020-09-25 8:15 ` Peter Zijlstra
2020-09-25 10:25 ` Uladzislau Rezki
2020-09-24 15:38 ` Paul E. McKenney
2020-09-25 8:26 ` Peter Zijlstra
2020-09-26 14:37 ` Paul E. McKenney
2020-09-25 8:05 ` Michal Hocko
2020-09-25 15:31 ` Uladzislau Rezki
2020-09-25 15:47 ` Michal Hocko
2020-09-29 16:25 ` Uladzislau Rezki
2020-09-30 9:27 ` Michal Hocko
2020-09-30 12:35 ` Uladzislau Rezki
2020-09-30 12:44 ` Michal Hocko
2020-09-30 13:39 ` Uladzislau Rezki
2020-09-30 16:46 ` Michal Hocko
2020-09-30 20:36 ` Uladzislau Rezki
2020-09-30 15:25 ` Joel Fernandes
2020-09-30 16:48 ` Michal Hocko
2020-09-30 17:03 ` Joel Fernandes
2020-09-30 17:22 ` Michal Hocko
2020-09-30 17:48 ` Joel Fernandes
2020-09-25 16:17 ` Mel Gorman
2020-09-25 17:57 ` Uladzislau Rezki
2020-09-22 15:49 ` Paul E. McKenney
2020-09-22 3:35 ` Paul E. McKenney
2020-09-22 8:03 ` Michal Hocko
2020-09-22 15:46 ` Paul E. McKenney
2020-09-23 11:27 ` Uladzislau Rezki
2020-09-29 10:15 ` Vlastimil Babka
2020-09-29 22:07 ` Uladzislau Rezki
2020-09-30 10:35 ` Michal Hocko
2020-10-01 19:32 ` Uladzislau Rezki
2020-09-30 14:39 ` Vlastimil Babka
2020-09-30 15:37 ` Joel Fernandes
2020-10-01 19:26 ` Uladzislau Rezki
2020-10-02 7:11 ` Michal Hocko
2020-10-02 8:50 ` Mel Gorman
2020-10-02 9:05 ` Michal Hocko
2020-10-05 15:08 ` Uladzislau Rezki
2020-10-05 15:41 ` Michal Hocko
2020-10-06 22:25 ` Uladzislau Rezki
2020-10-07 10:02 ` Michal Hocko
2020-10-07 11:02 ` Uladzislau Rezki
2020-10-02 9:07 ` Peter Zijlstra
2020-10-02 9:45 ` Mel Gorman
2020-10-02 9:58 ` Peter Zijlstra
2020-10-02 10:19 ` Mel Gorman
2020-10-02 14:41 ` Paul E. McKenney
2020-10-06 10:03 ` Mel Gorman
2020-10-06 15:41 ` Paul E. McKenney
2020-10-05 13:58 ` Uladzislau Rezki
2020-10-02 8:06 ` Mel Gorman
2020-10-05 14:12 ` Uladzislau Rezki
2020-09-18 19:48 ` [PATCH 3/4] rcu/tree: use " Uladzislau Rezki (Sony)
2020-09-18 19:48 ` [PATCH 4/4] rcu/tree: Use schedule_delayed_work() instead of WQ_HIGHPRI queue Uladzislau Rezki (Sony)
2020-09-20 15:06 ` Paul E. McKenney
2020-09-21 13:27 ` Uladzislau Rezki
2020-09-18 22:15 ` [PATCH 0/4] kvfree_rcu() and _LOCK_NESTING/_PREEMPT_RT Paul E. McKenney
2020-09-30 15:52 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200922131257.GA29241@pc636 \
--to=urezki@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bigeasy@linutronix.de \
--cc=joel@joelfernandes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.com \
--cc=oleksiy.avramchenko@sonymobile.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=rcu@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=tytso@mit.edu \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.