From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
John Dias <joaodias@google.com>
Subject: Re: [RESEND][PATCH v2] mm: don't call lru draining in the nested lru_cache_disable
Date: Tue, 18 Jan 2022 16:12:54 -0800
Message-ID: <YedXhpwURNTkW1Z3@google.com>
In-Reply-To: <YeVzWlrojI1+buQx@dhcp22.suse.cz>
On Mon, Jan 17, 2022 at 02:47:06PM +0100, Michal Hocko wrote:
> On Thu 30-12-21 11:36:27, Minchan Kim wrote:
> > lru_cache_disable involves IPIs to drain the pagevec of each core,
> > which sometimes takes quite a long time to complete depending
> > on how busy the CPUs are, which makes allocation slow, up to
> > several hundred milliseconds. Furthermore, the repeated draining
> > in alloc_contig_range makes things worse, considering that callers
> > of alloc_contig_range usually try multiple times in a loop.
> >
> > This patch makes the lru_cache_disable aware of the fact the
> > pagevec was already disabled. With that, user of alloc_contig_range
> > can disable the lru cache in advance in their context during the
> > repeated trial so they can avoid the multiple costly draining
> > in cma allocation.
>
> Do you have any numbers on any improvements?
LRU draining accounted for more than 50% of the overhead of a 20M CMA alloc.
>
> Now to the change. I do not like this much to be honest. LRU cache
> disabling is a complex synchronization scheme implemented in
> __lru_add_drain_all now you are stacking another level on top of that.
>
> More fundamentally though. I am not sure I understand the problem TBH.
The problem is that this kind of IPI-style draining, which uses a
normal-priority workqueue, can take a long time depending on how busy
the system's CPUs are.
> What prevents you from calling lru_cache_disable at the cma level in the
> first place?
Do you mean moving the call from alloc_contig_range up to the caller
layer? So virtio_mem_fake_online, too? That could work and makes sense
from a performance perspective, since the upper layer usually calls
alloc_contig_range multiple times in a retry loop.

Having said that, it is semantically not good: why should the upper
layer know how alloc_contig_range works internally (LRU disabling is
too low-level)? But I chose performance here.
Here is an example of why the stacking is needed: cma_alloc can also
be called from outside. One use case is a caller that tries:

	lru_cache_disable();
	for (order = 10; order >= 0; order--) {
		page = cma_alloc(1 << order);
		if (page)
			break;
	}
	lru_cache_enable();

Here, disabling the LRU cache outside of cma_alloc is much better
than doing it inside. That's why I put it outside.
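
To make the nesting concrete, here is a minimal userspace sketch of the
idea, not the kernel code itself. It assumes C11 atomics and pthreads as
stand-ins for atomic_t and the kernel mutex, and it follows the ordering
in the quoted patch (drain first, then increment). Every sketch_* name is
hypothetical and exists only for illustration; sketch_drain_all merely
counts how often the expensive drain would have run.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
static atomic_int sketch_disable_count;	/* zero-initialized: cache enabled */
static int sketch_drain_calls;		/* drains performed, protected by sketch_lock */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the costly all-CPU drain; here it only counts invocations. */
static void sketch_drain_all(void)
{
	sketch_drain_calls++;
}

/* Userspace analogue of atomic_inc_not_zero(): increment unless the value is 0. */
static bool sketch_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

void sketch_lru_cache_disable(void)
{
	/* The lock makes a nested caller wait until a running drain has finished. */
	pthread_mutex_lock(&sketch_lock);
	if (sketch_inc_not_zero(&sketch_disable_count)) {
		/* Already disabled by an outer caller: skip the drain. */
		pthread_mutex_unlock(&sketch_lock);
		return;
	}
	sketch_drain_all();
	atomic_fetch_add(&sketch_disable_count, 1);
	pthread_mutex_unlock(&sketch_lock);
}

void sketch_lru_cache_enable(void)
{
	int prev = atomic_fetch_sub(&sketch_disable_count, 1);

	/* An enable without a matching disable would be a caller bug. */
	assert(prev > 0);
	(void)prev;
}

bool sketch_lru_cache_disabled(void)
{
	return atomic_load(&sketch_disable_count) != 0;
}
```

With this shape, an outer sketch_lru_cache_disable() followed by any
number of nested disable/enable pairs triggers a single drain, which is
the saving the retry-loop use case above is after.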
>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > * from v1 - https://lore.kernel.org/lkml/20211206221006.946661-1-minchan@kernel.org/
> > * fix lru_cache_disable race - akpm
> >
> > include/linux/swap.h | 14 ++------------
> > mm/cma.c | 5 +++++
> > mm/swap.c | 30 ++++++++++++++++++++++++++++--
> > 3 files changed, 35 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index ba52f3a3478e..fe18e86a4f13 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -348,19 +348,9 @@ extern void lru_note_cost_page(struct page *);
> > extern void lru_cache_add(struct page *);
> > extern void mark_page_accessed(struct page *);
> >
> > -extern atomic_t lru_disable_count;
> > -
> > -static inline bool lru_cache_disabled(void)
> > -{
> > - return atomic_read(&lru_disable_count);
> > -}
> > -
> > -static inline void lru_cache_enable(void)
> > -{
> > - atomic_dec(&lru_disable_count);
> > -}
> > -
> > +extern bool lru_cache_disabled(void);
> > extern void lru_cache_disable(void);
> > +extern void lru_cache_enable(void);
> > extern void lru_add_drain(void);
> > extern void lru_add_drain_cpu(int cpu);
> > extern void lru_add_drain_cpu_zone(struct zone *zone);
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 995e15480937..60be555c5b95 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -30,6 +30,7 @@
> > #include <linux/cma.h>
> > #include <linux/highmem.h>
> > #include <linux/io.h>
> > +#include <linux/swap.h>
> > #include <linux/kmemleak.h>
> > #include <trace/events/cma.h>
> >
> > @@ -453,6 +454,8 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> > if (bitmap_count > bitmap_maxno)
> > goto out;
> >
> > + lru_cache_disable();
> > +
> > for (;;) {
> > spin_lock_irq(&cma->lock);
> > bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
> > @@ -492,6 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> > start = bitmap_no + mask + 1;
> > }
> >
> > + lru_cache_enable();
> > +
> > trace_cma_alloc_finish(cma->name, pfn, page, count, align);
> >
> > /*
> > diff --git a/mm/swap.c b/mm/swap.c
> > index af3cad4e5378..5f89d7c9a54e 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -847,7 +847,17 @@ void lru_add_drain_all(void)
> > }
> > #endif /* CONFIG_SMP */
> >
> > -atomic_t lru_disable_count = ATOMIC_INIT(0);
> > +static atomic_t lru_disable_count = ATOMIC_INIT(0);
> > +
> > +bool lru_cache_disabled(void)
> > +{
> > + return atomic_read(&lru_disable_count) != 0;
> > +}
> > +
> > +void lru_cache_enable(void)
> > +{
> > + atomic_dec(&lru_disable_count);
> > +}
> >
> > /*
> > * lru_cache_disable() needs to be called before we start compiling
> > @@ -859,7 +869,21 @@ atomic_t lru_disable_count = ATOMIC_INIT(0);
> > */
> > void lru_cache_disable(void)
> > {
> > - atomic_inc(&lru_disable_count);
> > + static DEFINE_MUTEX(lock);
> > +
> > + /*
> > + * The lock guarantees lru_cache is drained when the function
> > + * returns.
> > + */
> > + mutex_lock(&lock);
> > + /*
> > + * If someone has already disabled lru_cache, just return after
> > + * increasing the lru_disable_count.
> > + */
> > + if (atomic_inc_not_zero(&lru_disable_count)) {
> > + mutex_unlock(&lock);
> > + return;
> > + }
> > #ifdef CONFIG_SMP
> > /*
> > * lru_add_drain_all in the force mode will schedule draining on
> > @@ -873,6 +897,8 @@ void lru_cache_disable(void)
> > #else
> > lru_add_and_bh_lrus_drain();
> > #endif
> > + atomic_inc(&lru_disable_count);
> > + mutex_unlock(&lock);
> > }
> >
> > /**
> > --
> > 2.34.1.448.ga2b2bfdf31-goog
>
> --
> Michal Hocko
> SUSE Labs
>