From: Joel Fernandes <joel@joelfernandes.org>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Thomas Garnier <thgarnie@google.com>,
Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v1 2/2] mm: add priority threshold to __purge_vmap_area_lazy()
Date: Wed, 6 Mar 2019 11:25:19 -0500
Message-ID: <20190306162519.GB193418@google.com>
In-Reply-To: <20190129173936.4sscooiybzbhos77@pc636>

On Tue, Jan 29, 2019 at 06:39:36PM +0100, Uladzislau Rezki wrote:
> On Mon, Jan 28, 2019 at 05:45:28PM -0500, Joel Fernandes wrote:
> > On Thu, Jan 24, 2019 at 12:56:48PM +0100, Uladzislau Rezki (Sony) wrote:
> > > commit 763b218ddfaf ("mm: add preempt points into
> > > __purge_vmap_area_lazy()")
> > >
> > > introduced some preempt points, one of those is making an
> > > allocation more prioritized over lazy free of vmap areas.
> > >
> > > Prioritizing an allocation over freeing does not work well
> > > all the time, i.e. it should be rather a compromise.
> > >
> > > 1) Number of lazy pages directly influence on busy list length
> > > thus on operations like: allocation, lookup, unmap, remove, etc.
> > >
> > > 2) Under heavy stress of vmalloc subsystem i run into a situation
> > > when memory usage gets increased hitting out_of_memory -> panic
> > > state due to completely blocking of logic that frees vmap areas
> > > in the __purge_vmap_area_lazy() function.
> > >
> > > Establish a threshold passing which the freeing is prioritized
> > > back over allocation creating a balance between each other.
> >
> > I'm a bit concerned that this will introduce the latency back if vmap_lazy_nr
> > is greater than half of lazy_max_pages(). Which IIUC will be more likely if
> > the number of CPUs is large.
> >
> The threshold that we establish is two times more than lazy_max_pages(),
> i.e. in case of 4 system CPUs lazy_max_pages() is 24576, therefore the
> threshold is 49152, if PAGE_SIZE is 4096.
>
> It means that we allow rescheduling if vmap_lazy_nr < 49152. If vmap_lazy_nr
> is higher then we forbid rescheduling and free areas until it becomes lower
> again to stabilize the system. By doing that, we will not allow vmap_lazy_nr
> to be enormously increased.

Sorry for the late reply.

This sounds reasonable. Such an extreme situation, with vmap_lazy_nr at twice
lazy_max_pages(), is probably only possible under a stress test anyway,
since (hopefully) the try_purge_vmap_area_lazy() call happens often
enough to keep vmap_lazy_nr low.

Have you experimented to find the highest threshold that still prevents the
issues you're seeing? Have you tried 3x or 4x lazy_max_pages()?

I also wonder what the cost of the global TLB flush is these days on the most
common Linux architectures, whether the whole lazy vmap_area purging
mechanism is starting to get a bit dated, and whether we could do the purging
inline as areas are freed. As you said, there is a cost to having this
mechanism too: as the list grows, all other operations take longer as well.

thanks,

 - Joel

> > In fact, when vmap_lazy_nr is high, that's when the latency will be the worst
> > so one could say that that's when you *should* reschedule since the frees are
> > taking too long and hurting real-time tasks.
> >
> > Could this be better solved by tweaking lazy_max_pages() such that purging is
> > more aggressive?
> >
> > Another approach could be to detect the scenario you brought up (allocations
> > happening faster than free), somehow, and avoid a reschedule?
> >
> This is what i am trying to achieve by this change.
>
> Thank you for your comments.
>
> --
> Vlad Rezki
> > >
> > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > ---
> > > mm/vmalloc.c | 18 ++++++++++++------
> > > 1 file changed, 12 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index fb4fb5fcee74..abe83f885069 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -661,23 +661,27 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> > >  	struct llist_node *valist;
> > >  	struct vmap_area *va;
> > >  	struct vmap_area *n_va;
> > > -	bool do_free = false;
> > > +	int resched_threshold;
> > >
> > >  	lockdep_assert_held(&vmap_purge_lock);
> > >
> > >  	valist = llist_del_all(&vmap_purge_list);
> > > +	if (unlikely(valist == NULL))
> > > +		return false;
> > > +
> > > +	/*
> > > +	 * TODO: to calculate a flush range without looping.
> > > +	 * The list can be up to lazy_max_pages() elements.
> > > +	 */
> > >  	llist_for_each_entry(va, valist, purge_list) {
> > >  		if (va->va_start < start)
> > >  			start = va->va_start;
> > >  		if (va->va_end > end)
> > >  			end = va->va_end;
> > > -		do_free = true;
> > >  	}
> > >
> > > -	if (!do_free)
> > > -		return false;
> > > -
> > >  	flush_tlb_kernel_range(start, end);
> > > +	resched_threshold = (int) lazy_max_pages() << 1;
> > >
> > >  	spin_lock(&vmap_area_lock);
> > >  	llist_for_each_entry_safe(va, n_va, valist, purge_list) {
> > > @@ -685,7 +689,9 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> > >
> > >  		__free_vmap_area(va);
> > >  		atomic_sub(nr, &vmap_lazy_nr);
> > > -		cond_resched_lock(&vmap_area_lock);
> > > +
> > > +		if (atomic_read(&vmap_lazy_nr) < resched_threshold)
> > > +			cond_resched_lock(&vmap_area_lock);
> > >  	}
> > >  	spin_unlock(&vmap_area_lock);
> > >  	return true;
> > > --
> > > 2.11.0
> > >