Re: [PATCH] mm: use vmalloc fallback path for certain memcg allocations

From: Michal Hocko <mhocko@suse.com>
To: Anatoly Stepanov <astepanov@cloudlinux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	vdavydov.dev@gmail.com, umka@cloudlinux.com,
	panda@cloudlinux.com, vmeshkov@cloudlinux.com
Subject: Re: [PATCH] mm: use vmalloc fallback path for certain memcg allocations
Date: Thu, 8 Dec 2016 09:45:25 +0100
Message-ID: <20161208084525.GA8330@dhcp22.suse.cz>
In-Reply-To: <20161203155522.GA648490@stepanov.centos7>

On Sat 03-12-16 18:55:22, Anatoly Stepanov wrote:
> On Tue, Dec 06, 2016 at 09:47:35AM +0100, Michal Hocko wrote:
> > On Sat 03-12-16 01:09:13, Anatoly Stepanov wrote:
> > > On Mon, Dec 05, 2016 at 06:23:26AM +0100, Michal Hocko wrote:
> > > > On Fri 02-12-16 09:54:17, Anatoly Stepanov wrote:
> > > > > > Alex, Vlastimil, Michal, thanks for your responses!
> > > > > 
> > > > > On Fri, Dec 02, 2016 at 10:19:33AM +0100, Michal Hocko wrote:
> > > > > > Thanks for CCing me Vlastimil
> > > > > > 
> > > > > > On Fri 02-12-16 09:44:23, Vlastimil Babka wrote:
> > > > > > > On 12/01/2016 02:16 AM, Anatoly Stepanov wrote:
> > > > > > > > As memcg array size can be up to:
> > > > > > > > sizeof(struct memcg_cache_array) + kmemcg_id * sizeof(void *);
> > > > > > > > 
> > > > > > > > where kmemcg_id can be up to MEMCG_CACHES_MAX_SIZE.
> > > > > > > > 
> > > > > > > > When a memcg instance count is large enough it can lead
> > > > > > > > to high order allocations up to order 7.
> > > > > > 
> > > > > > This is definitely not nice and worth fixing! I am just wondering
> > > > > > > whether this is something you have encountered in real life. Having
> > > > > > thousands of memcgs sounds quite crazy^Wscary to me. I am not at all
> > > > > > sure we are prepared for that and some controllers would have real
> > > > > > issues with it AFAIR.
> > > > > 
> > > > > > In our company we use a custom-made lightweight container technology,
> > > > > > and we can have up to several thousand containers on a single server.
> > > > > > So those high-order allocations were observed on a real production workload.
> > > > 
> > > > OK, this is interesting. Definitely worth mentioning in the changelog!
> > > > 
> > > > [...]
> > > > > > 	/*
> > > > > > 	 * Do not invoke OOM killer for larger requests as we can fall
> > > > > > 	 * back to the vmalloc
> > > > > > 	 */
> > > > > > 	if (size > PAGE_SIZE)
> > > > > > 		gfp_mask |= __GFP_NORETRY | __GFP_NOWARN;
> > > > > 
> > > > > > I think we should check against PAGE_ALLOC_COSTLY_ORDER anyway, since
> > > > > > there is no real need for large contiguous chunks here, while
> > > > > > someone else in the kernel might really need them.
> > > > 
> > > > > PAGE_ALLOC_COSTLY_ORDER is, and should remain, an internal implementation
> > > > > detail of the page allocator and shouldn't spread outside it. __GFP_NORETRY
> > > > > will already make sure we do not push hard here.
> > > 
> > > > Maybe I didn't express my thoughts well, so let's discuss in more detail:
> > > > 
> > > > 1. Yes, we don't try that hard to allocate high-order blocks with
> > > > __GFP_NORETRY, but we can still do compaction and direct reclaim,
> > > > which can be heavy for a large chunk.  In the worst case we can even
> > > > fail to find the chunk after all reclaim/compaction steps have been made.
> > 
> > > Yes, this is correct. But I am not sure what you are trying to say
> > > by that. High-order requests are a bit of a problem. That's why
> > > __GFP_NORETRY is implicit here. It also guarantees that we won't hit
> > > the OOM killer because we do have a reasonable fallback. I do not see
> > > the point of playing with COSTLY_ORDER though. The page allocator knows
> > > how to handle those and we try hard to keep those requests from being
> > > too disruptive. Or am I still missing your point?
> 
> My point is, while we're trying to get a pretty big contig. chunk
> (let's say of COSTLY_SIZE), the reclaim can induce a lot of disk I/O

Not really, as I've tried to explain above. The page allocator really
doesn't try hard for costly orders and bails out early after the first
round of reclaim/compaction.
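
For a sense of scale, here is a rough userspace sketch of how the array
size quoted at the top of this thread maps to an allocation order. The
4 KiB page size and every sketch_* name are assumptions made for
illustration only; the one value taken from the kernel is
PAGE_ALLOC_COSTLY_ORDER, which is 3.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE    4096UL  /* assumption: 4 KiB pages */
#define SKETCH_COSTLY_ORDER 3       /* mirrors PAGE_ALLOC_COSTLY_ORDER */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Array size as in the patch description: header plus one pointer per id. */
static size_t sketch_array_size(size_t header_bytes, size_t nr_ids)
{
	return header_bytes + nr_ids * sizeof(void *);
}

/* A request is "costly" when its order is above the costly threshold. */
static int sketch_is_costly(unsigned int order)
{
	return order > SKETCH_COSTLY_ORDER;
}
```

With a hypothetical 16-byte header and 60000 ids on a 64-bit build,
sketch_get_order(sketch_array_size(16, 60000)) evaluates to 7, which
sketch_is_costly() reports as a costly order.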

> which can be crucial for overall system performance, at the same time
> we don't need that contig. chunk.
> 
> So, for COSTLY_SIZE chunks, vmalloc should perform better, as it's
> obviously more likely to find order-0 blocks w/o reclaim.

Again, vmalloc is not free either, and it is a problem especially on
32-bit arches.

Anyway, I think we are going in circles here and repeating the same
arguments. Let me post what I think is the right implementation of
kvmalloc and you can build on top of that.
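
To make the shape of that fallback concrete, here is a minimal
userspace sketch of the logic discussed above: try the contiguous
allocator first, mark requests larger than a page as no-retry and
no-warn, and fall back to a non-contiguous allocator on failure. This
is not the kernel implementation. Every sketch_* name is hypothetical,
malloc() stands in for both allocators, and the 8-page limit only
simulates a fragmented system.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE    4096UL                   /* assumption: 4 KiB pages */
#define SKETCH_CONTIG_LIMIT (8 * SKETCH_PAGE_SIZE)   /* simulated fragmentation */

/* Hypothetical stand-ins for __GFP_NORETRY and __GFP_NOWARN. */
enum sketch_flags {
	SKETCH_NORETRY = 1u << 0,
	SKETCH_NOWARN  = 1u << 1,
};

struct sketch_buf {
	void *ptr;
	int from_fallback;      /* 1 if the vmalloc-like path was used */
	unsigned int flags;     /* flags used for the contiguous attempt */
};

/* Stand-in for kmalloc(): pretends large contiguous requests fail. */
static void *sketch_contig_alloc(size_t size, unsigned int flags)
{
	(void)flags;            /* a real allocator would honour these */
	if (size > SKETCH_CONTIG_LIMIT)
		return NULL;
	return malloc(size);
}

/* Stand-in for vmalloc(): needs no physically contiguous memory. */
static void *sketch_fallback_alloc(size_t size)
{
	return malloc(size);
}

static struct sketch_buf sketch_kvmalloc(size_t size, unsigned int flags)
{
	struct sketch_buf buf = { NULL, 0, flags };

	/*
	 * Do not push hard (and do not warn) for larger requests,
	 * because a fallback exists.
	 */
	if (size > SKETCH_PAGE_SIZE)
		buf.flags |= SKETCH_NORETRY | SKETCH_NOWARN;

	buf.ptr = sketch_contig_alloc(size, buf.flags);

	/* Sub-page requests gain nothing from the fallback. */
	if (buf.ptr || size <= SKETCH_PAGE_SIZE)
		return buf;

	buf.ptr = sketch_fallback_alloc(size);
	buf.from_fallback = buf.ptr != NULL;
	return buf;
}
```

In this sketch both paths end in malloc(), so free() releases either
result. A kernel caller would instead have to tell the two kinds of
address apart when freeing, which is why the sketch records
from_fallback.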
-- 
Michal Hocko
SUSE Labs
