linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Anatoly Stepanov <astepanov@cloudlinux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	vdavydov.dev@gmail.com, umka@cloudlinux.com,
	panda@cloudlinux.com, vmeshkov@cloudlinux.com
Subject: Re: [PATCH] mm: use vmalloc fallback path for certain memcg allocations
Date: Tue, 6 Dec 2016 09:47:35 +0100	[thread overview]
Message-ID: <20161206084734.GC18664@dhcp22.suse.cz> (raw)
In-Reply-To: <20161202220913.GA536156@stepanov.centos7>

On Sat 03-12-16 01:09:13, Anatoly Stepanov wrote:
> On Mon, Dec 05, 2016 at 06:23:26AM +0100, Michal Hocko wrote:
> > On Fri 02-12-16 09:54:17, Anatoly Stepanov wrote:
> > > Alex, Vlastimil, Michal, thanks for your responses!
> > > 
> > > On Fri, Dec 02, 2016 at 10:19:33AM +0100, Michal Hocko wrote:
> > > > Thanks for CCing me Vlastimil
> > > > 
> > > > On Fri 02-12-16 09:44:23, Vlastimil Babka wrote:
> > > > > On 12/01/2016 02:16 AM, Anatoly Stepanov wrote:
> > > > > > As memcg array size can be up to:
> > > > > > sizeof(struct memcg_cache_array) + kmemcg_id * sizeof(void *);
> > > > > > 
> > > > > > where kmemcg_id can be up to MEMCG_CACHES_MAX_SIZE.
> > > > > > 
> > > > > > When a memcg instance count is large enough it can lead
> > > > > > to high order allocations up to order 7.
> > > > 
> > > > This is definitely not nice and worth fixing! I am just wondering
> > > > whether this is something you have encountered in the real life. Having
> > > > thousands of memcgs sounds quite crazy^Wscary to me. I am not at all
> > > > sure we are prepared for that and some controllers would have real
> > > > issues with it AFAIR.
> > > 
> > > In our company we use a custom-made lightweight container technology,
> > > and we can have up to several thousand containers on a server.
> > > So those high-order allocations were observed on a real production workload.
> > 
> > OK, this is interesting. Definitely worth mentioning in the changelog!
> > 
> > [...]
> > > > 	/*
> > > > 	 * Do not invoke OOM killer for larger requests as we can fall
> > > > 	 * back to the vmalloc
> > > > 	 */
> > > > 	if (size > PAGE_SIZE)
> > > > 		gfp_mask |= __GFP_NORETRY | __GFP_NOWARN;
> > > 
> > > I think we should check against PAGE_ALLOC_COSTLY_ORDER anyway, as
> > > there's no big need to allocate large contiguous chunks here, at the
> > > same time someone in the kernel might really need them.
> > 
> > PAGE_ALLOC_COSTLY_ORDER is and should remain an internal implementation
> > detail of the page allocator and shouldn't spread much outside of it.
> > __GFP_NORETRY will already make sure we do not push hard here.
> 
> Maybe I didn't express my thoughts well, so let's discuss in more detail:
> 
> 1. Yes, we don't try that hard to allocate high-order blocks with
> __GFP_NORETRY, but we can still do compaction and direct reclaim,
> which can be heavy for a large chunk.  In the worst case we can even
> fail to find the chunk after all the reclaim/compaction steps were made.

Yes, this is correct. But I am not sure what you are trying to say
by that. High-order requests are a bit of a problem. That's why
__GFP_NORETRY is implicit here. It also guarantees that we won't hit
the OOM killer, because we do have a reasonable fallback. I do not see
the point of playing with COSTLY_ORDER, though. The page allocator knows
how to handle those requests and we try hard to keep them from being too
disruptive. Or am I still missing your point?
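
To make the check under discussion concrete, here is a rough userspace
model of it. This is only a sketch: the 4 KiB page size, the SKETCH_GFP_*
bit values and both helper names are assumptions made up for
illustration, they are not the kernel's definitions.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; the real value is architecture dependent. */
#define SKETCH_PAGE_SIZE	4096UL

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define SKETCH_GFP_KERNEL	0x1u
#define SKETCH_GFP_NORETRY	0x2u
#define SKETCH_GFP_NOWARN	0x4u

/* Smallest order such that (SKETCH_PAGE_SIZE << order) covers size. */
static unsigned int gfp_sketch_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Mirrors the quoted "size > PAGE_SIZE" check: only multi-page requests
 * get the "fail fast, stay quiet" bits, because only those have a
 * vmalloc fallback worth taking.
 */
static unsigned int gfp_sketch_mask(unsigned int gfp_mask, size_t size)
{
	if (size > SKETCH_PAGE_SIZE)
		gfp_mask |= SKETCH_GFP_NORETRY | SKETCH_GFP_NOWARN;
	return gfp_mask;
}
```

Under these assumptions a 512 KiB request maps to order 7 and takes the
multi-page branch, while anything up to a single page keeps the mask it
was given.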

> 2. The second point is, even if we get the desired chunk quickly, we
> end up wasting large contiguous chunks, which might be needed for CMA
> or some h/w driver (DMA, for instance) that can't use non-contiguous
> chunks.

On the other hand vmalloc is not free either.

> BTW, there are a few examples in the kernel, such as alloc_fdmem(), which
> use that "costly order" idea for the fallback.

I am not very familiar with this code, so it is hard for me to comment.
Anyway, I am not entirely sure that code is still valid. We do not do
excessive reclaim or compaction for costly orders. Without __GFP_REPEAT
they are mostly an optimistic attempt these days. So the assumption
it was based on back in 2011 might no longer be true.
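
For comparison, the "costly order" cutoff being proposed would look
roughly like the userspace sketch below. It assumes 4 KiB pages and a
costly order of 3 (which, as far as I know, is what
PAGE_ALLOC_COSTLY_ORDER is defined to). The helper name is hypothetical
and this is not a copy of alloc_fdmem().

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the real value is architecture dependent. */
#define SKETCH_PAGE_SIZE	4096UL
/* Assumption: stands in for PAGE_ALLOC_COSTLY_ORDER. */
#define SKETCH_COSTLY_ORDER	3

/*
 * True when the request is small enough that a contiguous allocation
 * would be attempted first. Larger requests would skip that attempt
 * and go straight to the vmalloc path.
 */
static bool sketch_try_contiguous_first(size_t size)
{
	return size <= (SKETCH_PAGE_SIZE << SKETCH_COSTLY_ORDER);
}
```

With these numbers the cutoff sits at 32 KiB. The difference from the
"size > PAGE_SIZE" check is that requests above the cutoff never reach
the page allocator at all.
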
-- 
Michal Hocko
SUSE Labs


