All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: avi@redhat.com, nate@cpanel.net, cl@linux-foundation.org,
	oleg@redhat.com, axboe@kernel.dk, vgoyal@redhat.com,
	linux-kernel@vger.kernel.org, jaxboe@fusionio.com
Subject: Re: [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock
Date: Tue, 27 Dec 2011 14:22:23 -0800	[thread overview]
Message-ID: <20111227222223.GI17712@google.com> (raw)
In-Reply-To: <20111227135836.7102f41b.akpm@linux-foundation.org>

Hello, Andrew.

On Tue, Dec 27, 2011 at 01:58:36PM -0800, Andrew Morton wrote:
> >  But that would
> > involved a lot more churn and complexity without much added benefit,
> > given that this type of use cases aren't expected to be common - and
> > I'm fairly sure it isn't given track record of past few years.
> 
> I don't think it would be too hard to add an alloc_percpu_gfp().  Add
> the gfp_t to a small number of functions (two or three?) then change
> pcpu_mem_zalloc() to always use kzalloc() if (flags & GFP_KERNEL !=
> GFP_KERNEL).  And that's it?

Hmmm... What do you mean "always use kzalloc()"?  Percpu memory can't
be served by kzalloc().  They need to be congruent.

Currently, percpu allocation happens in two stages.  The first stage
is locating congruent address areas in vmalloc space (so that static
percpu offsets among different CPUs can be maintained for dynamic
percpu areas).  The second stage is actually allocating pages and
populating those areas.  Because percpu memory is expensive, the
allocator keeps both steps on demand.  When it runs out of address
space, it gets some and the addresses are filled only when the memory
areas are actually handed out.

Unfortunately, both steps involve vmalloc machinery and GFP_KERNEL
assumption reaches into arch specific code for page table allocation.
So, making all those steps reclaim path friendly would be quite a
churn.

*If* we're gonna solve this at the allocator level, we'll probably end
up with front buffer in percpu allocator, which buffers populated
percpu areas to give out for reclaim path allocations, which is
doable, would involve a lot more complexity and would probably need to
keep more percpu memory reserved to stay generic.

> But the question is: is this a *good* thing to do?  It would be nice if
> kernel developers understood that GFP_KERNEL is strongly preferred and
> that they should put in effort to use it.  But there's a strong
> tendency for people to get themselves into a sticky corner then take
> the easy way out, resulting in less robust code.  Maybe calling the
> function alloc_percpu_i_really_suck() would convey the hint.

I fully agree and have been pushing back pretty hard against people
trying to allocate percpu memory from !GFP_KERNEL context and I want
to keep it that way.  But, to me, this one seems like a valid use
case.  If we want to track dynamic associations in IO path, short of
allocating for all combinatorial combinations, we can't avoid
performing NOIO allocations.  Of course, the code path should be
careful so that it's not too demanding on the allocator and must be
able to degrade operation gracefully on allocation failures.  It's a
special case but a valid one at that.

Thanks.

-- 
tejun

  reply	other threads:[~2011-12-27 22:22 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-22 21:45 [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Tejun Heo
2011-12-22 21:45 ` [PATCH 1/7] mempool: fix and document synchronization and memory barrier usage Tejun Heo
2011-12-22 21:45 ` [PATCH 2/7] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy() Tejun Heo
2011-12-22 21:45 ` [PATCH 3/7] mempool: fix first round failure behavior Tejun Heo
2011-12-22 21:45 ` [PATCH 4/7] mempool: factor out mempool_fill() Tejun Heo
2011-12-22 21:45 ` [PATCH 5/7] mempool: separate out __mempool_create() Tejun Heo
2011-12-22 21:45 ` [PATCH 6/7] mempool, percpu: implement percpu mempool Tejun Heo
2011-12-22 21:45 ` [PATCH 7/7] block: fix deadlock through percpu allocation in blk-cgroup Tejun Heo
2011-12-23  1:00   ` Vivek Goyal
2011-12-23 22:54     ` Tejun Heo
2011-12-22 21:59 ` [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Andrew Morton
2011-12-22 22:09   ` Tejun Heo
2011-12-22 22:20     ` Andrew Morton
2011-12-22 22:41       ` Tejun Heo
2011-12-22 22:54         ` Andrew Morton
2011-12-22 23:00           ` Tejun Heo
2011-12-22 23:16             ` Andrew Morton
2011-12-22 23:24               ` Tejun Heo
2011-12-22 23:41                 ` Andrew Morton
2011-12-22 23:54                   ` Tejun Heo
2011-12-23  1:14                     ` Andrew Morton
2011-12-23 15:17                       ` Vivek Goyal
2011-12-27 18:34                       ` Tejun Heo
2011-12-27 21:20                         ` Andrew Morton
2011-12-27 21:44                           ` Tejun Heo
2011-12-27 21:58                             ` Andrew Morton
2011-12-27 22:22                               ` Tejun Heo [this message]
2011-12-23  1:21                   ` Vivek Goyal
2011-12-23  1:38                     ` Andrew Morton
2011-12-23  2:54                       ` Vivek Goyal
2011-12-23  3:11                         ` Andrew Morton
2011-12-23 14:58                           ` Vivek Goyal
2011-12-27 21:25                             ` Andrew Morton
2011-12-27 22:07                               ` Tejun Heo
2011-12-27 22:21                                 ` Andrew Morton
2011-12-27 22:30                                   ` Tejun Heo
2012-01-16 15:26                                     ` Vivek Goyal
2011-12-23  1:40       ` Vivek Goyal
2011-12-23  1:58         ` Andrew Morton
2011-12-23  2:56           ` Vivek Goyal
2011-12-26  6:05             ` KAMEZAWA Hiroyuki
2011-12-27 17:52               ` Tejun Heo
2011-12-28  0:14                 ` KAMEZAWA Hiroyuki
2011-12-28  0:41                   ` Tejun Heo
2012-01-05  1:28                     ` Tejun Heo
2012-01-16 15:28                       ` Vivek Goyal
2012-02-09 23:58                       ` Tejun Heo
2012-02-10 16:26                         ` Vivek Goyal
2012-02-13 22:31                           ` Tejun Heo
2012-02-15 15:43                             ` Vivek Goyal
2011-12-23 14:46           ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111227222223.GI17712@google.com \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=cl@linux-foundation.org \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nate@cpanel.net \
    --cc=oleg@redhat.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.