All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Tejun Heo <tj@kernel.org>
Cc: avi@redhat.com, nate@cpanel.net, cl@linux-foundation.org,
	oleg@redhat.com, axboe@kernel.dk, vgoyal@redhat.com,
	linux-kernel@vger.kernel.org, jaxboe@fusionio.com
Subject: Re: [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock
Date: Tue, 27 Dec 2011 13:20:56 -0800	[thread overview]
Message-ID: <20111227132056.44ee9281.akpm@linux-foundation.org> (raw)
In-Reply-To: <20111227183402.GF17712@google.com>

On Tue, 27 Dec 2011 10:34:02 -0800 Tejun Heo <tj@kernel.org> wrote:

> > Separately...
> > 
> > Mixing mempools and percpu_alloc() in the proposed fashion seems a
> > pretty poor fit.  mempools are for high-frequency low-level allocations
> > which have key characteristics: there are typically a finite number of
> > elements in flight and we *know* that elements are being freed in a
> > timely manner.
> 
> mempool is a pool of memory allocations.  Nothing more, nothing less.

No, there is more.  An important concept in mempool is that it is safe
to indefinitely wait for items because it is guaranteed that the
in-flight ones will be returned in a timely fashion.

> > More significantly, it's pretty unreliable: if the allocations outpace
> > the kernel thread's ability to refill the pool, all we can do is to
> > wait for the kernel thread to do some work.  But we're holding
> > low-level locks while doing that wait, which will block the kernel
> > thread.  Deadlock.
> 
> Ummm... if you try to allocate with GFP_KERNEL from IO path, deadlock
> of course is possible.

It is deadlockable when used with GFP_NOFS, or GFP_NOIO.  Anything with
__GFP_WAIT.

The only "safe" usage of the proposed interface is when the caller
performs the per-cpu allocation with !__GFP_WAIT (ie: GFP_ATOMIC) and
when the caller is prepared to repeatedly poll until the allocation
succeeds.

That's a narrow and rare calling pattern which doesn't merit being
formalised into a core kernel API.  It is one which encourages poor
code in callers, which should instead be reworked to use GFP_KERNEL, in
which case the interface isn't needed.

It's also a dangerous API because people could call it with GFP_NOIO
(for example) and risk deadlocks.  Waiting for a kernel thread to
perform a GFP_KERNEL allocation is equivalent to directly performing
that allocation, with GFP_KERNEL.  The only real effect of handing over
to a kernel thread to do the work is to hide your bug from lockdep.

>  That's the *whole* reason why allocation
> buffering is used there.  It's filled from GFP_KERNEL context and
> consumed from GFO_NOIO and as pointed out multiple times the allocaion
> there is infrequent and can be opportunistic.  Sans the use of small
> buffering items, this isn't any different from deferring allocation to
> different context.  There's no guarantee when that allocation would
> happen but in practice both will be reliable enough for the given use
> case.

"reliable enough" is not the standard to which we aspire.

Look, the core problem here is that this block code is trying to
allocate from an inappropriate context.  Rather than hacking around
adding stuff to make this work most-of-the-time, how about fixing the
block code to perform the allocation from the correct context?

> I don't necessarily insist on using mempool here but all the given
> objections seem bogus to me.  The amount of code added is minimal and
> straight-forward.  It doesn't change what mempool is or what it does
> at all and the usage fits the problem to be solved.  I can't really
> understand what the objection is about.

It's adding complexity to core kernel.  It's adding a weak and
dangerous interface which will encourage poor code in callers.

And why are we doing all of this?  So we can continue to use what is
now poor code at one particular site in the block layer!  If the block
code wants to run alloc_percpu() (which requires GFP_KERNEL context)
then it should be reworked to do so from GFP_KERNEL context.  For that
is the *best* solution, no?


  reply	other threads:[~2011-12-27 21:21 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-22 21:45 [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Tejun Heo
2011-12-22 21:45 ` [PATCH 1/7] mempool: fix and document synchronization and memory barrier usage Tejun Heo
2011-12-22 21:45 ` [PATCH 2/7] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy() Tejun Heo
2011-12-22 21:45 ` [PATCH 3/7] mempool: fix first round failure behavior Tejun Heo
2011-12-22 21:45 ` [PATCH 4/7] mempool: factor out mempool_fill() Tejun Heo
2011-12-22 21:45 ` [PATCH 5/7] mempool: separate out __mempool_create() Tejun Heo
2011-12-22 21:45 ` [PATCH 6/7] mempool, percpu: implement percpu mempool Tejun Heo
2011-12-22 21:45 ` [PATCH 7/7] block: fix deadlock through percpu allocation in blk-cgroup Tejun Heo
2011-12-23  1:00   ` Vivek Goyal
2011-12-23 22:54     ` Tejun Heo
2011-12-22 21:59 ` [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Andrew Morton
2011-12-22 22:09   ` Tejun Heo
2011-12-22 22:20     ` Andrew Morton
2011-12-22 22:41       ` Tejun Heo
2011-12-22 22:54         ` Andrew Morton
2011-12-22 23:00           ` Tejun Heo
2011-12-22 23:16             ` Andrew Morton
2011-12-22 23:24               ` Tejun Heo
2011-12-22 23:41                 ` Andrew Morton
2011-12-22 23:54                   ` Tejun Heo
2011-12-23  1:14                     ` Andrew Morton
2011-12-23 15:17                       ` Vivek Goyal
2011-12-27 18:34                       ` Tejun Heo
2011-12-27 21:20                         ` Andrew Morton [this message]
2011-12-27 21:44                           ` Tejun Heo
2011-12-27 21:58                             ` Andrew Morton
2011-12-27 22:22                               ` Tejun Heo
2011-12-23  1:21                   ` Vivek Goyal
2011-12-23  1:38                     ` Andrew Morton
2011-12-23  2:54                       ` Vivek Goyal
2011-12-23  3:11                         ` Andrew Morton
2011-12-23 14:58                           ` Vivek Goyal
2011-12-27 21:25                             ` Andrew Morton
2011-12-27 22:07                               ` Tejun Heo
2011-12-27 22:21                                 ` Andrew Morton
2011-12-27 22:30                                   ` Tejun Heo
2012-01-16 15:26                                     ` Vivek Goyal
2011-12-23  1:40       ` Vivek Goyal
2011-12-23  1:58         ` Andrew Morton
2011-12-23  2:56           ` Vivek Goyal
2011-12-26  6:05             ` KAMEZAWA Hiroyuki
2011-12-27 17:52               ` Tejun Heo
2011-12-28  0:14                 ` KAMEZAWA Hiroyuki
2011-12-28  0:41                   ` Tejun Heo
2012-01-05  1:28                     ` Tejun Heo
2012-01-16 15:28                       ` Vivek Goyal
2012-02-09 23:58                       ` Tejun Heo
2012-02-10 16:26                         ` Vivek Goyal
2012-02-13 22:31                           ` Tejun Heo
2012-02-15 15:43                             ` Vivek Goyal
2011-12-23 14:46           ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111227132056.44ee9281.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=cl@linux-foundation.org \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nate@cpanel.net \
    --cc=oleg@redhat.com \
    --cc=tj@kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.