From: Vivek Goyal <vgoyal@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>,
avi@redhat.com, nate@cpanel.net, cl@linux-foundation.org,
oleg@redhat.com, axboe@kernel.dk, linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock
Date: Fri, 23 Dec 2011 10:17:25 -0500
Message-ID: <20111223151725.GC16818@redhat.com>
In-Reply-To: <20111222171432.e429c041.akpm@linux-foundation.org>
On Thu, Dec 22, 2011 at 05:14:32PM -0800, Andrew Morton wrote:
[..]
> Separately...
>
> Mixing mempools and percpu_alloc() in the proposed fashion seems a
> pretty poor fit. mempools are for high-frequency low-level allocations
> which have key characteristics: there are typically a finite number of
> elements in flight and we *know* that elements are being freed in a
> timely manner.
>
> This doesn't fit with percpu_alloc(), which is a very heavyweight
> operation requiring GFP_KERNEL and it doesn't fit with
> blkio_group_stats_cpu because blkio_group_stats_cpu does not have the
> "freed in a timely manner" behaviour.
>
> To resolve these things you've added the workqueue to keep the pool
> populated, which turns percpu_mempool into a quite different concept
> which happens to borrow some mempool code (not necessarily a bad thing).
> This will result in some memory wastage, keeping that pool full.
Tejun seems to have created one global pool for all the devices, with a
minimum of 8 objects. Keeping just 8 objects aside for the whole system
is not significant memory wastage, IMHO. It is much better than
allocating per-cpu stats up front for every possible cgroup/device pair,
irrespective of whether any IO is actually happening on that pair.
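To put the cost in perspective, the reserve amounts to something like
the following. This is only my rough paraphrase of the idea; the names
and the stand-in stats struct are made up for illustration, not the
actual percpu_mempool code from Tejun's patches:

/*
 * Illustrative only: a tiny global reserve of pre-allocated per-cpu
 * stats objects, refilled from process context where alloc_percpu()
 * is allowed to sleep.
 */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/workqueue.h>

/* Stand-in for the real per-cpu blkcg stats structure. */
struct stats_cpu {
	u64 sectors;
};

#define STATS_POOL_MIN	8	/* the "8 objects for the whole system" */

struct stats_pool {
	spinlock_t		lock;
	int			nr;	/* elements currently in reserve */
	struct stats_cpu __percpu *elems[STATS_POOL_MIN];
	struct work_struct	refill_work;
};

static struct stats_pool blkcg_stats_pool;

/* Worker: runs in process context, so sleeping allocations are fine. */
static void stats_pool_refill(struct work_struct *work)
{
	struct stats_pool *pool = container_of(work, struct stats_pool,
					       refill_work);

	for (;;) {
		struct stats_cpu __percpu *p;
		unsigned long flags;

		p = alloc_percpu(struct stats_cpu);	/* may sleep */
		if (!p)
			return;

		spin_lock_irqsave(&pool->lock, flags);
		if (pool->nr >= STATS_POOL_MIN) {
			spin_unlock_irqrestore(&pool->lock, flags);
			free_percpu(p);			/* reserve already full */
			return;
		}
		pool->elems[pool->nr++] = p;
		spin_unlock_irqrestore(&pool->lock, flags);
	}
}

static void __init stats_pool_init(void)
{
	spin_lock_init(&blkcg_stats_pool.lock);
	INIT_WORK(&blkcg_stats_pool.refill_work, stats_pool_refill);
	schedule_work(&blkcg_stats_pool.refill_work);	/* pre-fill reserve */
}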
>
> More significantly, it's pretty unreliable: if the allocations outpace
> the kernel thread's ability to refill the pool, all we can do is to
> wait for the kernel thread to do some work. But we're holding
> low-level locks while doing that wait, which will block the kernel
> thread. Deadlock.
Sorry, I did not understand this point. In the IO path we don't wait
for the kernel thread to do any work. If there are no elements in the
pool, we simply return without creating the group, and the IO is
accounted to the root group. When the next IO comes in, it tries again,
and if by that time the worker thread has managed to refill the pool,
we allocate the group.
So in a tight memory situation we will not account the IO to the actual
cgroup but to the root cgroup. Once the memory situation eases, IO will
be accounted to the right cgroup.
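To make it concrete, the IO-path side of the sketch above would look
roughly like this (again my paraphrase of the behaviour, not the code
in the patches):

/*
 * The IO path only tries a non-sleeping grab from the reserve and
 * kicks the refill worker; it never waits for it.
 */
static struct stats_cpu __percpu *stats_pool_alloc(struct stats_pool *pool)
{
	struct stats_cpu __percpu *p = NULL;
	unsigned long flags;

	spin_lock_irqsave(&pool->lock, flags);
	if (pool->nr)
		p = pool->elems[--pool->nr];
	spin_unlock_irqrestore(&pool->lock, flags);

	/* Ask the worker to top the reserve back up; do not wait for it. */
	schedule_work(&pool->refill_work);
	return p;
}

A NULL return simply means the group does not get created this time
around; nothing in this path ever sleeps waiting for the worker.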
So I can't see any deadlock happening. Did I miss something?
Thanks
Vivek