From: Tejun Heo <tj@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vivek Goyal <vgoyal@redhat.com>,
avi@redhat.com, nate@cpanel.net, cl@linux-foundation.org,
oleg@redhat.com, axboe@kernel.dk, linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock
Date: Tue, 27 Dec 2011 14:07:53 -0800 [thread overview]
Message-ID: <20111227220753.GH17712@google.com> (raw)
In-Reply-To: <20111227132501.ad7f895f.akpm@linux-foundation.org>
Hello,
On Tue, Dec 27, 2011 at 01:25:01PM -0800, Andrew Morton wrote:
> umm, we've already declared that it is OK to completely waste this
> memory for the users (probably a majority) who will not be using
> these stats.
We're talking about combinatorial combinations where only small subset
is usually expected to be used and, in addition to the absolute usage,
there's big advantage in showing behavior which users would expect.
If 1000 cgroups are doing IOs to 1000 devices, it's expected to
consume some amount of resource.
The whole io_context / blk_cgroup - request_queue association
mechanism is based on opportunistic allocation. It might not be the
prettiest thing in the world but given the circumstances IMHO the
approach fits the constraints defined by the problem.
Given the restricted nature of percpu allocation, it would be nice to
punt it to GFP_KERNEL context *somewhere* and for block layer that
somewhere probably can only be userland access. I just don't see that
fitting better here. The suggested alternative seems much nastier
with userland visible side effects and possibility for combinatorial
increase in memory usage for something as benign as single cat of stat
files.
Also, such erratic userland visible behavior is deviation from the
current one and at the same time we would be bound to the
idiosyncracies later when we can improve the implementation.
I can't see how that can be a better tradeoff. It shifts the problem
to even more cumbersome corner.
> Also, has this stuff been tested at that scale? I fear to think what
> 10000 allocations will do to fragmetnation of the vmalloc() arena.
Percpu allocator doesn't use vmalloc directly. It maps address ranges
(which is at least 32k and usually much larger) from vmalloc space and
allocate it using simplistic extent allocator.
Thanks.
--
tejun
next prev parent reply other threads:[~2011-12-27 22:08 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-22 21:45 [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Tejun Heo
2011-12-22 21:45 ` [PATCH 1/7] mempool: fix and document synchronization and memory barrier usage Tejun Heo
2011-12-22 21:45 ` [PATCH 2/7] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy() Tejun Heo
2011-12-22 21:45 ` [PATCH 3/7] mempool: fix first round failure behavior Tejun Heo
2011-12-22 21:45 ` [PATCH 4/7] mempool: factor out mempool_fill() Tejun Heo
2011-12-22 21:45 ` [PATCH 5/7] mempool: separate out __mempool_create() Tejun Heo
2011-12-22 21:45 ` [PATCH 6/7] mempool, percpu: implement percpu mempool Tejun Heo
2011-12-22 21:45 ` [PATCH 7/7] block: fix deadlock through percpu allocation in blk-cgroup Tejun Heo
2011-12-23 1:00 ` Vivek Goyal
2011-12-23 22:54 ` Tejun Heo
2011-12-22 21:59 ` [PATCHSET] block, mempool, percpu: implement percpu mempool and fix blkcg percpu alloc deadlock Andrew Morton
2011-12-22 22:09 ` Tejun Heo
2011-12-22 22:20 ` Andrew Morton
2011-12-22 22:41 ` Tejun Heo
2011-12-22 22:54 ` Andrew Morton
2011-12-22 23:00 ` Tejun Heo
2011-12-22 23:16 ` Andrew Morton
2011-12-22 23:24 ` Tejun Heo
2011-12-22 23:41 ` Andrew Morton
2011-12-22 23:54 ` Tejun Heo
2011-12-23 1:14 ` Andrew Morton
2011-12-23 15:17 ` Vivek Goyal
2011-12-27 18:34 ` Tejun Heo
2011-12-27 21:20 ` Andrew Morton
2011-12-27 21:44 ` Tejun Heo
2011-12-27 21:58 ` Andrew Morton
2011-12-27 22:22 ` Tejun Heo
2011-12-23 1:21 ` Vivek Goyal
2011-12-23 1:38 ` Andrew Morton
2011-12-23 2:54 ` Vivek Goyal
2011-12-23 3:11 ` Andrew Morton
2011-12-23 14:58 ` Vivek Goyal
2011-12-27 21:25 ` Andrew Morton
2011-12-27 22:07 ` Tejun Heo [this message]
2011-12-27 22:21 ` Andrew Morton
2011-12-27 22:30 ` Tejun Heo
2012-01-16 15:26 ` Vivek Goyal
2011-12-23 1:40 ` Vivek Goyal
2011-12-23 1:58 ` Andrew Morton
2011-12-23 2:56 ` Vivek Goyal
2011-12-26 6:05 ` KAMEZAWA Hiroyuki
2011-12-27 17:52 ` Tejun Heo
2011-12-28 0:14 ` KAMEZAWA Hiroyuki
2011-12-28 0:41 ` Tejun Heo
2012-01-05 1:28 ` Tejun Heo
2012-01-16 15:28 ` Vivek Goyal
2012-02-09 23:58 ` Tejun Heo
2012-02-10 16:26 ` Vivek Goyal
2012-02-13 22:31 ` Tejun Heo
2012-02-15 15:43 ` Vivek Goyal
2011-12-23 14:46 ` Vivek Goyal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111227220753.GH17712@google.com \
--to=tj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=avi@redhat.com \
--cc=axboe@kernel.dk \
--cc=cl@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nate@cpanel.net \
--cc=oleg@redhat.com \
--cc=vgoyal@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.