From: Andrew Morton <akpm@linux-foundation.org>
To: Benjamin Blum <bblum@google.com>
Cc: Paul Menage <menage@google.com>,
	lizf@cn.fujitzu.com, serue@us.ibm.com,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] Adds a read-only "procs" file similar to "tasks" that shows only unique tgids
Date: Thu, 2 Jul 2009 19:08:45 -0700
Message-ID: <20090702190845.0cafc46a.akpm@linux-foundation.org>
In-Reply-To: <2f86c2480907021817o79fce75yd9785aab682f7bb4@mail.gmail.com>

On Thu, 2 Jul 2009 18:17:56 -0700 Benjamin Blum <bblum@google.com> wrote:

> On Thu, Jul 2, 2009 at 6:08 PM, Paul Menage<menage@google.com> wrote:
> > On Thu, Jul 2, 2009 at 5:53 PM, Andrew Morton<akpm@linux-foundation.org> wrote:
> >>> In the first snippet, count will be at most equal to length. As length
> >>> is determined from cgroup_task_count, it can be no greater than the
> >>> total number of pids on the system.
> >>
> >> Well that's a problem, because there can be tens or hundreds of
> >> thousands of pids, and there's a fairly low maximum size for kmalloc()s
> >> (include/linux/kmalloc_sizes.h).
> >>
> >> And even if this allocation attempt doesn't exceed KMALLOC_MAX_SIZE,
> >> large allocations are less reliable. There is a large break point at
> >> 8*PAGE_SIZE (PAGE_ALLOC_COSTLY_ORDER).
> >
> > This has been a long-standing problem with the tasks file, ever since
> > the cpusets days.
> >
> > There are ways around it - Lai Jiangshan <laijs@cn.fujitsu.com> posted
> > a patch that allocated an array of pages to store pids in, with a
> > custom sorting function that let you specify indirection rather than
> > assuming everything was in one contiguous array. This was technically
> > the right approach in terms of not needing vmalloc and never doing
> > large allocations, but it was very complex; an alternative that was
> > mooted was to use kmalloc for small cgroups and vmalloc for large
> > ones, so the vmalloc penalty wouldn't be paid generally. The thread
> > fizzled AFAICS.
> 
> As it is currently, the kmalloc call will simply fail if there are too
> many pids, correct? Do we prefer not being able to read the file in
> this case, or would we rather use vmalloc?

We'd prefer that we not use vmalloc and that the reads not fail!



Why are we doing all this anyway?  To avoid presenting duplicated pids
to userspace?  Nothing else?

If so, why not stop doing that - userspace can remove dupes (if it
cares) more easily than the kernel can?


Or we can do it the other way?  Create an initially-empty local IDR
tree or radix tree and, within that, mark off any pids which we've
already emitted?  That'll have a worst-case memory consumption of
approximately PID_MAX_LIMIT bits -- presently that's half a megabyte. 
With no large allocations needed?
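
To make the idea concrete, here is a minimal userspace sketch, not kernel
code and not what the patch does. It stands in for the IDR/radix tree with
a table of lazily allocated page-sized bitmap chunks, so no single
allocation exceeds one page. The names seen_pids, seen_test_and_set and
emit_unique are hypothetical, and the PID_MAX_LIMIT and page-size values
are assumptions (4M pids, 4096-byte pages).

```c
#include <stdlib.h>

#define PID_MAX_LIMIT (4 * 1024 * 1024)  /* assumed 64-bit value: 4M pids */
#define CHUNK_BYTES   4096               /* assumed page size */
#define CHUNK_BITS    (CHUNK_BYTES * 8)
#define NR_CHUNKS     (PID_MAX_LIMIT / CHUNK_BITS)  /* 128 chunks */

/* One pointer per chunk; a chunk is allocated only when a pid lands in it. */
struct seen_pids {
	unsigned char *chunk[NR_CHUNKS];
};

/*
 * Returns 0 if pid was newly marked, 1 if it was already marked,
 * -1 if pid is out of range or a chunk could not be allocated.
 */
static int seen_test_and_set(struct seen_pids *s, int pid)
{
	unsigned int idx, bit;
	unsigned char mask, *byte;

	if (pid < 0 || pid >= PID_MAX_LIMIT)
		return -1;
	idx = (unsigned int)pid / CHUNK_BITS;
	bit = (unsigned int)pid % CHUNK_BITS;
	if (!s->chunk[idx]) {
		s->chunk[idx] = calloc(1, CHUNK_BYTES);
		if (!s->chunk[idx])
			return -1;
	}
	byte = &s->chunk[idx][bit / 8];
	mask = (unsigned char)(1u << (bit % 8));
	if (*byte & mask)
		return 1;
	*byte |= mask;
	return 0;
}

static void seen_free(struct seen_pids *s)
{
	for (int i = 0; i < NR_CHUNKS; i++) {
		free(s->chunk[i]);
		s->chunk[i] = NULL;
	}
}

/*
 * Copies each pid from in[] to out[] the first time it is seen,
 * preserving order. Returns the number of pids written, or -1 on error.
 */
static int emit_unique(const int *in, int n, int *out)
{
	struct seen_pids seen = { { NULL } };
	int count = 0;

	for (int i = 0; i < n; i++) {
		int r = seen_test_and_set(&seen, in[i]);

		if (r < 0) {
			seen_free(&seen);
			return -1;
		}
		if (r == 0)
			out[count++] = in[i];
	}
	seen_free(&seen);
	return count;
}
```

Under these assumptions the worst case is 128 chunks of 4096 bytes, i.e.
512KB in total, which matches the half-megabyte figure, and a cgroup whose
pids cluster in a narrow range touches only a few chunks.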


btw, did pidlist_uniq() actually need to allocate new memory for the
output array?  Could it have done the filtering in-place?
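
For illustration, an in-place version could look something like the
sketch below. It assumes the array has already been sorted, so duplicates
are adjacent; uniq_in_place is a hypothetical name and this is not the
pidlist_uniq() from the patch.

```c
#include <stddef.h>

/*
 * Compacts a sorted array in place so each value appears once.
 * Assumes list[] is sorted (duplicates adjacent). Returns the number
 * of unique entries, which occupy list[0..ret-1] afterwards.
 */
static size_t uniq_in_place(int *list, size_t length)
{
	size_t dest = 1;

	if (length < 2)
		return length;
	for (size_t src = 1; src < length; src++) {
		/* Keep list[src] only if it differs from the last kept entry. */
		if (list[src] != list[dest - 1])
			list[dest++] = list[src];
	}
	return dest;
}
```

The write index never passes the read index, so no second array is
needed; the tail past the returned length is simply left unused.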

