From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
Vasiliy Kulikov <segoon@openwall.com>,
containers@lists.osdl.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Nathan Lynch <ntl@pobox.com>,
Oren Laadan <orenl@cs.columbia.edu>,
Daniel Lezcano <dlezcano@fr.ibm.com>,
Glauber Costa <glommer@parallels.com>,
James Bottomley <jbottomley@parallels.com>,
Tejun Heo <tj@kernel.org>, Alexey Dobriyan <adobriyan@gmail.com>,
Al Viro <viro@ZenIV.linux.org.uk>,
Pavel Emelyanov <xemul@parallels.com>
Subject: Re: [patch 2/2] fs, proc: Introduce the /proc/<pid>/map_files/ directory v6
Date: Fri, 2 Sep 2011 09:53:50 +0400
Message-ID: <20110902055350.GK30615@sun>
In-Reply-To: <20110901154946.b43ba91b.akpm@linux-foundation.org>

On Thu, Sep 01, 2011 at 03:49:46PM -0700, Andrew Morton wrote:
> On Thu, 1 Sep 2011 14:46:34 +0400
> Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>
> > On Wed, Aug 31, 2011 at 03:10:23PM -0700, Andrew Morton wrote:
> > >
> > > This function would benefit from a code comment.
> > >
> > > Given that it's pretty generic (indeed there might be open-coded code
> > > which already does this elsewhere), perhaps it should be in mm/mmap.c
> > > as a kernel-wide utility function. That will add a little overhead to
> > > CONFIG_PROC_FS=n builds, which doesn't seem terribly important.
> > >
> >
> > Andrew, here is an attempt to address concerns. Please review.
> > Complains are welcome as always!
>
> Changelog still doesn't explain why /proc/pid/maps is unfixably unsuitable.
>
Will update.
> >
> > ...
> >
> > +static int map_files_d_revalidate(struct dentry *dentry, struct nameidata *nd)
> > +{
> > + unsigned long vm_start, vm_end;
> > + struct task_struct *task;
> > + const struct cred *cred;
> > + struct mm_struct *mm;
> > + struct inode *inode;
> > +
> > + bool exact_vma_exists = false;
> >
>
> Extraneous newline there.
ok, thanks
...
> > +
> > + /*
> > + * We need two passes here:
> > + *
> > + * 1) Collect vmas of mapped files with mmap_sem taken
> > + * 2) Release mmap_sem and instantiate entries
> > + *
> > + * otherwise we get lockdep complained, since filldir()
> > + * routine might require mmap_sem taken in might_fault().
> > + */
> > +
> > + for (vma = mm->mmap; vma; vma = vma->vm_next) {
> > + if (vma->vm_file)
> > + nr_files++;
> > + }
> > +
> > + if (nr_files) {
> > + mem_size = nr_files * sizeof(*info);
> > + if (mem_size <= KMALLOC_MAX_SIZE)
> > + info = kmalloc(mem_size, GFP_KERNEL);
> > + else
> > + info = vmalloc(mem_size);
>
> This still sucks :(
>
> A KMALLOC_MAX_SIZE allocation is huuuuuuuuuuuuge! I don't know how big
kmalloc_sizes.h says it can be quite big :(
> it is nowadays, but over 100 kbytes. This will frequently send page
> reclaim on a berzerk rampage freeing *thousands* of pages (or
> relocating pages) until it manages to generate 20 or 30 physically
> contiguous free pages.
>
> Also, vmalloc sucks. The more often we perform vmallocs (with a mix of
> differently-sized ones), the more internally fragmented the vmalloc
> arena will become. With some workloads we'll run out of
> sufficiently-large contiguous free spaces and things will start
> failing. This doesn't happen often. Yet. But the more vmalloc()
> callsites we add, the more likely and the more frequent it becomes. So
> vmalloc is something we should only use as a last resort.
>
> The most robust implementation here is to allocate a large number of
> small objects - one per vma. A list would be a suitable way of
> managing them.
Actually I thought about a slab cache with 64 bytes per object, but it
would require pushing more code here, which is why I stopped. Still, that
is no excuse. So yes, bad idea. Thanks!
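For illustration, the one-small-object-per-vma idea could look roughly
like the userspace sketch below. It is only a sketch under assumptions:
struct fake_vma, struct map_info, collect_infos() and the other names are
hypothetical, malloc() stands in for kmalloc(), and a hand-rolled singly
linked list stands in for whatever list the kernel code would actually use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma: only the fields this sketch needs. */
struct fake_vma {
	unsigned long start, end;
	int has_file;              /* stands in for vma->vm_file != NULL */
	struct fake_vma *next;     /* stands in for vma->vm_next */
};

/* One small record per file-backed mapping (hypothetical layout). */
struct map_info {
	unsigned long start, end;
	struct map_info *next;
};

/* Free every node of the list. */
static void free_infos(struct map_info *head)
{
	while (head) {
		struct map_info *next = head->next;

		free(head);
		head = next;
	}
}

/* Count the nodes of the list. */
static size_t count_infos(const struct map_info *head)
{
	size_t n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/*
 * Walk the vma chain and append one small node per file-backed vma.
 * Returns 0 on success, -1 on allocation failure (partial list is freed).
 */
static int collect_infos(const struct fake_vma *vma, struct map_info **out)
{
	struct map_info *head = NULL, **tail = &head;

	for (; vma; vma = vma->next) {
		struct map_info *info;

		if (!vma->has_file)
			continue;

		info = malloc(sizeof(*info));
		if (!info) {
			free_infos(head);
			*out = NULL;
			return -1;
		}
		info->start = vma->start;
		info->end = vma->end;
		info->next = NULL;
		*tail = info;
		tail = &info->next;
	}

	*out = head;
	return 0;
}
```

Each node is a small fixed-size allocation, so neither a large physically
contiguous block nor vmalloc space is needed, at the cost of one allocation
per file-backed vma.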
>
> But do we *really* need to do it in two passes? Avoiding the temporary
> storage would involve doing more work under mmap_sem, and a put_filp()
> under mmap_sem might be problematic.
>
In fact, for the particular case where we use /proc/pid/map_files it
could be done in one pass without taking mmap_sem (we use it when the
task is frozen and we know no one is poking at the vmas). The problem
is that people might start using it for things other than c/r, in cases
where the task is still running, and I want to give them a more or less
robust result from ls -l on this directory.

Andrew, I'll think some more; I'll probably find a way to drop the
two-pass requirement.
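The locking concern behind the two passes can be sketched in userspace as
well. This is an assumption-laden toy, not the patch: struct toy_lock is a
non-recursive flag standing in for mmap_sem, counting_emit() stands in for
a filldir()-like callback that may need the same lock, and the
caller-provided snapshot array is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Toy non-recursive lock: taking it twice trips an assert. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Hypothetical stand-ins for a vma and an mm. */
struct region { unsigned long start, end; int has_file; };

struct toy_mm {
	struct toy_lock lock;
	const struct region *regions;
	size_t nr;
};

/* Callback that emits one entry; it is allowed to take mm->lock itself. */
typedef int (*emit_fn)(struct toy_mm *mm, unsigned long start,
		       unsigned long end, void *ctx);

/* Example callback: briefly takes the lock, as a fault handler might. */
static int counting_emit(struct toy_mm *mm, unsigned long start,
			 unsigned long end, void *ctx)
{
	size_t *count = ctx;

	(void)start;
	(void)end;
	toy_lock_acquire(&mm->lock);
	toy_lock_release(&mm->lock);
	(*count)++;
	return 0;
}

/*
 * Pass 1 copies file-backed regions into snap[] with the lock held.
 * Pass 2 calls emit() with the lock dropped.
 * Returns the number of entries emitted.
 */
static size_t list_regions(struct toy_mm *mm, struct region *snap,
			   size_t cap, emit_fn emit, void *ctx)
{
	size_t n = 0, i;

	toy_lock_acquire(&mm->lock);
	for (i = 0; i < mm->nr && n < cap; i++)
		if (mm->regions[i].has_file)
			snap[n++] = mm->regions[i];
	toy_lock_release(&mm->lock);

	for (i = 0; i < n; i++)
		if (emit(mm, snap[i].start, snap[i].end, ctx))
			break;
	return i;
}
```

Calling the callback inside pass 1 would trip the assert in
toy_lock_acquire(), which is the toy analogue of the lockdep complaint.
The price of dropping the lock is that the snapshot can be stale by the
time pass 2 emits it.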
Cyrill