linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Colin Cross <ccross@android.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Shaohua Li <shli@fb.com>,
	Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>,
	Linux-MM <linux-mm@kvack.org>,
	lkml <linux-kernel@vger.kernel.org>,
	kernel-team@fb.com
Subject: Re: [PATCH] proc: revert /proc/<pid>/maps [stack:TID] annotation
Date: Thu, 28 Jan 2016 12:25:45 +0200
Message-ID: <20160128102544.GA2396@node.shutemov.name>
In-Reply-To: <CAMbhsRT-XsxkznXzygkdP2tmVr4Xgfi9TCQ2i66dqz8vGfJD3Q@mail.gmail.com>

On Mon, Jan 25, 2016 at 03:53:01PM -0800, Colin Cross wrote:
> On Mon, Jan 25, 2016 at 3:14 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> > On Mon, Jan 25, 2016 at 01:30:00PM -0800, Colin Cross wrote:
> >> On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
> >> <kirill@shutemov.name> wrote:
> >> > On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
> >> >> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner <hannes@cmpxchg.org> wrote:
> >> >>
> >> >> > b764375 ("procfs: mark thread stack correctly in proc/<pid>/maps")
> >> >> > added [stack:TID] annotation to /proc/<pid>/maps. Finding the task of
> >> >> > a stack VMA requires walking the entire thread list, turning this into
> >> >> > quadratic behavior: a thousand threads means a thousand stacks, so the
> >> >> > rendering of /proc/<pid>/maps needs to look at a million threads. The
> >> >> > cost is not in proportion to the usefulness as described in the patch.
> >> >> >
> >> >> > Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
> >> >> > /proc/<pid>/numa_maps) usable again for higher thread counts.
> >> >> >
> >> >> > The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained,
> >> >> > as identifying the stack VMA there is an O(1) operation.
> >> >>
> >> >> Four years ago, ouch.
> >> >>
> >> >> Any thoughts on the obvious back-compatibility concerns?  ie, why did
> >> >> Siddhesh implement this in the first place?  My bad for not ensuring
> >> >> that the changelog told us this.
> >> >>
> >> >> https://lkml.org/lkml/2012/1/14/25 has more info:
> >> >>
> >> >> : Memory mmaped by glibc for a thread stack currently shows up as a
> >> >> : simple anonymous map, which makes it difficult to differentiate between
> >> >> : memory usage of the thread on stack and other dynamic allocation.
> >> >> : Since glibc already uses MAP_STACK to request this mapping, the
> >> >> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
> >> >> : resulting vma so that the mapping is treated as a stack and not any
> >> >> : regular anonymous mapping.  Also, one may use vm_flags to decide if a
> >> >> : vma is a stack.
> >> >>
> >> >> But even that doesn't really tell us what the actual *value* of the
> >> >> patch is to end-users.
> >> >
> >> > I doubt it can be very useful as it's unreliable: if two stacks are
> >> > allocated end-to-end (which is not a good idea, but still) it can only
> >> > report [stack:XXX] for the first one as they are merged into one VMA.
> >> > Any other anon VMA merged with the stack will also be claimed as stack,
> >> > which is not always correct.
> >> >
> >> > I think reporting the VMA as anon is the best we can know about it;
> >> > everything else is just rather expensive guessing.
> >>
> >> An alternative to guessing is the anonymous VMA naming patch used on
> >> Android, https://lkml.org/lkml/2013/10/30/518.  It allows userspace to
> >> name anonymous memory however it wishes, and prevents adjacent regions
> >> with different names from being merged into one VMA.  Android uses it
> >> to label native heap memory, but it would work well for stacks too.
> >
> > I don't think preventing VMA merging is a fair price for the feature: you
> > would pay extra in every find_vma() (meaning all page faults).
> >
> > I think it would be nice to have a way to store this kind of sideband info
> > without impacting the critical code path.
> >
> > One other use case I see for such sideband info is storing hints from
> > MADV_HUGEPAGE/MADV_NOHUGEPAGE: needing to split a VMA just for these
> > hints is unfortunate.
> 
> In practice we don't see many extra VMAs from naming; alignment
> requirements, guard pages, and permissions differences are usually
> enough to keep adjacent anonymous VMAs from merging.  Here's an
> example from a process on Android:
> 7f9086c000-7f9086d000 rw-p 00006000 fd:00 1495  /system/lib64/libhardware_legacy.so
> 7f9086d000-7f9086e000 rw-p 00000000 00:00 0
> 7f9086e000-7f9086f000 rw-p 00000000 00:00 0  [anon:linker_alloc]
> 7f90875000-7f90876000 r--p 00000000 00:00 0  [anon:linker_alloc]
> 7f9087c000-7f9087d000 r--p 00000000 00:00 0  [anon:linker_alloc]
> 7f90901000-7f90902000 ---p 00000000 00:00 0  [anon:thread stack guard page]
> 7f90902000-7f90a00000 rw-p 00000000 00:00 0  [stack:410]
> 7f90a00000-7f90c00000 rw-p 00000000 00:00 0  [anon:libc_malloc]
> 7f90c02000-7f90c03000 ---p 00000000 00:00 0  [anon:thread stack guard page]
> 7f90c03000-7f90d01000 rw-p 00000000 00:00 0  [stack:409]
> 7f90d01000-7f90d02000 ---p 00000000 00:00 0  [anon:thread stack guard page]
> 7f90d02000-7f90e00000 rw-p 00000000 00:00 0  [stack:408]
> 7f90e00000-7f91200000 rw-p 00000000 00:00 0  [anon:libc_malloc]
> 7f91206000-7f91207000 r--p 00000000 00:00 0  [anon:linker_alloc]
> 7f91237000-7f91238000 ---p 00000000 00:00 0  [anon:thread signal stack guard page]
> 7f91238000-7f9123c000 rw-p 00000000 00:00 0  [anon:thread signal stack]
> 7f9123c000-7f9123d000 ---p 00000000 00:00 0  [anon:thread signal stack guard page]
> 7f9123d000-7f91241000 rw-p 00000000 00:00 0  [anon:thread signal stack]
> 7f91246000-7f91247000 ---p 00000000 00:00 0  [anon:thread signal stack guard page]
> 7f91247000-7f9124b000 rw-p 00000000 00:00 0  [anon:thread signal stack]
> 7f9124b000-7f9124c000 ---p 00000000 00:00 0  [anon:thread signal stack guard page]
> 7f9124c000-7f91250000 rw-p 00000000 00:00 0  [anon:thread signal stack]
> 
> I only see 2 extra VMAs here: the "[stack:410]" and "[stack:408]"
> regions would have been merged with the following "[anon:libc_malloc]"
> regions.

Fair enough.

I wanted to trick you into implementing the feature I want. Failed. ;)

The naming approach looks good to me. Storing strings in userspace is
somewhat unusual, but probably okay.
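
For reference, the two stack regions named in the quoted message can be
picked out mechanically. The sketch below uses rows copied from the stack
portion of the quoted excerpt and flags neighbouring regions that are
contiguous and have equal permissions but carry different labels. This is
a deliberately simplified test: the kernel checks more than contiguity and
permissions before merging VMAs, and the names ROWS, parse and
merge_candidates are illustrative only.

```python
# Rows copied from the stack portion of the quoted maps excerpt:
# (address range, permissions, label).
ROWS = [
    ("7f90901000-7f90902000", "---p", "[anon:thread stack guard page]"),
    ("7f90902000-7f90a00000", "rw-p", "[stack:410]"),
    ("7f90a00000-7f90c00000", "rw-p", "[anon:libc_malloc]"),
    ("7f90c02000-7f90c03000", "---p", "[anon:thread stack guard page]"),
    ("7f90c03000-7f90d01000", "rw-p", "[stack:409]"),
    ("7f90d01000-7f90d02000", "---p", "[anon:thread stack guard page]"),
    ("7f90d02000-7f90e00000", "rw-p", "[stack:408]"),
    ("7f90e00000-7f91200000", "rw-p", "[anon:libc_malloc]"),
]


def parse(row):
    """Turn one row into (start, end, perms, label) with integer addresses."""
    addr, perms, label = row
    start, end = (int(part, 16) for part in addr.split("-"))
    return start, end, perms, label


def merge_candidates(rows):
    """Return label pairs of neighbours kept apart only by their labels.

    Simplified assumption: contiguous addresses plus equal permissions
    are treated as sufficient for a merge; the real kernel test is stricter.
    """
    vmas = [parse(row) for row in rows]
    pairs = []
    for (_, end1, perms1, name1), (start2, _, perms2, name2) in zip(vmas, vmas[1:]):
        if end1 == start2 and perms1 == perms2 and name1 != name2:
            pairs.append((name1, name2))
    return pairs


if __name__ == "__main__":
    for first, second in merge_candidates(ROWS):
        print(first, "is contiguous with", second, "and has the same permissions")
```

[stack:409] is not flagged because a guard page with different
permissions follows it directly.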

-- 
 Kirill A. Shutemov
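
As a rough illustration of the cost argument in the quoted changelog, the
sketch below counts thread-list entries visited per read of
/proc/<pid>/maps. It assumes one stack VMA per thread and a full walk of
the thread list for each stack VMA; the function name and the model are
illustrative and are not kernel code.

```python
def thread_visits(num_threads, full_walk_per_stack=True):
    """Model how many thread-list entries one read of maps touches.

    Assumptions (not taken from kernel source): one stack VMA per thread,
    and either a full thread-list walk per stack VMA or a constant-time
    check per stack VMA.
    """
    stack_vmas = num_threads
    cost_per_stack = num_threads if full_walk_per_stack else 1
    return stack_vmas * cost_per_stack


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, thread_visits(n), thread_visits(n, full_walk_per_stack=False))
```

With a thousand threads the full-walk model gives a million visits, which
matches the figure in the changelog.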

