From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Colin Cross <ccross@android.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>, Shaohua Li <shli@fb.com>,
Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>,
Linux-MM <linux-mm@kvack.org>,
lkml <linux-kernel@vger.kernel.org>,
kernel-team@fb.com
Subject: Re: [PATCH] proc: revert /proc/<pid>/maps [stack:TID] annotation
Date: Tue, 26 Jan 2016 01:14:51 +0200
Message-ID: <20160125231451.GA15513@node.shutemov.name>
In-Reply-To: <CAMbhsRTAeobrQAqujusAVpw+wZyr3WsdKd4iQPi62GWyLB3gJA@mail.gmail.com>
On Mon, Jan 25, 2016 at 01:30:00PM -0800, Colin Cross wrote:
> On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> > On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
> >> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner <hannes@cmpxchg.org> wrote:
> >>
> >> > b764375 ("procfs: mark thread stack correctly in proc/<pid>/maps")
> >> > added [stack:TID] annotation to /proc/<pid>/maps. Finding the task of
> >> > a stack VMA requires walking the entire thread list, turning this into
> >> > quadratic behavior: a thousand threads means a thousand stacks, so the
> >> > rendering of /proc/<pid>/maps needs to look at a million threads. The
> >> > cost is not in proportion to the usefulness as described in the patch.
> >> >
> >> > Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
> >> > /proc/<pid>/numa_maps) usable again for higher thread counts.
> >> >
> >> > The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained,
> >> > as identifying the stack VMA there is an O(1) operation.
> >>
> >> Four years ago, ouch.
> >>
> >> Any thoughts on the obvious back-compatibility concerns? ie, why did
> >> Siddhesh implement this in the first place? My bad for not ensuring
> >> that the changelog told us this.
> >>
> >> https://lkml.org/lkml/2012/1/14/25 has more info:
> >>
> >> : Memory mmaped by glibc for a thread stack currently shows up as a
> >> : simple anonymous map, which makes it difficult to differentiate between
> >> : memory usage of the thread on stack and other dynamic allocation.
> >> : Since glibc already uses MAP_STACK to request this mapping, the
> >> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
> >> : resulting vma so that the mapping is treated as a stack and not any
> >> : regular anonymous mapping. Also, one may use vm_flags to decide if a
> >> : vma is a stack.
> >>
> >> But even that doesn't really tell us what the actual *value* of the
> >> patch is to end-users.
> >
> > I doubt it can be very useful as it's unreliable: if two stacks are
> > allocated end-to-end (which is not good idea, but still) it can only
> > report [stack:XXX] for the first one as they are merged into one VMA.
> > Any other anon VMA merged with the stack will be also claimed as stack,
> > which is not always correct.
> >
> > I think report the VMA as anon is the best we can know about it,
> > everything else just rather expensive guesses.
>
> An alternative to guessing is the anonymous VMA naming patch used on
> Android, https://lkml.org/lkml/2013/10/30/518. It allows userspace to
> name anonymous memory however it wishes, and prevents vma merging
> adjacent regions with different names. Android uses it to label
> native heap memory, but it would work well for stacks too.

I don't think preventing vma merging is a fair price for the feature: you
would pay extra in every find_vma() (meaning all page faults).

I think it would be nice to have a way to store this kind of sideband info
without impacting the critical code path.

One other use case I see for such sideband info is storing hints from
MADV_HUGEPAGE/MADV_NOHUGEPAGE: having to split a vma just for these hints
is unfortunate.

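
The merging cost can be illustrated with a toy model. This is only a
sketch in Python, not kernel code: merge_regions() and lookup() are
hypothetical helpers, a region is a (start, end, name) tuple, and the
lookup is a plain binary search rather than the kernel's real VMA data
structure. It assumes that two touching regions merge only when their
names match, which is the behaviour described for the naming patch.

```python
from bisect import bisect_right

PAGE = 0x1000  # assumed 4 KiB page size, for illustration only


def merge_regions(regions):
    """Merge regions that touch end-to-start and carry the same name."""
    merged = []
    for start, end, name in sorted(regions, key=lambda r: r[0]):
        if merged and merged[-1][1] == start and merged[-1][2] == name:
            # Same name and contiguous: extend the previous region.
            merged[-1] = (merged[-1][0], end, name)
        else:
            # Different name (or a gap): a separate region is required.
            merged.append((start, end, name))
    return merged


def lookup(merged, addr):
    """Return the region containing addr, or None (binary search)."""
    starts = [r[0] for r in merged]
    i = bisect_right(starts, addr) - 1
    if i >= 0 and merged[i][0] <= addr < merged[i][1]:
        return merged[i]
    return None


# Eight contiguous one-page mappings, first unnamed, then named per thread.
anon = [(i * PAGE, (i + 1) * PAGE, None) for i in range(8)]
named = [(i * PAGE, (i + 1) * PAGE, f"stack:{i}") for i in range(8)]

print(len(merge_regions(anon)))   # unnamed regions collapse into one
print(len(merge_regions(named)))  # distinct names keep every region apart
```

In this model the unnamed mappings collapse into a single region while
the named ones stay separate, so every lookup has more entries to search.
Whether that difference is measurable on a real workload is not something
the sketch can show.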
--
Kirill A. Shutemov