From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f47.google.com (mail-wm0-f47.google.com [74.125.82.47])
    by kanga.kvack.org (Postfix) with ESMTP id 216616B0009
    for ; Mon, 25 Jan 2016 18:14:55 -0500 (EST)
Received: by mail-wm0-f47.google.com with SMTP id n5so104475655wmn.0
    for ; Mon, 25 Jan 2016 15:14:55 -0800 (PST)
Received: from mail-wm0-x22e.google.com (mail-wm0-x22e.google.com. [2a00:1450:400c:c09::22e])
    by mx.google.com with ESMTPS id v18si31571959wju.157.2016.01.25.15.14.53
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Mon, 25 Jan 2016 15:14:53 -0800 (PST)
Received: by mail-wm0-x22e.google.com with SMTP id r129so83787608wmr.0
    for ; Mon, 25 Jan 2016 15:14:53 -0800 (PST)
Date: Tue, 26 Jan 2016 01:14:51 +0200
From: "Kirill A. Shutemov"
Subject: Re: [PATCH] proc: revert /proc/<pid>/maps [stack:TID] annotation
Message-ID: <20160125231451.GA15513@node.shutemov.name>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Colin Cross
Cc: Andrew Morton, Johannes Weiner, Shaohua Li, Siddhesh Poyarekar,
    Linux-MM, lkml, kernel-team@fb.com

On Mon, Jan 25, 2016 at 01:30:00PM -0800, Colin Cross wrote:
> On Tue, Jan 19, 2016 at 3:30 PM, Kirill A. Shutemov
> wrote:
> > On Tue, Jan 19, 2016 at 02:14:30PM -0800, Andrew Morton wrote:
> >> On Tue, 19 Jan 2016 13:02:39 -0500 Johannes Weiner wrote:
> >>
> >> > b764375 ("procfs: mark thread stack correctly in proc/<pid>/maps")
> >> > added [stack:TID] annotation to /proc/<pid>/maps. Finding the task of
> >> > a stack VMA requires walking the entire thread list, turning this into
> >> > quadratic behavior: a thousand threads means a thousand stacks, so the
> >> > rendering of /proc/<pid>/maps needs to look at a million threads. The
> >> > cost is not in proportion to the usefulness as described in the patch.
> >> >
> >> > Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
> >> > /proc/<pid>/numa_maps) usable again for higher thread counts.
> >> >
> >> > The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained,
> >> > as identifying the stack VMA there is an O(1) operation.
> >>
> >> Four years ago, ouch.
> >>
> >> Any thoughts on the obvious back-compatibility concerns? ie, why did
> >> Siddhesh implement this in the first place? My bad for not ensuring
> >> that the changelog told us this.
> >>
> >> https://lkml.org/lkml/2012/1/14/25 has more info:
> >>
> >> : Memory mmaped by glibc for a thread stack currently shows up as a
> >> : simple anonymous map, which makes it difficult to differentiate between
> >> : memory usage of the thread on stack and other dynamic allocation.
> >> : Since glibc already uses MAP_STACK to request this mapping, the
> >> : attached patch uses this flag to add additional VM_STACK_FLAGS to the
> >> : resulting vma so that the mapping is treated as a stack and not any
> >> : regular anonymous mapping. Also, one may use vm_flags to decide if a
> >> : vma is a stack.
> >>
> >> But even that doesn't really tell us what the actual *value* of the
> >> patch is to end-users.
> >
> > I doubt it can be very useful as it's unreliable: if two stacks are
> > allocated end-to-end (which is not good idea, but still) it can only
> > report [stack:XXX] for the first one as they are merged into one VMA.
> > Any other anon VMA merged with the stack will be also claimed as stack,
> > which is not always correct.
> >
> > I think report the VMA as anon is the best we can know about it,
> > everything else just rather expensive guesses.
>
> An alternative to guessing is the anonymous VMA naming patch used on
> Android, https://lkml.org/lkml/2013/10/30/518. It allows userspace to
> name anonymous memory however it wishes, and prevents vma merging
> adjacent regions with different names.
> Android uses it to label native heap memory, but it would work well for
> stacks too.

I don't think preventing VMA merging is a fair price for the feature: you
would pay extra in every find_vma() (meaning all page faults).

I think it would be nice to have a way to store this kind of sideband info
without impacting the critical code path. One other use case I see for such
sideband info is storing hints from MADV_HUGEPAGE/MADV_NOHUGEPAGE: needing
to split a VMA just for these hints is unfortunate.

-- 
 Kirill A. Shutemov
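
To make the merging cost above concrete, here is a small user-space C model.
It is not kernel code: struct toy_vma, toy_merge() and toy_find() are
hypothetical names, and the merge rule (adjacent, same flags, same name) is a
simplifying assumption. The kernel's real merge conditions and lookup
structure are more involved. The sketch only illustrates that per-region
names or hints leave more regions behind, and every lookup then has more of
them to search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a VMA: a half-open address range plus attributes. */
struct toy_vma {
    unsigned long start;
    unsigned long end;
    unsigned long flags;  /* stand-in for protection or hint bits */
    const char *name;     /* NULL means unnamed anonymous memory */
};

/* Names match if both are NULL or both hold the same string. */
static int same_name(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Assumed rule: merge only adjacent regions with identical attributes. */
static int can_merge(const struct toy_vma *a, const struct toy_vma *b)
{
    return a->end == b->start &&
           a->flags == b->flags &&
           same_name(a->name, b->name);
}

/*
 * Merge mergeable neighbours in place. The input must be sorted by
 * start address and non-overlapping. Returns the number of regions left.
 */
size_t toy_merge(struct toy_vma *v, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    for (size_t i = 1; i < n; i++)
        assert(v[i - 1].end <= v[i].start);

    for (size_t i = 1; i < n; i++) {
        if (can_merge(&v[out], &v[i]))
            v[out].end = v[i].end;   /* extend the current region */
        else
            v[++out] = v[i];         /* start a new region */
    }
    return out + 1;
}

/*
 * Return the first region whose end lies above addr, or NULL if none.
 * *steps counts the comparisons of the binary search, which grows with
 * the number of regions that survived merging.
 */
const struct toy_vma *toy_find(const struct toy_vma *v, size_t n,
                               unsigned long addr, unsigned *steps)
{
    size_t lo = 0, hi = n;

    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        (*steps)++;
        if (v[mid].end > addr)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo < n ? &v[lo] : NULL;
}
```

In this model, four adjacent unnamed regions with equal flags collapse into
one, while labelling the first two "heap" and the last two "stack" leaves two
regions for toy_find() to search. Giving each region a different flags value
(the MADV_HUGEPAGE-style hint case) prevents merging in the same way.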