public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	namhyung.kim@kernel.org, acme@kernel.org
Subject: Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale
Date: Wed, 15 Mar 2017 10:50:59 -0300	[thread overview]
Message-ID: <20170315135059.GC2177@redhat.com> (raw)
In-Reply-To: <1489561041-19778-1-git-send-email-eranian@google.com>

Em Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian escreveu:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that cat /proc/pid/maps support uses a O(N^2) algorithm to generate
> each map line in the maps file.  If you have 1000 threads, then you have
> necessarily 1000 stacks.  For each vma, you need to check if it corresponds
> to a thread's stack.  With a large number of threads, this can take a very long time. I have seen latencies >> 10mn.
> 
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/tasks/pid/maps.
> This entry does not try to map a vma to stack and is thus much
> faster with no loss of functonality.
> 
> The proc-map-timeout logic is kept in case user still want some uppre limit.
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>  	if (machine__is_default_guest(machine))
>  		return 0;
>  
> -	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -		 machine->root_dir, pid);
> +	snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps",
> +		 machine->root_dir, pid, pid);

Humm...

[root@jouet ~]# ps -C firefox -L
  PID   LWP TTY          TIME CMD
25023 25023 tty2     01:32:10 firefox
25023 25048 tty2     00:00:00 gmain
25023 25049 tty2     00:00:00 gdbus
25023 25056 tty2     00:00:00 Gecko_IOThread
25023 25058 tty2     00:00:00 Link Monitor
25023 25059 tty2     00:03:52 Socket Thread
25023 25060 tty2     00:00:00 JS Watchdog
25023 25061 tty2     00:00:12 JS Helper
25023 25062 tty2     00:00:12 JS Helper
25023 25063 tty2     00:00:13 JS Helper
25023 25064 tty2     00:00:12 JS Helper
25023 25065 tty2     00:00:13 JS Helper
25023 25066 tty2     00:00:12 JS Helper
25023 25067 tty2     00:00:13 JS Helper
25023 25068 tty2     00:00:13 JS Helper
25023 25069 tty2     00:00:00 Hang Monitor
25023 25070 tty2     00:00:00 BgHangManager
25023 25117 tty2     00:00:12 Cache2 I/O
25023 25118 tty2     00:05:37 Timer
25023 25146 tty2     00:00:00 DataStorage
25023 25150 tty2     00:00:00 GMPThread
25023 25152 tty2     00:04:12 Compositor
25023 25156 tty2     00:00:02 ImgDecoder #1
25023 25157 tty2     00:00:02 ImgDecoder #2
25023 25158 tty2     00:00:02 ImgDecoder #3
25023 25159 tty2     00:00:00 ImageIO
25023 25160 tty2     00:02:26 SoftwareVsyncTh
25023 25202 tty2     00:00:00 HTML5 Parser
25023 25204 tty2     00:00:00 IPDL Background
25023 25212 tty2     00:00:22 DOM Worker
25023 25219 tty2     00:00:23 URL Classifier
25023 25220 tty2     00:00:05 ImageBridgeChil
25023 25221 tty2     00:00:04 mozStorage #1
25023 25222 tty2     00:00:00 Proxy R~olution
25023 25223 tty2     00:00:00 DataStorage
25023 25234 tty2     00:00:00 Cache I/O
25023 25235 tty2     00:00:01 mozStorage #2
25023 25236 tty2     00:00:00 mozStorage #3
25023 25252 tty2     00:00:20 DOM Worker
25023 25256 tty2     00:00:00 mozStorage #4
25023 25257 tty2     00:00:00 localStorage DB
25023 25260 tty2     00:00:00 mozStorage #5
25023  2066 tty2     00:00:19 DOM Worker
25023  5988 tty2     00:00:00 threaded-ml
25023  6757 tty2     00:00:00 firefox
25023  6999 tty2     00:00:00 mozStorage #6
25023 19239 tty2     00:00:02 mozStorage #7
25023 12038 tty2     00:00:05 threaded-ml
25023 19301 tty2     00:00:00 firefox
[root@jouet ~]# 

[root@jouet ~]# file /proc/25023/tasks
/proc/25023/tasks: cannot open `/proc/25023/tasks' (No such file or directory)
[root@jouet ~]#

But...

[root@jouet ~]# file /proc/25023/task
/proc/25023/task: directory
[root@jouet ~]# wc -l /proc/25023/task/25023/maps
1200 /proc/25023/task/25023/maps
[root@jouet ~]# wc -l /proc/25023/task/25048/maps
1201 /proc/25023/task/25048/maps
[root@jouet ~]# wc -l /proc/25048/task/25048/maps
1201 /proc/25048/task/25048/maps
[root@jouet ~]#

[acme@jouet linux]$ grep \"tasks\"  fs/proc/*.c
[acme@jouet linux]$ grep \"task\"  fs/proc/*.c
fs/proc/base.c:	DIR("task",       S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations),
fs/proc/base.c:	name.name = "task";
[acme@jouet linux]$

These end up mapping to the later of these two:

const struct file_operations proc_pid_maps_operations = {
        .open           = pid_maps_open,
        .read           = seq_read,
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};

const struct file_operations proc_tid_maps_operations = {
        .open           = tid_maps_open,
        .read           = seq_read, 
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};


Which ends up going down to:

show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)

with that is_pid respectively true and false, but then show_map_vma() doesn't
use 'is_pid' at all :-\

/me scratches head, what am I missing?

The following commit, that appeared circa v4.9-rc2. So, fixing up the "tasks"
-> "tasks" we end up with something safe and that avoids this by now
non-existing problem, on older kernels, ok?

commit b18cb64ead400c01bf1580eeba330ace51f8087d
Author: Andy Lutomirski <luto@kernel.org>
Date:   Fri Sep 30 10:58:57 2016 -0700

    fs/proc: Stop trying to report thread stacks
    
    This reverts more of:
    
      b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps")
    
    ... which was partially reverted by:
    
      65376df58217 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")
    
    Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.
    
    In current kernels, /proc/PID/maps (or /proc/TID/maps even for
    threads) shows "[stack]" for VMAs in the mm's stack address range.
    
    In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
    target thread's stack's VMA.  This is racy, probably returns garbage
    and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone:
    KSTK_ESP is not safe to use on tasks that aren't known to be running
    ordinary process-context kernel code.
    
    This patch removes the difference and just shows "[stack]" for VMAs
    in the mm's stack range.  This is IMO much more sensible -- the
    actual "stack" address really is treated specially by the VM code,
    and the current thread stack isn't even well-defined for programs
    that frequently switch stacks on their own.
    
    Reported-by: Jann Horn <jann@thejh.net>
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>


>  
>  	fp = fopen(filename, "r");
>  	if (fp == NULL) {
> -- 
> 2.5.0

  parent reply	other threads:[~2017-03-15 13:52 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-15  6:57 [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale Stephane Eranian
2017-03-15 11:33 ` Jiri Olsa
2017-03-15 13:50 ` Arnaldo Carvalho de Melo [this message]
2017-03-15 13:58   ` Arnaldo Carvalho de Melo
2017-03-15 17:08     ` Stephane Eranian
2017-03-15 17:44       ` Arnaldo Carvalho de Melo
2017-03-16  2:27         ` Stephane Eranian
2017-03-15 17:17   ` [PATCH V2] " Stephane Eranian
2017-03-16 16:35     ` [tip:perf/core] perf tools: Make " tip-bot for Stephane Eranian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170315135059.GC2177@redhat.com \
    --to=acme@redhat.com \
    --cc=acme@kernel.org \
    --cc=eranian@google.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung.kim@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox