From: Namhyung Kim <namhyung@kernel.org>
To: Joonsoo Kim <js1304@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Jiri Olsa <jolsa@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	David Ahern <dsahern@gmail.com>, Minchan Kim <minchan@kernel.org>
Subject: Re: [PATCH 2/5] perf kmem: Analyze page allocator events also
Date: Tue, 24 Mar 2015 15:05:07 +0900
Message-ID: <20150324060507.GA10802@sejong>
In-Reply-To: <CAAmzW4MkS5M+X+e9w=zv873jjfMxqQxtpNTz3EAkxa+f5WnK4A@mail.gmail.com>

Hi Joonsoo,

On Tue, Mar 24, 2015 at 02:26:39PM +0900, Joonsoo Kim wrote:
> 2015-03-24 9:18 GMT+09:00 Namhyung Kim <namhyung@kernel.org>:
> > On Tue, Mar 24, 2015 at 02:32:17AM +0900, Joonsoo Kim wrote:
> >> 2015-03-23 15:30 GMT+09:00 Namhyung Kim <namhyung@kernel.org>:
> >> > The perf kmem command records and analyzes kernel memory allocation
> >> > only for SLAB objects.  This patch implements a simple page allocator
> >> > analyzer using the kmem:mm_page_alloc and kmem:mm_page_free events.
> >> >
> >> > It adds two new options, --slab and --page.  The --slab option is
> >> > for analyzing the SLAB allocator, which is what perf kmem currently does.
> >> >
> >> > The new --page option enables page allocator events and analyzes kernel
> >> > memory usage in page units.  Currently, only the 'stat --alloc'
> >> > subcommand is implemented.
> >> >
> >> > If neither --slab nor --page is specified, --slab is implied.
> >> >
> >> >   # perf kmem stat --page --alloc --line 10
> >> >
> >> >   -------------------------------------------------------------------------------------
> >> >    Page             | Total alloc (KB) | Hits     | Order | Migration type | GFP flags
> >> >   -------------------------------------------------------------------------------------
> >> >    ffffea0015e48e00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea0015e47400 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea001440f600 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea001440cc00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c6300 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c5c00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c5000 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c4f00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c4e00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ffffea00140c4d00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
> >> >    ...              | ...              | ...      | ...   | ...            | ...
> >> >   -------------------------------------------------------------------------------------
> >>
> >> The mm_page_alloc tracepoint prints the pfn as well as the struct page
> >> pointer.  How about printing the pfn rather than the struct page pointer?
> >
> > I'd really like to have the pfn rather than the struct page pointer.
> > But I don't know how to convert a page pointer to a pfn in userspace.
> >
> > The output of the tracepoint via the $debugfs/tracing/trace file is
> > generated on the kernel side, so it can easily derive the pfn from the
> > page pointer.  But the tracepoint itself only saves the page pointer,
> > and we need to convert/print it in userspace.
> 
> Ah...I didn't realize that perf doesn't use the output of the
> $debugfs/tracing/trace file. So, perf just uses the raw trace buffer
> directly? If the pfn is saved to the trace buffer, can perf print the pfn
> rather than the struct page pointer?

Yes, perf uses the raw (binary) trace data.  It would be nice if the
tracepoint saved the pfn directly!


> 
> > Yes, perf script (or libtraceevent) shows the pfn when printing those
> > events.  But that's bogus: it cannot determine the size of struct
> > page, so the pointer arithmetic in the open-coded page_to_pfn() saved
> > in the tracepoint's print_fmt ends up as plain integer arithmetic.
> 
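
To illustrate why the pfn shown by a format-string parser is bogus, here
is a minimal sketch of the two kinds of subtraction.  It assumes a made-up
struct fake_page of 64 bytes (the real struct page size depends on the
kernel configuration) and two hypothetical helpers, pfn_by_pointer() and
pfn_by_integer(); neither is kernel or libtraceevent code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; the real size depends on kernel config. */
struct fake_page {
	unsigned char pad[64];
};

/*
 * Pointer arithmetic: the compiler divides the byte distance by
 * sizeof(struct fake_page), so the result is an array index.
 */
static unsigned long pfn_by_pointer(const struct fake_page *base,
				    const struct fake_page *page)
{
	assert(page >= base);
	return (unsigned long)(page - base);
}

/*
 * Integer arithmetic: what a parser computes when it only sees two
 * addresses and does not know the element size.  The result is the
 * plain byte distance, not an index.
 */
static unsigned long pfn_by_integer(const struct fake_page *base,
				    const struct fake_page *page)
{
	return (unsigned long)((uintptr_t)page - (uintptr_t)base);
}
```

For the fourth element of an array of struct fake_page, pfn_by_pointer()
returns 3 while pfn_by_integer() returns 192 (3 * 64), which is the kind
of mismatch described above.
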
> How about the following change, making 'perf kmem' print the pfn?
> If we store the pfn in the trace buffer, we can print $debugfs/tracing/trace
> as is and 'perf kmem' can also print the pfn.

I'm very happy with this change.  The textual output via
debugfs/tracefs will be the same; the binary data will change, but
libtraceevent will handle it seamlessly.
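
As a rough sketch of what a userspace consumer could do once the pfn is
stored directly, the code below copies a payload out of a raw buffer and
maps the -1 sentinel to 0, mirroring the TP_printk() logic in the change
quoted below.  The struct layout, the 64-bit field width, and the names
page_alloc_payload, PFN_INVALID, decode_page_alloc() and printable_pfn()
are all assumptions for illustration; a real record also carries common
header fields, and real tools take field offsets from the event format
description instead of a hard-coded struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload mirroring TP_STRUCT__entry after the change. */
struct page_alloc_payload {
	uint64_t pfn;		/* unsigned long, assuming a 64-bit kernel */
	uint32_t order;
	uint32_t gfp_flags;
	int32_t  migratetype;
};

/* (unsigned long)-1 on the assumed 64-bit kernel: "no page". */
#define PFN_INVALID UINT64_MAX

/* Copy the payload out of a raw buffer; fail if the buffer is too short. */
static bool decode_page_alloc(const void *raw, size_t len,
			      struct page_alloc_payload *out)
{
	assert(raw && out);
	if (len < sizeof(*out))
		return false;
	memcpy(out, raw, sizeof(*out));
	return true;
}

/* Same rule as the TP_printk() in the diff: print 0 for the sentinel. */
static uint64_t printable_pfn(const struct page_alloc_payload *p)
{
	return p->pfn == PFN_INVALID ? 0 : p->pfn;
}
```

No pointer arithmetic and no knowledge of sizeof(struct page) is needed
on the reading side, which is the point of storing the pfn.
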

Thanks,
Namhyung


> 
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index 4ad10ba..9dcfd0b 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -199,22 +199,22 @@ TRACE_EVENT(mm_page_alloc,
>         TP_ARGS(page, order, gfp_flags, migratetype),
> 
>         TP_STRUCT__entry(
> -               __field(        struct page *,  page            )
> +               __field(        unsigned long,  pfn             )
>                 __field(        unsigned int,   order           )
>                 __field(        gfp_t,          gfp_flags       )
>                 __field(        int,            migratetype     )
>         ),
> 
>         TP_fast_assign(
> -               __entry->page           = page;
> +               __entry->pfn            = page ? page_to_pfn(page) : -1;
>                 __entry->order          = order;
>                 __entry->gfp_flags      = gfp_flags;
>                 __entry->migratetype    = migratetype;
>         ),
> 
>         TP_printk("page=%p pfn=%lu order=%d migratetype=%d gfp_flags=%s",
> -               __entry->page,
> -               __entry->page ? page_to_pfn(__entry->page) : 0,
> +               __entry->pfn != -1 ? pfn_to_page(__entry->pfn) : NULL,
> +               __entry->pfn != -1 ? __entry->pfn : 0,
>                 __entry->order,
>                 __entry->migratetype,
>                 show_gfp_flags(__entry->gfp_flags))
