From: Ingo Molnar <mingo@kernel.org>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>, Davidlohr Bueso <davidlohr@hp.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Michel Lespinasse <walken@google.com>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
Guan Xuetao <gxt@mprc.pku.edu.cn>,
aswin@hp.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Linus Torvalds <torvalds@linux-foundation.org>,
David Ahern <dsahern@gmail.com>
Subject: Re: [PATCH] mm: cache largest vma
Date: Tue, 5 Nov 2013 09:24:51 +0100
Message-ID: <20131105082450.GA10127@gmail.com>
In-Reply-To: <20131104181012.GK9299@localhost.localdomain>
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Mon, Nov 04, 2013 at 06:52:45PM +0100, Ingo Molnar wrote:
> >
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > > On Mon, Nov 04, 2013 at 08:05:00AM +0100, Ingo Molnar wrote:
> > > >
> > > > * Davidlohr Bueso <davidlohr@hp.com> wrote:
> > > >
> > > > > Btw, do you suggest using a high level tool such as perf for getting
> > > > > this data or sprinkling get_cycles() in find_vma() -- I'd think that the
> > > > > > first isn't fine grained enough, while the latter will probably vary a
> > > > > lot from run to run but the ratio should be rather constant.
> > > >
> > > > LOL - I guess I should have read your mail before replying to it ;-)
> > > >
> > > > Yes, I think get_cycles() works better in this case - not due to
> > > > granularity (perf stat will report cycle granular just fine), but due
> > > > to the size of the critical path you'll be measuring. You really want
> > > > to extract the delta, because it's probably so much smaller than the
> > > > overhead of the workload itself.
> > > >
> > > > [ We still don't have good 'measure overhead from instruction X to
> > > > instruction Y' delta measurement infrastructure in perf yet,
> > > > although Frederic is working on such a trigger/delta facility AFAIK.
> > > > ]
> > >
> > > Yep, in fact Jiri took it over and he's still working on it. But yeah,
> > > once that gets merged, we should be able to measure instructions or
> > > cycles inside any user or kernel function through kprobes/uprobes or
> > > function graph tracer.
> >
> > So, what would be nice is to actually make use of it: one very nice
> > usecase I'd love to see is to have the capability within the 'perf top'
> > TUI annotated assembly output to mark specific instructions as 'start' and
> > 'end' markers, and measure the overhead between them.
>
> Yeah that would be a nice interface. Speaking about that, it would be nice to get your input
> on the proposed interface for toggle events.
>
> It's still in an RFC state, although it's getting quite elaborated, and I believe we haven't
> yet found a real direction to take for the tooling interface IIRC. For example the perf record
> cmdline used to state toggle events based contexts was one of the parts we were not that confident about.
> And we really don't want to take a wrong direction for that as it's going to be complicated
> to handle in any case.
>
> See this thread:
> https://lwn.net/Articles/568602/
At the risk of hijacking this discussion, here's my take on triggers:

I think the primary interface should be to allow the disabling/enabling of
a specific event from other events.

From user-space it would be fd driven: add a perf attribute to allow a
specific event to set the state of another event if it triggers. The
'other event' would be an fd, similar to how group events are specified.

An 'off' trigger sets the state to 0 (disabled).
An 'on' trigger sets the state to 1 (enabled).
Using such a facility the measurement of deltas would need 3 events:

 - fd1: a cycles event that is created disabled

 - fd2: a kprobes event at the 'start' RIP, set to counting only,
        connected to fd1, setting state to '1'

 - fd3: a kprobes event at the 'stop' RIP, set to counting only,
        connected to fd1, setting state to '0'.

This way every time the (fd2) start-RIP kprobes event executes, the
trigger code sees that it's supposed to enable the (fd1) cycles event.
Every time the (fd3) stop-RIP kprobes event executes, the trigger code
sees that it's set to disable the (fd1) cycles event.
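
To make the state transitions concrete, here is a minimal user-space model
of the three-event setup described above. It is only a sketch: it is plain
Python, not kernel code, and it does not use any existing perf ABI. The
names Event, target, target_state and measure_delta are hypothetical, and
the "trace" of (RIP, cycle cost) pairs is an assumed stand-in for real
execution.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Event:
    """Toy stand-in for a perf event (hypothetical, not the perf ABI)."""
    name: str
    enabled: bool = True
    count: int = 0
    # Hypothetical trigger attributes: the event whose state we set,
    # and the state to set it to (True = '1'/on, False = '0'/off).
    target: Optional["Event"] = None
    target_state: Optional[bool] = None

    def fire(self, amount: int = 1) -> None:
        """Count the event and, if it is a trigger, set the target state."""
        if not self.enabled:
            return
        self.count += amount
        if self.target is not None:
            self.target.enabled = bool(self.target_state)


def measure_delta(trace: Iterable[Tuple[int, int]],
                  start_rip: int, stop_rip: int) -> Tuple[int, int, int]:
    """Replay (rip, cycle_cost) pairs through the three-event setup.

    Returns (cycles counted, start probe hits, stop probe hits).
    """
    cycles = Event("cycles", enabled=False)                          # fd1
    start = Event("kprobe:start", target=cycles, target_state=True)  # fd2
    stop = Event("kprobe:stop", target=cycles, target_state=False)   # fd3

    for rip, cost in trace:
        # Assumption of this model: probes fire before the instruction
        # at their RIP runs, so the counted window is [start, stop).
        if rip == start_rip:
            start.fire()
        if rip == stop_rip:
            stop.fire()
        cycles.fire(cost)
    return cycles.count, start.count, stop.count
```

With the trace [(0x10, 5), (0x20, 3), (0x24, 4), (0x30, 2), (0x40, 7)],
start RIP 0x20 and stop RIP 0x30, the model counts 3 + 4 = 7 cycles and one
hit on each probe. Swapping the cycles event for another counter only
changes what "cost" means in the trace.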

Instead of 'cycles event', it could count instructions, or page faults, or
cache misses.

( If the (fd1) cycles event is a sampling event then this would allow nice
  things like the profiling of individual functions within the context of
  a specific system call, driven by triggers. )

In theory we could allow self-referential triggers as well: the first
execution of the trigger would disable itself. If the trigger state is not
on/off but a counter then this would allow 'take 100 samples then shut
off' type of functionality as well.
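
The counter variant can be modelled the same way. The sketch below is again
a plain Python toy with hypothetical names (SamplingEvent, remaining,
take_samples), not an existing perf interface: the event's trigger points
at itself and decrements a counter-valued state, and the event shuts off
when that state reaches zero.

```python
from typing import Iterable, List


class SamplingEvent:
    """Toy sampling event with a self-referential, counter-valued trigger."""

    def __init__(self, budget: int) -> None:
        # Hypothetical counter state: number of samples still allowed.
        self.remaining = budget
        self.samples: List[int] = []

    @property
    def enabled(self) -> bool:
        return self.remaining > 0

    def overflow(self, ip: int) -> bool:
        """Record one sample at `ip`; return False once the event is off."""
        if not self.enabled:
            return False
        self.samples.append(ip)
        # Self-referential trigger: the event adjusts its own state.
        self.remaining -= 1
        return True


def take_samples(ips: Iterable[int], budget: int = 100) -> List[int]:
    """Feed sample points to the event and return what it recorded."""
    event = SamplingEvent(budget)
    for ip in ips:
        event.overflow(ip)
    return event.samples
```

A budget of 1 gives the on/off case from the text, where the first
execution of the trigger disables the event; a budget of 100 gives the
'take 100 samples then shut off' behaviour.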

But success primarily depends on how useful the tooling UI turns out to
be: create a nice Slang or GTK UI for kprobes and triggers, and/or turn it
into a really intuitive command line UI, and people will use it.

I think annotated assembly/source output is a really nice match for
triggers and kprobes, so I'd suggest the Slang TUI route ...

Thanks,
Ingo