public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Pavel Machek <pavel@ucw.cz>
Cc: Roland Dreier <rdreier@cisco.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	Paul Mackerras <paulus@samba.org>,
	Anton Blanchard <anton@samba.org>,
	general@lists.openfabrics.org, akpm@linux-foundation.org,
	torvalds@linux-foundation.org
Subject: Re: [ofa-general] Re: [GIT PULL] please pull ummunotify
Date: Wed, 30 Sep 2009 11:44:56 +0200
Message-ID: <20090930094456.GD24621@elte.hu>
In-Reply-To: <20090929171332.GD14405@elf.ucw.cz>


* Pavel Machek <pavel@ucw.cz> wrote:

> On Thu 2009-09-17 08:45:29, Roland Dreier wrote:
> > 
> >
[...]
> > OK.  It would be nice to tie into something more general, but I 
> > think I agree -- perf counters are missing the filtering and the "no 
> > lost events" that ummunotify does have. [...]

Performance events filtering is being worked on, and now with the proper 
non-DoS limit you've added you can lose events too, can't you? So it's 
all a question of how much buffering to add - and with perf events too 
you can buffer an arbitrarily large number of events.

> > [...]  And I'm not sure it's worth messing up the perf counters 
> > design just to jam one more not totally related thing in.

Nobody suggested details for any redesign yet (so far it seems like a 
perfect match, to me at least) so i'm wondering what messup you are 
referring to.

> I believe that extending perf counters to do what you want is better 
> than adding one more, very strange, user<->kernel interface.

Agreed.

Lemme react to the original description of the code:

>     git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify
>
> This will get "ummunotify," a new character device that allows a 
> userspace library to register for MMU notifications; this is 
> particularly useful for MPI implementations (message passing libraries 
> used in HPC) to be able to keep track of what wacky things consumers 
> do to their memory mappings.

I test-pulled this code and had a look at it.

I think this could be done in a simpler, less limited, more generic, 
more useful form by using some variation of perf events.

You should be able to get all that you want by adding two TRACE_EVENT() 
tracepoints and using the existing perf event syscall to get the events 
to user-space.

Meaning that this:

  9 files changed, 1060 insertions(+), 1 deletions(-)

Would be replaced with something like:

  2 files changed, 100 insertions(+), 0 deletions(-)

[ the +100 lines would (roughly) add tracepoints to 
  invalidate_page and invalidate_range_start (possibly via 
  mmu_notifier_register(), like the ummunotify code does). Most of that 
  linecount would be comments. ]
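
[ To illustrate the user-space side: a rough sketch, not code from this 
  thread. It assumes the two tracepoints exist and that the caller has 
  already read the numeric tracepoint id from the tracepoint's "id" file 
  under the tracing events directory. init_tracepoint_attr() and 
  open_tracepoint_event() are made-up helper names. perf_event_open() 
  has no glibc wrapper, hence the raw syscall(). ]

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Describe one tracepoint as a perf event. tracepoint_id is supplied by
 * the caller (assumption: read from the tracepoint's "id" file).
 */
static void init_tracepoint_attr(struct perf_event_attr *attr,
                                 uint64_t tracepoint_id)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->size = sizeof(*attr);
    attr->config = tracepoint_id;
    attr->sample_period = 1;    /* emit a record for every hit */
    attr->sample_type = PERF_SAMPLE_RAW | PERF_SAMPLE_TIME;
    attr->disabled = 1;         /* caller enables it once set up */
}

/*
 * Open the event for one task on any CPU. Returns the event fd,
 * or -1 with errno set.
 */
static int open_tracepoint_event(uint64_t tracepoint_id, pid_t pid)
{
    struct perf_event_attr attr;

    init_tracepoint_attr(&attr, tracepoint_id);
    return (int)syscall(SYS_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

[ The returned fd would then be mmap()ed to read the sample records; 
  that part is omitted here. ]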

Another upside, beyond the reduction in complexity, is that we'd have one 
less special char-driver-based ABI. Which is a big plus in my opinion, 
especially if this goes towards HPC folks and if it's used for real. Why 
should such an MM capability be hidden behind a character device and an 
ioctl?

The perf event approach is beneficial to non-HPC as well: MM 
instrumentation for example - page range invalidates are interesting to 
all sorts of modes of analysis.

A question: what is the typical size/scope of the rbtree of the watched 
regions of memory in practical (test) deployments of the ummunotify 
code?

Per tracepoint filtering is possible via the perf event patches Li Zefan 
has posted to lkml recently, under this subject:

   [PATCH 0/6] perf trace: Add filter support

They are still being worked on, but it's very clear that flexible 
in-kernel filtering support will be a natural part of the perf event 
design in the very near future, so if that alone is your reason not to 
use it, it would be better if you helped us complete/test the filter 
support and use that, instead of a parallel framework.

Or if that's not desirable or not possible, or if there's any other 
technical roadblock, i'd like to know the particulars of that.
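
[ A rough sketch of how a per-region filter could look from user-space, 
  assuming the tracepoint exports fields named "start" and "end" (both 
  names are hypothetical) and a kernel whose perf_event.h provides 
  PERF_EVENT_IOC_SET_FILTER. build_overlap_filter() and 
  set_overlap_filter() are made-up helper names. An invalidated range 
  overlaps a watched range [start, end) when it begins before the 
  watched end and ends after the watched start. ]

```c
#include <linux/perf_event.h>
#include <stdio.h>
#include <sys/ioctl.h>

/*
 * Write a filter expression that matches invalidations overlapping
 * [start, end). Returns 0 on success, -1 if buf is too small.
 */
static int build_overlap_filter(char *buf, size_t len,
                                unsigned long start, unsigned long end)
{
    int n = snprintf(buf, len, "start < %lu && end > %lu", end, start);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Attach the filter to an already opened tracepoint event fd. */
static int set_overlap_filter(int perf_fd,
                              unsigned long start, unsigned long end)
{
    char filter[128];

    if (build_overlap_filter(filter, sizeof(filter), start, end) < 0)
        return -1;
    return ioctl(perf_fd, PERF_EVENT_IOC_SET_FILTER, filter);
}
```

[ This covers a single watched range per event fd. Watching many ranges 
  would need either one event per range or a longer "||" expression, 
  which is where the rbtree size question above matters. ]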

Thanks,

	Ingo
