uprobes && pre-filtering

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* uprobes && pre-filtering
@ 2012-11-05 19:04 Oleg Nesterov
  2012-11-06  9:05 ` Srikar Dronamraju
  0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2012-11-05 19:04 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli, Anton Arapov, David Smith,
	Frank Ch. Eigler, Ingo Molnar, Josh Stone, Peter Zijlstra,
	Srikar Dronamraju, Suzuki K. Poulose
  Cc: linux-kernel

Hello.

There is a known (and by design) problem with uprobes. They act
systemwide, there is no pre-filtering. Just some random thoughts
to provoke the discussion.

- I think that the current uprobe_consumer->filter(task) should die.

  It buys nothing. It is called right before ->handler() and this is
  pointless, ->handler() can call it itself.

  And more importantly, I do not think this hook (with the same
  semantics) can/should be used by both register and handler_chain().

- We should change register_ and and mmap to filter out the tasks
  which we do not want to trace. To simplify, lets forget about
  multiple consumers first.

  Everything is simple, install_breakpoint() callers should simply
  call consumer->filter(args) and do nothing if it returns false.

  The main problem is, what should be passed as "args". I think it is
  pointless to use task_struct as an argument. And not because there
  is no simple way to find all tasks which use this particular mm in
  register_for_each_vma(), even if it was possible I think this makes
  no sense.

  If 2 tasks/threads share the same mm they will share int3 as well,
  so I think we need

  	enum filter_mode {
  		UPROBE_FILTER_REGISTER,
  		UPROBE_FILTER_MMAP,
  		/* more */
  	};

  	consumer->filter(enum filter_mode mode, struct mm_struct *mm);

  Sure, this does not allow to, say, probe the tasks with the given uid.
  But once again, currently we do not have for_each_mm_user(task, mm)
  and even if we implement it

  	a) ->filter(mm) can use it itself)

  	b) I do not think register_for_each_vma() should call it.

  	   Suppose that a consumer wants to track the task with the
  	   given pid PID. In this case ->filter() can simply check
  	   find_task_by_vpid(PID)->mm = mm. This is fast and simple.

  	   Or you want to probe all instances of /bin/ls. In this case
  	   filter() can check mm->exe_file->f_path == saved_path, this
  	   is very cheap.

  	   But if we add for_each_mm_user() into register_for_each_vma()
  	   it will be called even if the filtering is simple.

  So. If filter(UPROBE_FILTER_REGISTER) needs to check task_uid(task) == UID
  it has to do for_each_process() until we have (if ever) for_each_mm_user().

  Or it should always return true and remove the unnecessary breakpoints
  later, or use another API (see below).

  Also. Who needs the "nontrivial" filtering? I do not see any potential
  in-kernel user. And the tools like systemtap can take another approach
  (perhaps).

- Perhaps we should extend the API. We can add

	uprobe_apply(consumer, task, bool add_remove);

  which adds/removes breakpoints to task->mm.

  This way consumer can probe every task it wants to trace after
  uprobe_register().

  Its ->filter(UPROBE_FILTER_REGISTER) should simply return false. Or,
  better, we can split uprobe_register() into 2 functions,
  __uprobe_register() and uprobe_apply_all() which actually does
  register_for_each_vma().

  ***** QUESTION *****: perhaps this is all systemtap needs? ignoring
  UPROBE_FILTER_MMAP.

- Multiple consumers. uprobe_register/uprobe_unregister should be
  modified so that register_for_each_vma() is called every time when
  the new consumer comes or consumer goes away.

  uprobe_apply(add_remove => false) should obviously consult other
  consumers too.

- Perhaps we should teach handle_swbp() to remove the unwanted breakpoints.

  If every ->handler() returns, say, UPROBE_GO_AWAY handle_swbp() should
  remove this breakpoint. Of course, a consumer must be sure that if it
  returns UPROBE_GO_AWAY this task can't share ->mm with another task it
  wants to trace.

  Or consumer->handler() can do uprobe_apply(add_remove => false) itself,
  but this needs more discussion.

  The point is that if the filtering at UPROBE_FILTER_REGISTER time is
  not possible, ->filter(UPROBE_FILTER_REGISTER) can return true. A
  "wrong" int3 doesn't hurt until the task actually hits the breakpoint,
  and I think that a single bp-hit is fine performance-wise.

- fork(). The child inherits all breakpoints from parent, and uprobes
  can't control this. What can we do?

	* We can add another uprobe hook which does something like

	  	for_each_uprobe_in_each_vma(child_mm) {
	  		if (filter(UPROBE_FILTER_FORK))
	  			install_breakoint();
	  		else
	  			remove_breakpoint();
	  	}

	  But is is not clear where can we add this hook. dup_mmap()
	  looks appealing, but at this time the child is still under
	  construction, consumer->filter() can't look at task_struct.

	  And of course, it is not nice to slow down fork().

	* If we only care about the unwanted breakpoints, perhaps it
	  would be better to rely on UPROBE_GO_AWAY above?

	* Finally, do we care at all? Again, who can ever need to
	  re-install breakpoints after fork?

	  systemtap (iiuc) doesn't need this. And even if it does
	  or will need, I guess it can hook fork itself and use
	  uprobe_apply() ?

Please comment.

Oleg.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: uprobes && pre-filtering
  2012-11-05 19:04 uprobes && pre-filtering Oleg Nesterov
@ 2012-11-06  9:05 ` Srikar Dronamraju
  2012-11-06 17:02   ` Oleg Nesterov
  0 siblings, 1 reply; 5+ messages in thread
From: Srikar Dronamraju @ 2012-11-06  9:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ananth N Mavinakayanahalli, Anton Arapov, David Smith,
	Frank Ch. Eigler, Ingo Molnar, Josh Stone, Peter Zijlstra,
	Suzuki K. Poulose, linux-kernel

> Hello.
> 
> There is a known (and by design) problem with uprobes. They act
> systemwide, there is no pre-filtering. Just some random thoughts
> to provoke the discussion.
> 
> - I think that the current uprobe_consumer->filter(task) should die.
> 
>   It buys nothing. It is called right before ->handler() and this is
>   pointless, ->handler() can call it itself.
> 

I think it helps to call filter before handler.

When a tracer has specified a filter condition and then we run a handler
for cases that dont fit the handler looks a little odd.

Another reason for having the filters in the current way was to have a
set of standard filters in uprobes code so that all users dont need to
recreate these filters. 

Also please see below.

>   And more importantly, I do not think this hook (with the same
>   semantics) can/should be used by both register and handler_chain().
> 
> - We should change register_ and and mmap to filter out the tasks
>   which we do not want to trace. To simplify, lets forget about
>   multiple consumers first.
> 
>   Everything is simple, install_breakpoint() callers should simply
>   call consumer->filter(args) and do nothing if it returns false.
> 
>   The main problem is, what should be passed as "args". I think it is
>   pointless to use task_struct as an argument. And not because there
>   is no simple way to find all tasks which use this particular mm in
>   register_for_each_vma(), even if it was possible I think this makes
>   no sense.
> 

Having mm instead of task is fine with me. But this would mostly mean 
that the prefilter, i.e filter at the time of registration and the
filter at the time of handling should be different or we remove the
handling time filtering and expect the handler to do the filtering.

Having a task instead of mm helps in having the same filter run at both
places.

>   If 2 tasks/threads share the same mm they will share int3 as well,
>   so I think we need
> 
>   	enum filter_mode {
>   		UPROBE_FILTER_REGISTER,
>   		UPROBE_FILTER_MMAP,
>   		/* more */
>   	};
> 
>   	consumer->filter(enum filter_mode mode, struct mm_struct *mm);

what would a tracing session having filter_mode == UPROBE_FILTER_REGISTER mean?
	Does that mean it can trace only instances in currently mapped vmas? 
	Would it handle a probe instance in a process that is already running but has not yet mapped a vma?

Also would a tracing session having filter_mode == UPROBE_FILTER_MMAP
mean that it would only handle probepoints in only vmas that are going
to be mapped in future?

> 
>   Sure, this does not allow to, say, probe the tasks with the given uid.
>   But once again, currently we do not have for_each_mm_user(task, mm)
>   and even if we implement it
> 
>   	a) ->filter(mm) can use it itself)
> 
>   	b) I do not think register_for_each_vma() should call it.
> 
>   	   Suppose that a consumer wants to track the task with the
>   	   given pid PID. In this case ->filter() can simply check
>   	   find_task_by_vpid(PID)->mm = mm. This is fast and simple.
> 
>   	   Or you want to probe all instances of /bin/ls. In this case
>   	   filter() can check mm->exe_file->f_path == saved_path, this
>   	   is very cheap.
> 
>   	   But if we add for_each_mm_user() into register_for_each_vma()
>   	   it will be called even if the filtering is simple.
> 
>   So. If filter(UPROBE_FILTER_REGISTER) needs to check task_uid(task) == UID
>   it has to do for_each_process() until we have (if ever) for_each_mm_user().
> 
>   Or it should always return true and remove the unnecessary breakpoints
>   later, or use another API (see below).
> 
>   Also. Who needs the "nontrivial" filtering? I do not see any potential
>   in-kernel user. And the tools like systemtap can take another approach
>   (perhaps).
> 
> - Perhaps we should extend the API. We can add
> 
> 	uprobe_apply(consumer, task, bool add_remove);
> 
>   which adds/removes breakpoints to task->mm.
> 
>   This way consumer can probe every task it wants to trace after
>   uprobe_register().
> 
>   Its ->filter(UPROBE_FILTER_REGISTER) should simply return false. Or,
>   better, we can split uprobe_register() into 2 functions,
>   __uprobe_register() and uprobe_apply_all() which actually does
>   register_for_each_vma().
> 
>   ***** QUESTION *****: perhaps this is all systemtap needs? ignoring
>   UPROBE_FILTER_MMAP.
> 
> - Multiple consumers. uprobe_register/uprobe_unregister should be
>   modified so that register_for_each_vma() is called every time when
>   the new consumer comes or consumer goes away.
> 
>   uprobe_apply(add_remove => false) should obviously consult other
>   consumers too.


So in this case, would uprobe_register() just add a consumer to a
new/existing uprobe. The actual probe insertion is done by the
uprobe_apply()/uprobe_apply_all().  Also in this case, there is no
filter element in the uprobe_consumer. Right?

I am just wondering if people could use/misuse this 
for example: - somebody could keep modifying and reusing the consumers
but one probe registration,  with subsequent uprobe_apply(). 
Unlike now we could have lots of uprobes but all with defunct consumers.

> 
> - Perhaps we should teach handle_swbp() to remove the unwanted breakpoints.
> 
>   If every ->handler() returns, say, UPROBE_GO_AWAY handle_swbp() should
>   remove this breakpoint. Of course, a consumer must be sure that if it
>   returns UPROBE_GO_AWAY this task can't share ->mm with another task it
>   wants to trace.

Its a good idea to remove redundant probepoints from the breakpoint hit
context. However,

- even from the handlers() perspective, if we have to identify that this
  consumer isnt looking at this mm, we need something like
  for_each_mm_user(). No? 

- also if we are looking for removing a breakpoint, I would think its
  better done in common code, i.e uprobe code rather than keeping it in
  every handler. So I think keeping the filter logic before the handler
  is good.

> 
>   Or consumer->handler() can do uprobe_apply(add_remove => false) itself,
>   but this needs more discussion.
> 
>   The point is that if the filtering at UPROBE_FILTER_REGISTER time is
>   not possible, ->filter(UPROBE_FILTER_REGISTER) can return true. A
>   "wrong" int3 doesn't hurt until the task actually hits the breakpoint,
>   and I think that a single bp-hit is fine performance-wise.
> 

Agree. Also the reason why we inserted this probe in the current mm
might have changed and we might no more need the probe.
For example, traced thread in a multithreaded process actually died and
all other threads may not be interested

> - fork(). The child inherits all breakpoints from parent, and uprobes
>   can't control this. What can we do?
> 
> 	* We can add another uprobe hook which does something like
> 
> 	  	for_each_uprobe_in_each_vma(child_mm) {
> 	  		if (filter(UPROBE_FILTER_FORK))
> 	  			install_breakoint();
> 	  		else
> 	  			remove_breakpoint();
> 	  	}
> 
> 
> 	  But is is not clear where can we add this hook. dup_mmap()
> 	  looks appealing, but at this time the child is still under
> 	  construction, consumer->filter() can't look at task_struct.
> 
> 	  And of course, it is not nice to slow down fork().
> 
> 	* If we only care about the unwanted breakpoints, perhaps it
> 	  would be better to rely on UPROBE_GO_AWAY above?
> 
> 	* Finally, do we care at all? Again, who can ever need to
> 	  re-install breakpoints after fork?
> 
> 	  systemtap (iiuc) doesn't need this. And even if it does
> 	  or will need, I guess it can hook fork itself and use
> 	  uprobe_apply() ?


I was thinking if we can implement for_each_mm_user() using additional
mm->flags. 

Would something like this work?
- Set a new MMF_REPARENTED flag for a mm, when its thread either gets
  reparented or is called with CLONE_PARENT.

- for_each_mm_users would only be needed during uprobe_register /
  uprobe_mmap() if and only if there are filters.

- for_each_mm_users would mostly boil down to searching in children,
  siblings unless MMF_REPARENTED flag is set.
 
- If MMF_REPARENTED flag is set for a mm, and probe needs filtering,
  then we look at do_each_thread ..for_each_thread. This would mostly
  not be the common case. So I hope this should be okay. Not sure if we
  could look at further optimizing like looking at parent's children,
  init's children to see if any of them refer to mm.

-- 
Thanks and Regards
Srikar


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: uprobes && pre-filtering
  2012-11-06  9:05 ` Srikar Dronamraju
@ 2012-11-06 17:02   ` Oleg Nesterov
  2012-11-06 17:41     ` Josh Stone
  0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2012-11-06 17:02 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ananth N Mavinakayanahalli, Anton Arapov, David Smith,
	Frank Ch. Eigler, Ingo Molnar, Josh Stone, Peter Zijlstra,
	Suzuki K. Poulose, linux-kernel

On 11/06, Srikar Dronamraju wrote:
>
> > Hello.
> >
> > There is a known (and by design) problem with uprobes. They act
> > systemwide, there is no pre-filtering. Just some random thoughts
> > to provoke the discussion.
> >
> > - I think that the current uprobe_consumer->filter(task) should die.
> >
> >   It buys nothing. It is called right before ->handler() and this is
> >   pointless, ->handler() can call it itself.
> >
>
> I think it helps to call filter before handler.

How? The hanlder can simply call this function at the start.

But this is minor. I think this can double the work sometimes. If ->handler
needs to do something nontrivial, it will probably look into my_traced_tasks
database anyway to find the additional info which represents the tracee.
And most probably ->filter() will do the same lookup unnecessary.

> When a tracer has specified a filter condition and then we run a handler
> for cases that dont fit the handler looks a little odd.
>
> Another reason for having the filters in the current way was to have a
> set of standard filters in uprobes code so that all users dont need to
> recreate these filters.

IOW, you mean that both register_for_each_vma() and handler_chain() should
use the same ->filter() method? Personally I do not think they should.

Because the semantics is different imo. register_for_each_vma() needs
to know if we should modify mm and insert the breakpoint. handler_chain()
just tries to skip ->handler() (and for no reason, imho).

> >   And more importantly, I do not think this hook (with the same
> >   semantics) can/should be used by both register and handler_chain().
> >
> > - We should change register_ and and mmap to filter out the tasks
> >   which we do not want to trace. To simplify, lets forget about
> >   multiple consumers first.
> >
> >   Everything is simple, install_breakpoint() callers should simply
> >   call consumer->filter(args) and do nothing if it returns false.
> >
> >   The main problem is, what should be passed as "args". I think it is
> >   pointless to use task_struct as an argument. And not because there
> >   is no simple way to find all tasks which use this particular mm in
> >   register_for_each_vma(), even if it was possible I think this makes
> >   no sense.
> >
>
> Having mm instead of task is fine with me. But this would mostly mean
> that the prefilter, i.e filter at the time of registration and the
> filter at the time of handling should be different

Yes, I think they should be different, please see above.

We need the pre-filtering to avoid the unnecessary traps, not to avoid
the function call when the task already hit the breakpoint.

> or we remove the
> handling time filtering and expect the handler to do the filtering.

Yes. But once again, if ->handler() wants to use the same function it
can simply call it.

> Having a task instead of mm helps in having the same filter run at both
> places.

And this is where we start to disagree. Namely, I do not think that
register_for_each_vma() should even try to find mm user(s).

Because once again, a) whater register_for_each_vma() can do to iterate
the tasks can be done by ->filter, b) right now I do not see any potential
user who will need this so we should not overdesign this.

> >   If 2 tasks/threads share the same mm they will share int3 as well,
> >   so I think we need
> >
> >   	enum filter_mode {
> >   		UPROBE_FILTER_REGISTER,
> >   		UPROBE_FILTER_MMAP,
> >   		/* more */
> >   	};
> >
> >   	consumer->filter(enum filter_mode mode, struct mm_struct *mm);
>
> what would a tracing session having filter_mode == UPROBE_FILTER_REGISTER mean?
> 	Does that mean it can trace only instances in currently mapped vmas?
> 	Would it handle a probe instance in a process that is already running but has not yet mapped a vma?
>
> Also would a tracing session having filter_mode == UPROBE_FILTER_MMAP
> mean that it would only handle probepoints in only vmas that are going
> to be mapped in future?

No, no, you misunderstood. Sorry I was unclear and yes, the naming I used
was confusing.

I meant that both register_for_each_vma() and uprobe_mmap() (lets ignore
other callers and other modes) call the same ->filter() method but with
the different "mode" argument, UPROBE_FILTER_REGISTER or UPROBE_FILTER_MMAP.
This is only the hint, for example UPROBE_FILTER_MMAP can use current
but UPROBE_FILTER_REGISTER obviously can't.

> > - Perhaps we should extend the API. We can add
> >
> > 	uprobe_apply(consumer, task, bool add_remove);
> >
> >   which adds/removes breakpoints to task->mm.
> >
> >   This way consumer can probe every task it wants to trace after
> >   uprobe_register().
> >
> >   Its ->filter(UPROBE_FILTER_REGISTER) should simply return false. Or,
> >   better, we can split uprobe_register() into 2 functions,
> >   __uprobe_register() and uprobe_apply_all() which actually does
> >   register_for_each_vma().
> >
> >   ***** QUESTION *****: perhaps this is all systemtap needs? ignoring
> >   UPROBE_FILTER_MMAP.
> >
> > - Multiple consumers. uprobe_register/uprobe_unregister should be
> >   modified so that register_for_each_vma() is called every time when
> >   the new consumer comes or consumer goes away.
> >
> >   uprobe_apply(add_remove => false) should obviously consult other
> >   consumers too.
>
>
> So in this case, would uprobe_register() just add a consumer to a
> new/existing uprobe. The actual probe insertion is done by the
> uprobe_apply()/uprobe_apply_all().

Yes. Not sure we really need this, but to me this extension looks natural.

Frank, Josh, do you think it can help systemtap ?

Suppose that the consumer no longer wants to trace the task T but it
has other tasks to trace. If we only rely on ->filter, the tracer should
do unregister + register, this doesn't look good. Or it wants to add
the new task to its trace-list...

> Also in this case, there is no
> filter element in the uprobe_consumer. Right?

Why? it could be there.

> I am just wondering if people could use/misuse this
> for example: - somebody could keep modifying and reusing the consumers
> but one probe registration,  with subsequent uprobe_apply().

Yes, exactly.

> Unlike now we could have lots of uprobes but all with defunct consumers.

sorry, can't understand...

> > - Perhaps we should teach handle_swbp() to remove the unwanted breakpoints.
> >
> >   If every ->handler() returns, say, UPROBE_GO_AWAY handle_swbp() should
> >   remove this breakpoint. Of course, a consumer must be sure that if it
> >   returns UPROBE_GO_AWAY this task can't share ->mm with another task it
> >   wants to trace.
>
> Its a good idea to remove redundant probepoints from the breakpoint hit
> context. However,
>
> - even from the handlers() perspective, if we have to identify that this
>   consumer isnt looking at this mm, we need something like
>   for_each_mm_user(). No?

Yes and no, I think.

Yes, _in general_ it is needed. (although once again, where is potential
users?).

But in the simple case - no. For example, filter-by-pid should return
UPROBE_GO_AWAY if current->mm != find_task_by_vpid(PID)->mm.

> - also if we are looking for removing a breakpoint, I would think its
>   better done in common code, i.e uprobe code rather than keeping it in
>   every handler. So I think keeping the filter logic before the handler
>   is good.

Again, I do not understand why do you think so. OK, I won't really insist,
we can add UBPROBE_FILTER_BP_HIT or whatever. Still I do not understand
why should we mix remove-this-breakpoint and do-not-call-hanlder. And
note that ->handler() has to do addtional checks anyway.

Anyway. Do you agree that filter's arguments should be changed? If yes,
then we should remove handler_chain()->filter(), then discuss why we
should add it back and which semantics it should have. Agreed?

> >   Or consumer->handler() can do uprobe_apply(add_remove => false) itself,
> >   but this needs more discussion.
> >
> >   The point is that if the filtering at UPROBE_FILTER_REGISTER time is
> >   not possible, ->filter(UPROBE_FILTER_REGISTER) can return true. A
> >   "wrong" int3 doesn't hurt until the task actually hits the breakpoint,
> >   and I think that a single bp-hit is fine performance-wise.
> >
>
> Agree. Also the reason why we inserted this probe in the current mm
> might have changed and we might no more need the probe.
> For example, traced thread in a multithreaded process actually died and
> all other threads may not be interested

Yes.

> > - fork(). The child inherits all breakpoints from parent, and uprobes
> >   can't control this. What can we do?
> >
> > 	* We can add another uprobe hook which does something like
> >
> > 	  	for_each_uprobe_in_each_vma(child_mm) {
> > 	  		if (filter(UPROBE_FILTER_FORK))
> > 	  			install_breakoint();
> > 	  		else
> > 	  			remove_breakpoint();
> > 	  	}
> >
> >
> > 	  But is is not clear where can we add this hook. dup_mmap()
> > 	  looks appealing, but at this time the child is still under
> > 	  construction, consumer->filter() can't look at task_struct.
> >
> > 	  And of course, it is not nice to slow down fork().
> >
> > 	* If we only care about the unwanted breakpoints, perhaps it
> > 	  would be better to rely on UPROBE_GO_AWAY above?
> >
> > 	* Finally, do we care at all? Again, who can ever need to
> > 	  re-install breakpoints after fork?
> >
> > 	  systemtap (iiuc) doesn't need this. And even if it does
> > 	  or will need, I guess it can hook fork itself and use
> > 	  uprobe_apply() ?
>
>
> I was thinking if we can implement for_each_mm_user() using additional
> mm->flags.

(I guess you didn't meant it can help to solve the problem with fork)

I do not know if we can (or should) implement for_each_mm_user().

My main point was, the core uprobes should not care. Once again,
who do you think will need it? systemtap doesn't afaics, neither
debug/tracing/uprobe_events.

And once again, even if we have to implement it for some new tricky
tracer, why its ->filter can not do for_each_mm_user() itself?

>  Would something like this work?
>  ...
>
> - for_each_mm_users would mostly boil down to searching in children,
>   siblings unless MMF_REPARENTED flag is set.

Whose children? I don't think mm->owner is always the root of the
tree. And what about the locking? I do not think we want to call
->filter under (say) tasklist.

But again, again. Why do you think register_for_each_vma() should do
this? IOW, why should we think about for_each_mm_user() at all (at
least right now) ?

Oleg.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: uprobes && pre-filtering
  2012-11-06 17:02   ` Oleg Nesterov
@ 2012-11-06 17:41     ` Josh Stone
  2012-11-06 18:20       ` Oleg Nesterov
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Stone @ 2012-11-06 17:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Srikar Dronamraju, Ananth N Mavinakayanahalli, Anton Arapov,
	David Smith, Frank Ch. Eigler, Ingo Molnar, Peter Zijlstra,
	Suzuki K. Poulose, linux-kernel

On 11/06/2012 09:02 AM, Oleg Nesterov wrote:
> On 11/06, Srikar Dronamraju wrote:
>> Another reason for having the filters in the current way was to have a
>> set of standard filters in uprobes code so that all users dont need to
>> recreate these filters.
> 
> IOW, you mean that both register_for_each_vma() and handler_chain() should
> use the same ->filter() method? Personally I do not think they should.
> 
> Because the semantics is different imo. register_for_each_vma() needs
> to know if we should modify mm and insert the breakpoint. handler_chain()
> just tries to skip ->handler() (and for no reason, imho).

I agree here - I don't see much use for filter(), and even if you have
it, the semantics are definitely different than prefilter().  You could
do all sorts of localized and temporal checks in filter() that make no
sense for a prefilter(), like check for a particular thread id, filter
every N hits, or filter only hits under calls to some foo().

[...]
>>> - Perhaps we should extend the API. We can add
>>>
>>> 	uprobe_apply(consumer, task, bool add_remove);
>>>
>>>   which adds/removes breakpoints to task->mm.
>>>
>>>   This way consumer can probe every task it wants to trace after
>>>   uprobe_register().
>>>
>>>   Its ->filter(UPROBE_FILTER_REGISTER) should simply return false. Or,
>>>   better, we can split uprobe_register() into 2 functions,
>>>   __uprobe_register() and uprobe_apply_all() which actually does
>>>   register_for_each_vma().
>>>
>>>   ***** QUESTION *****: perhaps this is all systemtap needs? ignoring
>>>   UPROBE_FILTER_MMAP.
>>>
>>> - Multiple consumers. uprobe_register/uprobe_unregister should be
>>>   modified so that register_for_each_vma() is called every time when
>>>   the new consumer comes or consumer goes away.
>>>
>>>   uprobe_apply(add_remove => false) should obviously consult other
>>>   consumers too.
>>
>>
>> So in this case, would uprobe_register() just add a consumer to a
>> new/existing uprobe. The actual probe insertion is done by the
>> uprobe_apply()/uprobe_apply_all().
> 
> Yes. Not sure we really need this, but to me this extension looks natural.
> 
> Frank, Josh, do you think it can help systemtap ?

Yes, I think this sounds closer to systemtap's model of probing.  We
already track tasks that come and go to see which are "interesting", so
we could easily call apply() at that time.  We actually watch mmaps too,
so I think we could apply() for that case as well.

We wouldn't even need filtering functions at all in this mode.  But
maybe other consumers could still use it, like perf.

However, it's not clear to me what value there is in uprobe_register, if
you always have to apply it too.  The modes are something like:

1. uprobe_register(); uprobe_apply_all();
2. uprobe_register(); uprobe_apply(); [...]

And the second could have a bunch of idle registrations waiting for the
first applicable task to come around.  So why not instead:

1. uprobe_register_all();
2. uprobe_register_task(); [...]

In this case, the second would have to allow the same consumer to be
repeated on different tasks, but it feels more natural to me.

Josh

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: uprobes && pre-filtering
  2012-11-06 17:41     ` Josh Stone
@ 2012-11-06 18:20       ` Oleg Nesterov
  0 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2012-11-06 18:20 UTC (permalink / raw)
  To: Josh Stone
  Cc: Srikar Dronamraju, Ananth N Mavinakayanahalli, Anton Arapov,
	David Smith, Frank Ch. Eigler, Ingo Molnar, Peter Zijlstra,
	Suzuki K. Poulose, linux-kernel

On 11/06, Josh Stone wrote:
>
> On 11/06/2012 09:02 AM, Oleg Nesterov wrote:
> >>> - Perhaps we should extend the API. We can add
> >>>
> >>> 	uprobe_apply(consumer, task, bool add_remove);
> >>>
> >>>   which adds/removes breakpoints to task->mm.
> >>>
> >>>   This way consumer can probe every task it wants to trace after
> >>>   uprobe_register().
> >>>
> >>>   Its ->filter(UPROBE_FILTER_REGISTER) should simply return false. Or,
> >>>   better, we can split uprobe_register() into 2 functions,
> >>>   __uprobe_register() and uprobe_apply_all() which actually does
> >>>   register_for_each_vma().
> >>>
> >>>   ***** QUESTION *****: perhaps this is all systemtap needs? ignoring
> >>>   UPROBE_FILTER_MMAP.
> >>>
> >> So in this case, would uprobe_register() just add a consumer to a
> >> new/existing uprobe. The actual probe insertion is done by the
> >> uprobe_apply()/uprobe_apply_all().
> >
> > Yes. Not sure we really need this, but to me this extension looks natural.
> >
> > Frank, Josh, do you think it can help systemtap ?
>
> Yes, I think this sounds closer to systemtap's model of probing.  We
> already track tasks that come and go to see which are "interesting", so
> we could easily call apply() at that time.  We actually watch mmaps too,
> so I think we could apply() for that case as well.

OK, thanks.

(just in case, mmap is different, but lets ignore this now).

> We wouldn't even need filtering functions at all in this mode.  But
> maybe other consumers could still use it, like perf.

Of course, we need ->filter() anyway.

> However, it's not clear to me what value there is in uprobe_register, if
> you always have to apply it too.  The modes are something like:
>
> 1. uprobe_register(); uprobe_apply_all();
> 2. uprobe_register(); uprobe_apply(); [...]

No, no, sorry for confusion.

I meant we could add __uprobe_register() (or whatever) which doesn't actually
insert the breakpoint. So if the tracer relies on uprobe_apply() it can avoid
the costly register_for_each_vma/filter and do __uprobe_register + apply.

This is not strictly necessary even if we add uprobe_apply*, and you can
always use uprobe_register() (or uprobe_register_all as you denoted it
below).

> first applicable task to come around.  So why not instead:
>
> 1. uprobe_register_all();
> 2. uprobe_register_task(); [...]
>
> In this case, the second would have to allow the same consumer to be
> repeated on different tasks, but it feels more natural to me.

This can work too.

But uprobe_unregister_task() doesn't look very clear. What should it
do? IOW, you still need uprobe_unregister_all() and this doesn't look
symmetrical.

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-11-06 18:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-05 19:04 uprobes && pre-filtering Oleg Nesterov
2012-11-06  9:05 ` Srikar Dronamraju
2012-11-06 17:02   ` Oleg Nesterov
2012-11-06 17:41     ` Josh Stone
2012-11-06 18:20       ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox