All of lore.kernel.org
 help / color / mirror / Atom feed
From: Li Zefan <lizf@cn.fujitsu.com>
To: Tom Zanussi <tzanussi@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Jason Baron <jbaron@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/6] perf_counter: Add PERF_COUNTER_IOC_SET_FILTER ioctl
Date: Wed, 09 Sep 2009 10:18:40 +0800	[thread overview]
Message-ID: <4AA71080.7030503@cn.fujitsu.com> (raw)
In-Reply-To: <1252393315.10919.4.camel@tropicana>

>>>> Hrm,.. not at all sure about this.. what are the ABI implications?
>>> I think the ABI should be fine if it's always a sub-set of C syntax. 
>>> That would be C expressions initially. Hm?
>> Right, so I've no clue what filter expressions look like, and the
>> changelog doesn't help us at all. It doesn't mention its a well
>> considered decision to henceforth freeze the expression syntax.
>>
>> Of course, since filters so far only work with tracepoint things, and
>> since you can only come by tracepoint things through debugfs, and since
>> anything debugfs is basically a free-for-all ABI-less world, we might be
>> good, but then this is a very ill-defined ioctl() indeed.
>>
>> So please, consider this well -- there might not be a second chance.
>>
> 
> I've been meaning to write up something about the event filters - here's
> a first stab that hopefully helps explain them...
> 

Great!

Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>

Could you add your SOB and send it to Ingo?

Some nitpicks below:

> Tom
> 
> diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
> index 2bcc8d4..50fe510 100644
> --- a/Documentation/trace/events.txt
> +++ b/Documentation/trace/events.txt
> @@ -97,3 +97,160 @@ The format of this boot option is the same as described in section 2.1.
>  
>  See The example provided in samples/trace_events
>  
> +4. Event formats
> +================
> +
> +Each trace event has a 'format' file associated with it that contains
> +a description of each field in a logged event.  This information can
> +be used to parse the binary trace stream, and is also the place to
> +find the field names that can be used in event filters (see section 5
> +below).
> +
> +It also displays the format string that will be used to print the
> +event in text mode, along with the event name and ID used for
> +profiling.
> +
> +Every event has a set of 'common' fields associated with it; these are
> +the fields prefixed with 'common_'.  The other fields vary between
> +events and correspond to the fields defined in the TRACE_EVENT
> +definition for that event.
> +
> +Here's the information displayed for the 'sched_wakeup' event:
> +
> +# cat /debug/tracing/events/sched/sched_wakeup/format
> +
> +name: sched_wakeup
> +ID: 60
> +format:
> +	field:unsigned short common_type;	offset:0;	size:2;
> +	field:unsigned char common_flags;	offset:2;	size:1;
> +	field:unsigned char common_preempt_count;	offset:3;	size:1;
> +	field:int common_pid;	offset:4;	size:4;
> +	field:int common_tgid;	offset:8;	size:4;
> +
> +	field:char comm[TASK_COMM_LEN];	offset:12;	size:16;
> +	field:pid_t pid;	offset:28;	size:4;
> +	field:int prio;	offset:32;	size:4;
> +	field:int success;	offset:36;	size:4;
> +	field:int cpu;	offset:40;	size:4;
> +
> +print fmt: "task %s:%d [%d] success=%d [%03d]", REC->comm, REC->pid, REC->prio, REC->success, REC->cpu
> +
> +5. Event filtering
> +==================
> +

How about adding titles 5.1, 5.2, 5.3?

5.1 expression syntax

5.2 set a filter

5.3 clear a filter

5.4 subsystem filter

> +Trace events can be filtered in the kernel by associating boolean
> +'filter expressions' with them.  As soon as an event is logged into
> +the trace buffer, its fields are checked against the filter expression
> +associated with that event type.  An event with field values that
> +'match' the filter will appear in the trace output, and an event whose
> +values don't match will be discarded.  An event with no filter
> +associated with it matches everything, which is the default when no
> +filter has been set for an event.
> +
> +A filter expression consists of one or more 'predicates' that can be
> +combined using the logical operators '&&' and '||'.  A predicate is
> +simply a clause that compares the value of a field contained within a
> +logged event with a constant value and returns either 0 or 1 depending
> +on whether the field value matched (1) or didn't match (0):
> +
> +	  field-name relational-operator value
> +
> +Parentheses can be used to provide arbitrary logical groupings and
> +double-quotes can be used to prevent the shell from interpreting
> +operators as shell metacharacters.
> +
> +The field-names available for use in filters can be found in the
> +'format' files for trace events (see section 4 above).
> +
> +The relational-operators depend on the type of the field being tested:
> +
> +The operators available for numeric fields are:
> +    
> +==, !=, <, <=, >, >=
> +
> +And for string fields they are:
> +    
> +==, !=
> +
> +Currently, only exact string matches are supported.
> +
> +Currently, the maximum number of predicates in a filter is set at 16.
> +
> +A filter for an individual event is set by writing a filter expression
> +to the 'filter' file for the given event.
> +
> +For example:
> +
> +# cd /debug/tracing/events/sched/sched_wakeup
> +# echo "common_preempt_count > 4" > filter
> +
> +A slightly more involved example:
> +
> +# cd /debug/tracing/events/sched/sched_signal_send
> +# echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
> +
> +If there was an error in the expression, you'll get an 'Invalid
> +argument' error when setting it, and the erroneous string along with
> +an error message can be seen by looking at the filter e.g.:
> +
> +# cd /debug/tracing/events/sched/sched_signal_send
> +# echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter
> +-bash: echo: write error: Invalid argument
> +# cat filter
> +((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
> +^
> +parse_error: Field not found
> +
> +Currently the caret for an error always appears at the beginning of
> +the filter string; the error message should be still be useful though

s/should be still be/should still be

> +even without more accurate position info.
> +    
> +To clear a filter, write a '0' to the filter file.
> +
> +For convenience, filters for every event in a subsystem can be set or
> +cleared as a group by writing a filter expression into the filter file
> +at the root of the subsytem.  Note, however, that if a filter for any
> +event within the subsystem lacks a field specified in the subsystem
> +filter, or if the filter can't be applied for any other reason, the
> +filter for that event will retain its previous setting.  This can
> +result in an unintended mixture of filters which could lead to
> +confusing (to the user who might think different filters are in
> +effect) trace output.  Only filters that reference just the common
> +fields can be guaranteed to propagate successfully to all events.
> +
> +To clear the filters for all events in a subsystem, write a '0' to the
> +subsystem's filter file.
> +
> +Here are a few subsystem filter examples that also illustrate the
> +above points:
> +
> +Clear the filters on all events in the sched subsytem:
> +
> +# cd /sys/kernel/debug/tracing/events/sched
> +# echo 0 > filter
> +# cat sched_switch/filter
> +none
> +# cat sched_switch/filter
> +none
> +
> +Set a filter using only common fields for all events in the sched
> +subsytem (all events end up with the same filter):
> +
> +# cd /sys/kernel/debug/tracing/events/sched
> +# echo common_pid == 0 > filter
> +# cat sched_switch/filter
> +common_pid == 0
> +# cat sched_wakeup/filter
> +common_pid == 0
> +
> +Attempt to set a filter using a non-common field for all events in the
> +sched subsytem (all events but those that have a prev_pid field retain
> +their old filters):
> +
> +# cd /sys/kernel/debug/tracing/events/sched
> +# echo prev_pid == 0 > filter
> +# cat sched_switch/filter
> +prev_pid == 0
> +# cat sched_wakeup/filter
> +common_pid == 0
> 

  reply	other threads:[~2009-09-09  2:20 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-07  8:11 [PATCH 0/6] perf trace: Add filter support Li Zefan
2009-09-07  8:12 ` [PATCH 1/6] tracing/filters: refactor subsystem filter code Li Zefan
2009-09-07  8:12 ` [PATCH 2/6] tracing/profile: Add filter support Li Zefan
2009-09-08  2:01   ` Frederic Weisbecker
2009-09-08  2:24     ` Frederic Weisbecker
2009-09-08  8:35     ` Peter Zijlstra
2009-09-08 12:33       ` Frederic Weisbecker
2009-09-07  8:13 ` [PATCH 3/6] tracing/syscalls: Add profile " Li Zefan
2009-09-07  8:13 ` [PATCH 4/6] perf_counter: Add PERF_COUNTER_IOC_SET_FILTER ioctl Li Zefan
2009-09-07 16:44   ` Peter Zijlstra
2009-09-07 16:48     ` Ingo Molnar
2009-09-07 16:55       ` Peter Zijlstra
2009-09-08  0:49         ` Li Zefan
2009-09-08  6:52           ` Ingo Molnar
2009-09-08  8:37           ` Peter Zijlstra
2009-09-08  7:01         ` Tom Zanussi
2009-09-09  2:18           ` Li Zefan [this message]
2009-09-10  4:45             ` Tom Zanussi
2009-09-10 23:01           ` Randy Dunlap
2009-09-11  4:08             ` Tom Zanussi
2009-09-07  8:13 ` [PATCH 5/6] perf trace: increase MAX_EVENT_LENGTH Li Zefan
2009-09-07  8:14 ` [PATCH 6/6] perf trace: Add filter support Li Zefan
2009-09-08  0:02 ` [PATCH 0/6] " Frederic Weisbecker
2009-09-08  1:06   ` Li Zefan
2009-09-08  2:12     ` Frederic Weisbecker
2009-09-08  6:53       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AA71080.7030503@cn.fujitsu.com \
    --to=lizf@cn.fujitsu.com \
    --cc=fweisbec@gmail.com \
    --cc=jbaron@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tzanussi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.