From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
prasad@linux.vnet.ibm.com,
Linus Torvalds <torvalds@linux-foundation.org>,
Mathieu Desnoyers <compudj@krystal.dyndns.org>,
"Frank Ch. Eigler" <fche@redhat.com>,
David Wilder <dwilder@us.ibm.com>,
hch@lst.de, Martin Bligh <mbligh@google.com>,
Christoph Hellwig <hch@infradead.org>,
Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH v5] Unified trace buffer
Date: Fri, 26 Sep 2008 14:31:30 -0300
Message-ID: <20080926173130.GE15446@ghostprotocols.net>
In-Reply-To: <alpine.DEB.1.10.0809261245420.21618@gandalf.stny.rr.com>
On Fri, Sep 26, 2008 at 01:11:57PM -0400, Steven Rostedt wrote:
>
> [
> Note the removal of the RFC in the subject.
> I am happy with this version. It handles everything I need
> for ftrace.
>
> New since last version:
>
> - Fixed timing bug. I did not add the deltas properly when
> reading the buffer.
>
> - Removed "-1" time stamp normalize test. This made the
> clock go backwards!
>
> - Removed page pointer array and replaced it with the ftrace
> page struct link list trick. Since this is my second time
> writing this code (first with ftrace), it is actually much
> cleaner than the ftrace code.
>
> - Implemented buffer resizing. By using the page link list trick,
> this became much simpler.
>
> Note, the GTOD part is still not implemented, but can be done
> later without affecting this interface.
>
> ]
>
> This is a unified tracing buffer that implements a ring buffer that
> hopefully everyone will eventually be able to use.
>
> The events recorded into the buffer have the following structure:
>
> struct ring_buffer_event {
> u32 type:2, len:3, time_delta:27;
> u32 array[];
> };
>
> The minimum size of an event is 8 bytes. All events are 4 byte
> aligned inside the buffer.
>
> There are 4 types (all internal use for the ring buffer, only
> the data type is exported to the interface users).
>
> RB_TYPE_PADDING: this type is used to note extra space at the end
> of a buffer page.
>
> RB_TYPE_TIME_EXTENT: This type is used when the time between events
> is greater than the 27 bit delta can hold. We add another
> 32 bits, and record that in its own event (8 byte size).
>
> RB_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
> help keep the buffer timestamps in sync.
>
> RB_TYPE_DATA: The event actually holds user data.
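
To illustrate the RB_TYPE_TIME_EXTENT case described above, here is a rough user-space sketch. The quoted text says only that "another 32 bits" are added when the 27 bit delta overflows; how the bits are divided between the header and the extra word is my assumption, and the rb_delta_* names are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define RB_DELTA_BITS 27
#define RB_DELTA_MAX  ((1ULL << RB_DELTA_BITS) - 1)

/* Hypothetical helper: does the delta fit in the 27-bit header field? */
static int rb_delta_fits(uint64_t delta)
{
	return delta <= RB_DELTA_MAX;
}

/*
 * Assumed split: the low 27 bits stay in the header field and the next
 * 32 bits go into the extra word, so 59 bits can be represented in total.
 */
static void rb_delta_split(uint64_t delta, uint32_t *low27, uint32_t *ext32)
{
	assert((delta >> (RB_DELTA_BITS + 32)) == 0);
	*low27 = (uint32_t)(delta & RB_DELTA_MAX);
	*ext32 = (uint32_t)(delta >> RB_DELTA_BITS);
}

/* Inverse of rb_delta_split(): rebuild the full delta when reading. */
static uint64_t rb_delta_join(uint32_t low27, uint32_t ext32)
{
	return ((uint64_t)ext32 << RB_DELTA_BITS) | low27;
}
```

A reader would use rb_delta_join() on a time extent event before adding the result to its running time stamp.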
>
> The "len" field is only three bits. Since the data must be
> 4 byte aligned, this field is shifted left by 2, giving a
> max length of 28 bytes. If the data load is greater than 28
> bytes, the first array field holds the full length of the
> data load and the len field is set to zero.
>
> Example, data size of 7 bytes:
>
> type = RB_TYPE_DATA
> len = 2
> time_delta: <time-stamp> - <prev_event-time-stamp>
> array[0..1]: <7 bytes of data> <1 byte empty>
>
> This event is saved in 12 bytes of the buffer.
>
> An event with 82 bytes of data:
>
> type = RB_TYPE_DATA
> len = 0
> time_delta: <time-stamp> - <prev_event-time-stamp>
> array[0]: 84 (Note the alignment)
> array[1..21]: <82 bytes of data> <2 bytes empty>
>
> The above event is saved in 92 bytes (if my math is correct).
> 82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
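
The size arithmetic in the two examples above can be checked with a small user-space sketch. It follows only the rules stated in the description (4 byte alignment, a 3 bit len field shifted left by 2, a full-length word in array[0] for larger payloads); the helper names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define RB_ALIGNMENT 4u
#define RB_HDR_SIZE  4u   /* the u32 holding type, len and time_delta */
#define RB_MAX_SMALL 28u  /* largest payload len can describe: 7 << 2 */

/* Round a payload length up to the 4 byte alignment. */
static uint32_t rb_align(uint32_t n)
{
	return (n + RB_ALIGNMENT - 1) & ~(RB_ALIGNMENT - 1);
}

/* Value stored in the 3-bit len field; 0 means "length is in array[0]". */
static uint32_t rb_len_field(uint32_t data_len)
{
	uint32_t padded = rb_align(data_len);

	assert(data_len > 0);
	return padded <= RB_MAX_SMALL ? padded >> 2 : 0;
}

/* Bytes the event occupies in the buffer, header included. */
static uint32_t rb_event_size(uint32_t data_len)
{
	uint32_t padded = rb_align(data_len);
	uint32_t size = RB_HDR_SIZE + padded;

	assert(data_len > 0);
	if (padded > RB_MAX_SMALL)
		size += 4;	/* array[0] carries the full length */
	return size;
}
```

With these rules, rb_event_size(7) gives 12 and rb_event_size(82) gives 92, matching the numbers quoted above, so the math does look correct.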
>
> Do not reference the above event struct directly. Use the following
> functions to gain access to the event table, since the
> ring_buffer_event structure may change in the future.
>
> ring_buffer_event_length(event): get the length of the event.
> This is the size of the memory used to record this
> event, and not the size of the data payload.
>
> ring_buffer_time_delta(event): get the time delta of the event
> This returns the delta time stamp since the last event.
> Note: Even though this is in the header, there should
> be no reason to access this directly, except
> for debugging.
>
> ring_buffer_event_data(event): get the data from the event
> This is the function to use to get the actual data
> from the event. Note, it is only a pointer to the
> data inside the buffer. This data must be copied to
> another location otherwise you risk it being written
> over in the buffer.
>
> ring_buffer_lock: A way to lock the entire buffer.
> ring_buffer_unlock: unlock the buffer.
>
> ring_buffer_alloc: create a new ring buffer. Can choose between
> overwrite or consumer/producer mode. Overwrite will
> overwrite old data, whereas consumer/producer will
> throw away new data if the consumer catches up with the
> producer. The consumer/producer is the default.
>
> ring_buffer_free: free the ring buffer.
>
> ring_buffer_resize: resize the buffer. Changes the size of each cpu
> buffer. Note, it is up to the caller to ensure that
> the buffer is not being used while this is happening.
> This requirement may go away but do not count on it.
>
> ring_buffer_lock_reserve: locks the ring buffer and allocates an
> entry on the buffer to write to.
> ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
> the buffer.
>
> ring_buffer_write: writes some data into the ring buffer.
>
> ring_buffer_peek: Look at the next item in the cpu buffer.
> ring_buffer_consume: get the next item in the cpu buffer and
> consume it. That is, this function increments the head
> pointer.
>
> ring_buffer_read_start: Start an iterator of a cpu buffer.
> For now, this disables the cpu buffer, until you issue
> a finish. This is just because we do not want the iterator
> to be overwritten. This restriction may change in the future.
> But note, this is used for static reading of a buffer which
> is usually done "after" a trace. Live readings would want
> to use the ring_buffer_consume above, which will not
> disable the ring buffer.
>
> ring_buffer_read_finish: Finishes the read iterator and reenables
> the ring buffer.
>
> ring_buffer_iter_peek: Look at the next item in the cpu iterator.
> ring_buffer_read: Read the iterator and increment it.
> ring_buffer_iter_reset: Reset the iterator to point to the beginning
> of the cpu buffer.
> ring_buffer_iter_empty: Returns true if the iterator is at the end
> of the cpu buffer.
>
> ring_buffer_size: returns the size in bytes of each cpu buffer.
> Note, the real size is this times the number of CPUs.
>
> ring_buffer_reset_cpu: Sets the cpu buffer to empty
> ring_buffer_reset: sets all cpu buffers to empty
>
> ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
> cpu buffer of another buffer. This is handy when you
> want to take a snapshot of a running trace on just one
> cpu. Having a backup buffer, to swap with facilitates this.
> Ftrace max latencies use this.
>
> ring_buffer_empty: Returns true if the ring buffer is empty.
> ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
>
> ring_buffer_record_disable: disable all cpu buffers (read only)
> ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
> ring_buffer_record_enable: enable all cpu buffers.
> ring_buffer_record_enable_cpu: enable a single cpu buffer.
>
> ring_buffer_entries: The number of entries in a ring buffer.
> ring_buffer_overruns: The number of entries removed due to writing wrap.
>
> ring_buffer_time_stamp: Get the time stamp used by the ring buffer
> ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
> into nanosecs.
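
The difference between the overwrite and consumer/producer modes, and what peek versus consume do to the head pointer, can be shown with a toy single-threaded model. This is only a sketch of the behaviour described in the list above: the toy_* names, the fixed slot count and the int payload are all made up for illustration, and none of it reflects the locking or per-cpu page layout of the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_ring {
	int    slot[TOY_SLOTS];
	size_t head;      /* next slot to read */
	size_t count;     /* unread entries */
	size_t overruns;  /* old entries lost to wrapping writes */
	bool   overwrite; /* true: overwrite mode; false: consumer/producer */
};

static void toy_init(struct toy_ring *r, bool overwrite)
{
	assert(r != NULL);
	r->head = 0;
	r->count = 0;
	r->overruns = 0;
	r->overwrite = overwrite;
}

/* Returns false when the new entry is thrown away (consumer/producer mode). */
static bool toy_write(struct toy_ring *r, int v)
{
	if (r->count == TOY_SLOTS) {
		if (!r->overwrite)
			return false;
		/* Overwrite mode: drop the oldest entry to make room. */
		r->head = (r->head + 1) % TOY_SLOTS;
		r->count--;
		r->overruns++;
	}
	r->slot[(r->head + r->count) % TOY_SLOTS] = v;
	r->count++;
	return true;
}

/* Look at the next entry without advancing the head. */
static bool toy_peek(const struct toy_ring *r, int *v)
{
	if (r->count == 0)
		return false;
	*v = r->slot[r->head];
	return true;
}

/* Return the next entry and advance the head past it. */
static bool toy_consume(struct toy_ring *r, int *v)
{
	if (!toy_peek(r, v))
		return false;
	r->head = (r->head + 1) % TOY_SLOTS;
	r->count--;
	return true;
}
```

Writing six entries into a four slot ring in overwrite mode leaves the last four and counts two overruns; in consumer/producer mode the first four stay and the last two writes are rejected.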
>
> I still need to implement the GTOD feature. But we need support from
> the cpu frequency infrastructure. But this can be done at a later
> time without affecting the ring buffer interface.
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> include/linux/ring_buffer.h | 178 +++++
> kernel/trace/Kconfig | 4
> kernel/trace/Makefile | 1
> kernel/trace/ring_buffer.c | 1491 ++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 1674 insertions(+)
>
> Index: linux-trace.git/include/linux/ring_buffer.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-trace.git/include/linux/ring_buffer.h 2008-09-25 21:29:16.000000000 -0400
> @@ -0,0 +1,178 @@
> +#ifndef _LINUX_RING_BUFFER_H
> +#define _LINUX_RING_BUFFER_H
> +
> +#include <linux/mm.h>
> +#include <linux/seq_file.h>
> +
> +struct ring_buffer;
> +struct ring_buffer_iter;
> +
> +/*
> + * Don't reference this struct directly, use the inline items below.
> + */
> +struct ring_buffer_event {
> + u32 type:2, len:3, time_delta:27;
> + u32 array[];
> +} __attribute__((__packed__));
Why do you need __packed__ here? With or without it the layout is the
same:
[acme@doppio examples]$ pahole packed
struct ring_buffer_event {
	u32                        type:2;               /*     0:30  4 */
	u32                        len:3;                /*     0:27  4 */
	u32                        time_delta:27;        /*     0: 0  4 */
	u32                        array[0];             /*     4     0 */

	/* size: 4, cachelines: 1, members: 4 */
	/* last cacheline: 4 bytes */
};
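
The same point can be checked at compile time in user space. This is a sketch assuming GCC or Clang with C11; the two struct names and the u32 typedef are stand-ins for the kernel definitions. It compares only size and member offset, and says nothing about the alignment requirement, which packed can also affect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the kernel type */

struct event_plain {
	u32 type:2, len:3, time_delta:27;
	u32 array[];
};

struct event_packed {
	u32 type:2, len:3, time_delta:27;
	u32 array[];
} __attribute__((__packed__));

/* The three bitfields fill exactly one u32, so there is no padding to remove. */
static_assert(sizeof(struct event_plain) == 4, "plain header is 4 bytes");
static_assert(sizeof(struct event_packed) == 4, "packed header is 4 bytes");
static_assert(offsetof(struct event_plain, array) == 4, "array at offset 4");
static_assert(offsetof(struct event_packed, array) == 4, "array at offset 4");
```

If any of these assertions failed the file would not compile, so a clean build is the result.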
- Arnaldo