public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq
@ 2009-12-08 19:54 Steven Rostedt
  2009-12-08 19:54 ` [PATCH 1/2][RFC] [PATCH 1/2] tracing: Change trace_seq to use separate buffer Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Steven Rostedt @ 2009-12-08 19:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Li Zefan,
	Lai Jiangshan

This is an RFC patch set. The trace_seq currently has its buffer
within the structure itself. But this limits its ability and efficiency.

This patch set separates it out, but now it requires the callers
to supply their own buffer. But this helps out the splice code because
it can now write directly into the splice pages.

This may be too much for 33? But it is a nice fix.

Thoughs?

The following patches are in:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

    branch: rfc/tracing/core


Steven Rostedt (2):
      tracing: Change trace_seq to use separate buffer
      tracing: Write directly into splice page for trace_pipe

----
 include/linux/ftrace_event.h |    6 ++++-
 include/linux/trace_seq.h    |   22 +++++++++++++++-
 include/trace/ftrace.h       |   20 +++++++++++----
 kernel/trace/ftrace.c        |    3 +-
 kernel/trace/trace.c         |   54 ++++++++++++++++-------------------------
 kernel/trace/trace_events.c  |   45 +++++++++++++++++++++++++++++++----
 kernel/trace/trace_ksym.c    |   10 +++++++-
 kernel/trace/trace_output.c  |   24 ++++++++++--------
 8 files changed, 125 insertions(+), 59 deletions(-)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2][RFC] [PATCH 1/2] tracing: Change trace_seq to use separate buffer
  2009-12-08 19:54 [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Steven Rostedt
@ 2009-12-08 19:54 ` Steven Rostedt
  2009-12-08 19:54 ` [PATCH 2/2][RFC] [PATCH 2/2] tracing: Write directly into splice page for trace_pipe Steven Rostedt
  2009-12-09  9:44 ` [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Lai Jiangshan
  2 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2009-12-08 19:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Li Zefan,
	Lai Jiangshan

[-- Attachment #1: 0001-tracing-Change-trace_seq-to-use-separate-buffer.patch --]
[-- Type: text/plain, Size: 16855 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Currently, the trace_seq buffer is part of the trace_seq structure.
This makes manipulating the trace_seq easier, but it also limits
its ability. In some cases, it is advantageous to have trace_seq
write into a separate buffer. Separating the buffer from the structure
makes the usage of trace_seq a little more complex, but it also
makes it more efficient.

The splice code will then be able to write directly into the splice
page as suppose to write into the trace_seq buffer and copying a page
worth of data.

Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/ftrace_event.h |    6 ++++-
 include/linux/trace_seq.h    |   22 ++++++++++++++++++-
 include/trace/ftrace.h       |   20 ++++++++++++++----
 kernel/trace/ftrace.c        |    3 +-
 kernel/trace/trace.c         |   24 +++++++++++++++------
 kernel/trace/trace_events.c  |   45 +++++++++++++++++++++++++++++++++++++----
 kernel/trace/trace_ksym.c    |   10 ++++++++-
 kernel/trace/trace_output.c  |   24 ++++++++++++----------
 8 files changed, 121 insertions(+), 33 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 38f8d65..1921cc0 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -10,7 +10,10 @@ struct trace_array;
 struct tracer;
 struct dentry;
 
+#define FTRACE_SEQ_BUFSIZE PAGE_SIZE
+
 DECLARE_PER_CPU(struct trace_seq, ftrace_event_seq);
+DECLARE_PER_CPU(unsigned char[FTRACE_SEQ_BUFSIZE], ftrace_event_buffer);
 
 struct trace_print_flags {
 	unsigned long		mask;
@@ -53,9 +56,10 @@ struct trace_iterator {
 	struct mutex		mutex;
 	struct ring_buffer_iter	*buffer_iter[NR_CPUS];
 	unsigned long		iter_flags;
+	struct trace_seq	seq;
+	unsigned char		buffer[FTRACE_SEQ_BUFSIZE];
 
 	/* The below is zeroed out in pipe_read */
-	struct trace_seq	seq;
 	struct trace_entry	*ent;
 	int			leftover;
 	int			cpu;
diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index dd38678..72238e7 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -11,18 +11,36 @@
  */
 
 struct trace_seq {
-	unsigned char		buffer[PAGE_SIZE];
 	unsigned int		len;
 	unsigned int		readpos;
 	int			full;
+	int			buflen;
+	unsigned char		*buffer;
 };
 
 static inline void
-trace_seq_init(struct trace_seq *s)
+trace_seq_init(struct trace_seq *s,
+	       unsigned char *buffer, int buflen)
 {
 	s->len = 0;
 	s->readpos = 0;
 	s->full = 0;
+	s->buflen = buflen;
+	s->buffer = buffer;
+}
+
+static inline void trace_seq_reset(struct trace_seq *s)
+{
+	WARN_ON_ONCE(!s->buffer);
+
+	s->len = 0;
+	s->readpos = 0;
+	s->full = 0;
+}
+
+static inline unsigned char *trace_seq_buffer(struct trace_seq *s)
+{
+	return s->buffer;
 }
 
 /*
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index d1b3de9..de03116 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -266,6 +266,7 @@ ftrace_format_##name(struct ftrace_event_call *unused,			\
  *	struct ftrace_raw_<call> *field; <-- defined in stage 1
  *	struct trace_entry *entry;
  *	struct trace_seq *p;
+ *	unsigned char *buffer;
  *	int ret;
  *
  *	entry = iter->ent;
@@ -278,7 +279,8 @@ ftrace_format_##name(struct ftrace_event_call *unused,			\
  *	field = (typeof(field))entry;
  *
  *	p = get_cpu_var(ftrace_event_seq);
- *	trace_seq_init(p);
+ *	buffer = get_cpu_var(ftrace_event_buffer);
+ *	trace_seq_init(p, buffer, FTRACE_SEQ_BUFSIZE);
  *	ret = trace_seq_printf(s, <TP_printk> "\n");
  *	put_cpu();
  *	if (!ret)
@@ -331,7 +333,9 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
 	struct ftrace_raw_##call *field;				\
 	struct trace_entry *entry;					\
 	struct trace_seq *p;						\
+	unsigned char *buffer;						\
 	int ret;							\
+	int cpu;							\
 									\
 	entry = iter->ent;						\
 									\
@@ -342,8 +346,10 @@ ftrace_raw_output_id_##call(int event_id, const char *name,		\
 									\
 	field = (typeof(field))entry;					\
 									\
-	p = &get_cpu_var(ftrace_event_seq);				\
-	trace_seq_init(p);						\
+	cpu = get_cpu();						\
+	p = &per_cpu(ftrace_event_seq, cpu);				\
+	buffer = per_cpu(ftrace_event_buffer, cpu);			\
+	trace_seq_init(p, buffer, FTRACE_SEQ_BUFSIZE);			\
 	ret = trace_seq_printf(s, "%s: ", name);			\
 	if (ret)							\
 		ret = trace_seq_printf(s, print);			\
@@ -372,7 +378,9 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags)	\
 	struct ftrace_raw_##template *field;				\
 	struct trace_entry *entry;					\
 	struct trace_seq *p;						\
+	unsigned char *buffer;						\
 	int ret;							\
+	int cpu;							\
 									\
 	entry = iter->ent;						\
 									\
@@ -383,8 +391,10 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags)	\
 									\
 	field = (typeof(field))entry;					\
 									\
-	p = &get_cpu_var(ftrace_event_seq);				\
-	trace_seq_init(p);						\
+	cpu = get_cpu();						\
+	p = &per_cpu(ftrace_event_seq, cpu);				\
+	buffer = per_cpu(ftrace_event_buffer, cpu);			\
+	trace_seq_init(p, buffer, FTRACE_SEQ_BUFSIZE);			\
 	ret = trace_seq_printf(s, "%s: ", #call);			\
 	if (ret)							\
 		ret = trace_seq_printf(s, print);			\
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index e51a1bc..3118503 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -383,6 +383,7 @@ static int function_stat_show(struct seq_file *m, void *v)
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	static DEFINE_MUTEX(mutex);
 	static struct trace_seq s;
+	static char s_buffer[PAGE_SIZE];
 	unsigned long long avg;
 #endif
 
@@ -395,7 +396,7 @@ static int function_stat_show(struct seq_file *m, void *v)
 	do_div(avg, rec->counter);
 
 	mutex_lock(&mutex);
-	trace_seq_init(&s);
+	trace_seq_init(&s, s_buffer, PAGE_SIZE);
 	trace_print_graph_duration(rec->time, &s);
 	trace_seq_puts(&s, "    ");
 	trace_print_graph_duration(avg, &s);
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 596dcf2..df25c4f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2001,6 +2001,7 @@ __tracing_open(struct inode *inode, struct file *file)
 	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
 	if (!iter)
 		return ERR_PTR(-ENOMEM);
+	trace_seq_init(&iter->seq, iter->buffer, FTRACE_SEQ_BUFSIZE);
 
 	/*
 	 * We make a copy of the current tracer to avoid concurrent
@@ -2873,6 +2874,7 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
 		ret = -ENOMEM;
 		goto out;
 	}
+	trace_seq_init(&iter->seq, iter->buffer, FTRACE_SEQ_BUFSIZE);
 
 	/*
 	 * We make a copy of the current tracer to avoid concurrent
@@ -3046,7 +3048,7 @@ tracing_read_pipe(struct file *filp, char __user *ubuf,
 	if (sret != -EBUSY)
 		return sret;
 
-	trace_seq_init(&iter->seq);
+	trace_seq_reset(&iter->seq);
 
 	/* copy the tracer to avoid using a global lock all around */
 	mutex_lock(&trace_types_lock);
@@ -3083,10 +3085,11 @@ waitagain:
 		cnt = PAGE_SIZE - 1;
 
 	/* reset all but tr, trace, and overruns */
-	memset(&iter->seq, 0,
+	memset(&iter->ent, 0,
 	       sizeof(struct trace_iterator) -
-	       offsetof(struct trace_iterator, seq));
+	       offsetof(struct trace_iterator, ent));
 	iter->pos = -1;
+	trace_seq_reset(&iter->seq);
 
 	trace_event_read_lock();
 	while (find_next_entry_inc(iter) != NULL) {
@@ -3110,7 +3113,7 @@ waitagain:
 	/* Now copy what we have to the user */
 	sret = trace_seq_to_user(&iter->seq, ubuf, cnt);
 	if (iter->seq.readpos >= iter->seq.len)
-		trace_seq_init(&iter->seq);
+		trace_seq_reset(&iter->seq);
 
 	/*
 	 * If there was nothing to send to user, inspite of consuming trace
@@ -3250,7 +3253,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 		partial[i].offset = 0;
 		partial[i].len = iter->seq.len;
 
-		trace_seq_init(&iter->seq);
+		trace_seq_reset(&iter->seq);
 	}
 
 	trace_event_read_unlock();
@@ -3748,13 +3751,19 @@ tracing_stats_read(struct file *filp, char __user *ubuf,
 	unsigned long cpu = (unsigned long)filp->private_data;
 	struct trace_array *tr = &global_trace;
 	struct trace_seq *s;
+	unsigned char *buffer;
 	unsigned long cnt;
 
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	cnt = ring_buffer_entries_cpu(tr->buffer, cpu);
 	trace_seq_printf(s, "entries: %ld\n", cnt);
@@ -3767,6 +3776,7 @@ tracing_stats_read(struct file *filp, char __user *ubuf,
 
 	count = simple_read_from_buffer(ubuf, count, ppos, s->buffer, s->len);
 
+	free_page((unsigned long)buffer);
 	kfree(s);
 
 	return count;
@@ -4296,7 +4306,7 @@ trace_printk_seq(struct trace_seq *s)
 
 	printk(KERN_TRACE "%s", s->buffer);
 
-	trace_seq_init(s);
+	trace_seq_reset(s);
 }
 
 static void __ftrace_dump(bool disable_tracing)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 1d18315..4959e2d 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -529,6 +529,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 		  loff_t *ppos)
 {
 	struct ftrace_event_call *call = filp->private_data;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	char *buf;
 	int r;
@@ -539,8 +540,13 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	/* If any of the first writes fail, so will the show_format. */
 
@@ -563,6 +569,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 	r = simple_read_from_buffer(ubuf, cnt, ppos,
 				    s->buffer, s->len);
  out:
+	free_page((unsigned long)buffer);
 	kfree(s);
 	return r;
 }
@@ -571,6 +578,7 @@ static ssize_t
 event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
 {
 	struct ftrace_event_call *call = filp->private_data;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	int r;
 
@@ -580,12 +588,18 @@ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 	trace_seq_printf(s, "%d\n", call->id);
 
 	r = simple_read_from_buffer(ubuf, cnt, ppos,
 				    s->buffer, s->len);
+	free_page((unsigned long)buffer);
 	kfree(s);
 	return r;
 }
@@ -595,6 +609,7 @@ event_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
 		  loff_t *ppos)
 {
 	struct ftrace_event_call *call = filp->private_data;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	int r;
 
@@ -604,12 +619,18 @@ event_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	print_event_filter(call, s);
 	r = simple_read_from_buffer(ubuf, cnt, ppos, s->buffer, s->len);
 
+	free_page((unsigned long)buffer);
 	kfree(s);
 
 	return r;
@@ -651,6 +672,7 @@ subsystem_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
 		      loff_t *ppos)
 {
 	struct event_subsystem *system = filp->private_data;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	int r;
 
@@ -660,12 +682,18 @@ subsystem_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	print_subsystem_event_filter(system, s);
 	r = simple_read_from_buffer(ubuf, cnt, ppos, s->buffer, s->len);
 
+	free_page((unsigned long)buffer);
 	kfree(s);
 
 	return r;
@@ -706,6 +734,7 @@ static ssize_t
 show_header(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
 {
 	int (*func)(struct trace_seq *s) = filp->private_data;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	int r;
 
@@ -715,12 +744,18 @@ show_header(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
 
-	trace_seq_init(s);
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	func(s);
 	r = simple_read_from_buffer(ubuf, cnt, ppos, s->buffer, s->len);
 
+	free_page((unsigned long)buffer);
 	kfree(s);
 
 	return r;
diff --git a/kernel/trace/trace_ksym.c b/kernel/trace/trace_ksym.c
index ddfa0fd..c4972e1 100644
--- a/kernel/trace/trace_ksym.c
+++ b/kernel/trace/trace_ksym.c
@@ -223,6 +223,7 @@ static ssize_t ksym_trace_filter_read(struct file *filp, char __user *ubuf,
 {
 	struct trace_ksym *entry;
 	struct hlist_node *node;
+	unsigned char *buffer;
 	struct trace_seq *s;
 	ssize_t cnt = 0;
 	int ret;
@@ -230,7 +231,13 @@ static ssize_t ksym_trace_filter_read(struct file *filp, char __user *ubuf,
 	s = kmalloc(sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
-	trace_seq_init(s);
+	buffer = (unsigned char *)__get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		kfree(s);
+		return -ENOMEM;
+	}
+
+	trace_seq_init(s, buffer, PAGE_SIZE);
 
 	mutex_lock(&ksym_tracer_mutex);
 
@@ -249,6 +256,7 @@ static ssize_t ksym_trace_filter_read(struct file *filp, char __user *ubuf,
 
 	mutex_unlock(&ksym_tracer_mutex);
 
+	free_page((unsigned long)buffer);
 	kfree(s);
 
 	return cnt;
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 8e46b33..78f9825 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -17,7 +17,9 @@
 DECLARE_RWSEM(trace_event_mutex);
 
 DEFINE_PER_CPU(struct trace_seq, ftrace_event_seq);
+DEFINE_PER_CPU(unsigned char[PAGE_SIZE], ftrace_event_buffer);
 EXPORT_PER_CPU_SYMBOL(ftrace_event_seq);
+EXPORT_PER_CPU_SYMBOL(ftrace_event_buffer);
 
 static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly;
 
@@ -25,7 +27,7 @@ static int next_event_type = __TRACE_LAST_TYPE + 1;
 
 int trace_print_seq(struct seq_file *m, struct trace_seq *s)
 {
-	int len = s->len >= PAGE_SIZE ? PAGE_SIZE - 1 : s->len;
+	int len = s->len >= s->buflen ? s->buflen - 1 : s->len;
 	int ret;
 
 	ret = seq_write(m, s->buffer, len);
@@ -35,7 +37,7 @@ int trace_print_seq(struct seq_file *m, struct trace_seq *s)
 	 * seq_file buffer.
 	 */
 	if (!ret)
-		trace_seq_init(s);
+		trace_seq_reset(s);
 
 	return ret;
 }
@@ -89,7 +91,7 @@ enum print_line_t trace_print_printk_msg_only(struct trace_iterator *iter)
 int
 trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
 {
-	int len = (PAGE_SIZE - 1) - s->len;
+	int len = (s->buflen - 1) - s->len;
 	va_list ap;
 	int ret;
 
@@ -126,7 +128,7 @@ EXPORT_SYMBOL_GPL(trace_seq_printf);
 int
 trace_seq_vprintf(struct trace_seq *s, const char *fmt, va_list args)
 {
-	int len = (PAGE_SIZE - 1) - s->len;
+	int len = (s->buflen - 1) - s->len;
 	int ret;
 
 	if (s->full || !len)
@@ -148,7 +150,7 @@ EXPORT_SYMBOL_GPL(trace_seq_vprintf);
 
 int trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary)
 {
-	int len = (PAGE_SIZE - 1) - s->len;
+	int len = (s->buflen - 1) - s->len;
 	int ret;
 
 	if (s->full || !len)
@@ -184,7 +186,7 @@ int trace_seq_puts(struct trace_seq *s, const char *str)
 	if (s->full)
 		return 0;
 
-	if (len > ((PAGE_SIZE - 1) - s->len)) {
+	if (len > ((s->buflen - 1) - s->len)) {
 		s->full = 1;
 		return 0;
 	}
@@ -200,7 +202,7 @@ int trace_seq_putc(struct trace_seq *s, unsigned char c)
 	if (s->full)
 		return 0;
 
-	if (s->len >= (PAGE_SIZE - 1)) {
+	if (s->len >= (s->buflen - 1)) {
 		s->full = 1;
 		return 0;
 	}
@@ -215,7 +217,7 @@ int trace_seq_putmem(struct trace_seq *s, const void *mem, size_t len)
 	if (s->full)
 		return 0;
 
-	if (len > ((PAGE_SIZE - 1) - s->len)) {
+	if (len > ((s->buflen - 1) - s->len)) {
 		s->full = 1;
 		return 0;
 	}
@@ -255,7 +257,7 @@ void *trace_seq_reserve(struct trace_seq *s, size_t len)
 	if (s->full)
 		return 0;
 
-	if (len > ((PAGE_SIZE - 1) - s->len)) {
+	if (len > ((s->buflen - 1) - s->len)) {
 		s->full = 1;
 		return NULL;
 	}
@@ -273,12 +275,12 @@ int trace_seq_path(struct trace_seq *s, struct path *path)
 	if (s->full)
 		return 0;
 
-	if (s->len >= (PAGE_SIZE - 1)) {
+	if (s->len >= (s->buflen - 1)) {
 		s->full = 1;
 		return 0;
 	}
 
-	p = d_path(path, s->buffer + s->len, PAGE_SIZE - s->len);
+	p = d_path(path, s->buffer + s->len, s->buflen - s->len);
 	if (!IS_ERR(p)) {
 		p = mangle_path(s->buffer + s->len, p, "\n");
 		if (p) {
-- 
1.6.5



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2][RFC] [PATCH 2/2] tracing: Write directly into splice page for trace_pipe
  2009-12-08 19:54 [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Steven Rostedt
  2009-12-08 19:54 ` [PATCH 1/2][RFC] [PATCH 1/2] tracing: Change trace_seq to use separate buffer Steven Rostedt
@ 2009-12-08 19:54 ` Steven Rostedt
  2009-12-09  9:44 ` [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Lai Jiangshan
  2 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2009-12-08 19:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Li Zefan,
	Lai Jiangshan

[-- Attachment #1: 0002-tracing-Write-directly-into-splice-page-for-trace_pi.patch --]
[-- Type: text/plain, Size: 2207 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Now that the trace_seq has a separate buffer space, we can use this
feature to write directly into the splice page, instead of writing
to the trace_seq buffer and then copying it. This has shown almost
a 4% improvement in reading trace_pipe via splice:

Average timing of reading 10megs from trace_pipe before patch:

real	0m1.619s
user	0m0.002s
sys	0m1.596s

After patch:

real	0m1.561s
user    0m0.002s
sys	0m1.535s

These were the average of five runs of reading 10megs.

Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c |   32 +++++---------------------------
 1 files changed, 5 insertions(+), 27 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index df25c4f..0ee4e0e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -468,25 +468,6 @@ ssize_t trace_seq_to_user(struct trace_seq *s, char __user *ubuf, size_t cnt)
 	return cnt;
 }
 
-static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
-{
-	int len;
-	void *ret;
-
-	if (s->len <= s->readpos)
-		return -EBUSY;
-
-	len = s->len - s->readpos;
-	if (cnt > len)
-		cnt = len;
-	ret = memcpy(buf, s->buffer + s->readpos, cnt);
-	if (!ret)
-		return -EFAULT;
-
-	s->readpos += cnt;
-	return cnt;
-}
-
 /*
  * ftrace_max_lock is used to protect the swapping of buffers
  * when taking a max snapshot. The buffers themselves are
@@ -3240,21 +3221,18 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 		if (!pages[i])
 			break;
 
-		rem = tracing_fill_pipe_page(rem, iter);
+		trace_seq_init(&iter->seq, page_address(pages[i]), PAGE_SIZE);
 
-		/* Copy the data into the page, so we can start over. */
-		ret = trace_seq_to_buffer(&iter->seq,
-					  page_address(pages[i]),
-					  iter->seq.len);
-		if (ret < 0) {
+		rem = tracing_fill_pipe_page(rem, iter);
+		if (!iter->seq.len) {
 			__free_page(pages[i]);
 			break;
 		}
+
 		partial[i].offset = 0;
 		partial[i].len = iter->seq.len;
-
-		trace_seq_reset(&iter->seq);
 	}
+	trace_seq_init(&iter->seq, &iter->buffer, FTRACE_SEQ_BUFSIZE);
 
 	trace_event_read_unlock();
 	mutex_unlock(&iter->mutex);
-- 
1.6.5



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq
  2009-12-08 19:54 [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Steven Rostedt
  2009-12-08 19:54 ` [PATCH 1/2][RFC] [PATCH 1/2] tracing: Change trace_seq to use separate buffer Steven Rostedt
  2009-12-08 19:54 ` [PATCH 2/2][RFC] [PATCH 2/2] tracing: Write directly into splice page for trace_pipe Steven Rostedt
@ 2009-12-09  9:44 ` Lai Jiangshan
  2009-12-09 16:26   ` Steven Rostedt
  2 siblings, 1 reply; 5+ messages in thread
From: Lai Jiangshan @ 2009-12-09  9:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Li Zefan

Steven Rostedt wrote:
> This is an RFC patch set. The trace_seq currently has its buffer
> within the structure itself. But this limits its ability and efficiency.
> 
> This patch set separates it out, but now it requires the callers
> to supply their own buffer. But this helps out the splice code because
> it can now write directly into the splice pages.
> 
> This may be too much for 33? But it is a nice fix.
> 
> Thoughs?
> 
> The following patches are in:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> 
>     branch: rfc/tracing/core
> 
> 
> Steven Rostedt (2):
>       tracing: Change trace_seq to use separate buffer
>       tracing: Write directly into splice page for trace_pipe

It's great that you wrote it and trace_pipe benefit from it.
And it show separate buffer is required.

But I didn't expect so much/big changes. I'm wondering that
we can use seq file directly for some files and
kill all kmalloc(sizeof(struct trace_seq)).

kernel/trace/trace.c:3720:      s = kmalloc(sizeof(*s), GFP_KERNEL);
We can use a short buffer + simple_read_from_buffer()
Or seq file.

kernel/trace/trace_events.c:539:        s = kmalloc(sizeof(*s), GFP_KERNEL);
We can use seq file after of my patchset(today) is applied.

kernel/trace/trace_events.c:580:        s = kmalloc(sizeof(*s), GFP_KERNEL);
we can use a shor buffer. "char buf[20]" + simple_read_from_buffer()

kernel/trace/trace_events.c:604:        s = kmalloc(sizeof(*s), GFP_KERNEL);
We can use a short buffer + simple_read_from_buffer()
Or seq file.

kernel/trace/trace_events.c:660:        s = kmalloc(sizeof(*s), GFP_KERNEL);
We can use a short buffer + simple_read_from_buffer()
Or seq file.

kernel/trace/trace_events.c:715:        s = kmalloc(sizeof(*s), GFP_KERNEL);
We can use seq file.

kernel/trace/trace_ksym.c:230:  s = kmalloc(sizeof(*s), GFP_KERNEL);
We should use seq! It's very bad that we use trace_seq.

For function_stat_show(), it uses static trace_seq. but it can also
use a short buffer + simple_read_from_buffer().

I will do this.(or you if you would like to)

So could you write a simpler version of separate-buffer-trace_seq
with supposition that all kmalloc(sizeof(struct trace_seq)) are removed.

Lai.

> 
> ----
>  include/linux/ftrace_event.h |    6 ++++-
>  include/linux/trace_seq.h    |   22 +++++++++++++++-
>  include/trace/ftrace.h       |   20 +++++++++++----
>  kernel/trace/ftrace.c        |    3 +-
>  kernel/trace/trace.c         |   54 ++++++++++++++++-------------------------
>  kernel/trace/trace_events.c  |   45 +++++++++++++++++++++++++++++++----
>  kernel/trace/trace_ksym.c    |   10 +++++++-
>  kernel/trace/trace_output.c  |   24 ++++++++++--------
>  8 files changed, 125 insertions(+), 59 deletions(-)
> 
> 
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq
  2009-12-09  9:44 ` [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Lai Jiangshan
@ 2009-12-09 16:26   ` Steven Rostedt
  0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2009-12-09 16:26 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker,
	Li Zefan

On Wed, 2009-12-09 at 17:44 +0800, Lai Jiangshan wrote:
> Steven Rostedt wrote:

> > Steven Rostedt (2):
> >       tracing: Change trace_seq to use separate buffer
> >       tracing: Write directly into splice page for trace_pipe
> 
> It's great that you wrote it and trace_pipe benefit from it.
> And it show separate buffer is required.
> 
> But I didn't expect so much/big changes. I'm wondering that

Yeah, I first did a patch (and I still have it) that keeps the trace_seq
structure as is. But then I decided to bite the bullet and just convert
it to a print helper.

> we can use seq file directly for some files and
> kill all kmalloc(sizeof(struct trace_seq)).
> 
> kernel/trace/trace.c:3720:      s = kmalloc(sizeof(*s), GFP_KERNEL);
> We can use a short buffer + simple_read_from_buffer()
> Or seq file.

Heh, yeah that usage of trace_seq was not necessary. It looks like a
handler for sprintf only.


> 
> kernel/trace/trace_events.c:539:        s = kmalloc(sizeof(*s), GFP_KERNEL);
> We can use seq file after of my patchset(today) is applied.

Yeah, sure.

> 
> kernel/trace/trace_events.c:580:        s = kmalloc(sizeof(*s), GFP_KERNEL);
> we can use a shor buffer. "char buf[20]" + simple_read_from_buffer()
> 
> kernel/trace/trace_events.c:604:        s = kmalloc(sizeof(*s), GFP_KERNEL);
> We can use a short buffer + simple_read_from_buffer()
> Or seq file.
> 
> kernel/trace/trace_events.c:660:        s = kmalloc(sizeof(*s), GFP_KERNEL);
> We can use a short buffer + simple_read_from_buffer()
> Or seq file.

Yep to the above.

> 
> kernel/trace/trace_events.c:715:        s = kmalloc(sizeof(*s), GFP_KERNEL);
> We can use seq file.

Hmm, this I'm not so sure about (my line numbers are off, this is
"show_header" right?).

The reason that I wrote trace_seq is to let functions write to whatever
buffer they want to. Not just a seq_file. This function calls other
functions that write to this buffer, and in the future, this buffer may
not be from a normal read.


> 
> kernel/trace/trace_ksym.c:230:  s = kmalloc(sizeof(*s), GFP_KERNEL);
> We should use seq! It's very bad that we use trace_seq.

I agree that seq_file should be used here, but I would not say it was
"very bad" to use trace_seq.

> 
> For function_stat_show(), it uses static trace_seq. but it can also
> use a short buffer + simple_read_from_buffer().

No it can't. It uses it to call trace_print_graph_duration which is used
for trace and trace_pipe (and others). This requires the trace_seq
structure.

> 
> I will do this.(or you if you would like to)

The ones that I said above are fine, you can do. Also split the patches
up per section. That is, the trace_seq -> small buffer +
simple_read_from_buffer should be a separate patch than the conversion
to seq_file.

Note, one reason others have used trace_seq over seq_file is that the
seq_file is a confusing interface. I constantly get it wrong. Helper
functions should be simplistic and intuitive, not something to spend
time debugging when it is suppose to be helping.

I agree with most of the above conversions, but I don't want to convert
all of them just because they can be. seq_file can limit their usage.

> 
> So could you write a simpler version of separate-buffer-trace_seq
> with supposition that all kmalloc(sizeof(struct trace_seq)) are removed.

Make the above changes (where I agreed), then I'll rewrite this patch on
top of yours.

OK?

Thanks!

-- Steve



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-12-09 16:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-08 19:54 [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Steven Rostedt
2009-12-08 19:54 ` [PATCH 1/2][RFC] [PATCH 1/2] tracing: Change trace_seq to use separate buffer Steven Rostedt
2009-12-08 19:54 ` [PATCH 2/2][RFC] [PATCH 2/2] tracing: Write directly into splice page for trace_pipe Steven Rostedt
2009-12-09  9:44 ` [PATCH 0/2][RFC] [RFC] tracing: separate out buffer from trace_seq Lai Jiangshan
2009-12-09 16:26   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox