linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] tracepoints: add tracepoint_call() to optimize tracepoints disabled
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
@ 2009-05-19 21:03 ` Jason Baron
  2009-05-19 21:03 ` [PATCH 2/3] tracepoints: convert scheduler tracepoints to 'tracepoint_call' api Jason Baron
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Jason Baron @ 2009-05-19 21:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: fweisbec, mingo, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche


Add a wrapper function call before the call to 'trace_##name'. This preserves
the type checking, while allowing arguments that are passed to the tracepoint
to not be evaluated until after the tracepoint on/off check is performed.
This reduces the cost when tracepoints are disabled.


Signed-off-by: Jason Baron <jbaron@redhat.com>


---
 include/linux/tracepoint.h |   20 +++++++++++++++++++-
 1 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 14df7e6..d747c78 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -65,7 +65,6 @@ struct tracepoint {
 	extern struct tracepoint __tracepoint_##name;			\
 	static inline void trace_##name(proto)				\
 	{								\
-		if (unlikely(__tracepoint_##name.state))		\
 			__DO_TRACE(&__tracepoint_##name,		\
 				TP_PROTO(proto), TP_ARGS(args));	\
 	}								\
@@ -78,6 +77,19 @@ struct tracepoint {
 		return tracepoint_probe_unregister(#name, (void *)probe);\
 	}
 
+/*
+ * Use this function in the instrumented code. This wrapper call to trace_##name
+ * is added so that arguments aren't evaluated, and do not have side-effects
+ * when a tracepoint is disabled. We still call the inline for type checking.
+ */
+
+#define tracepoint_call(name, ...)		     \
+do {						     \
+	extern struct tracepoint __tracepoint_##name;\
+	if (unlikely(__tracepoint_##name.state))     \
+		trace_##name(__VA_ARGS__);	     \
+} while (0)
+
 #define DEFINE_TRACE(name)						\
 	static const char __tpstrtab_##name[]				\
 	__attribute__((section("__tracepoints_strings"))) = #name;	\
@@ -108,6 +120,12 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
 		return -ENOSYS;						\
 	}
 
+#define tracepoint_call(name, ...)	  \
+do {					  \
+	if (0)				  \
+		trace_##name(__VA_ARGS__);\
+} while (0)
+
 #define DEFINE_TRACE(name)
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
 #define EXPORT_TRACEPOINT_SYMBOL(name)
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 0/3] tracepoints: delay argument evaluation
@ 2009-05-19 21:03 Jason Baron
  2009-05-19 21:03 ` [PATCH 1/3] tracepoints: add tracepoint_call() to optimize tracepoints disabled Jason Baron
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Jason Baron @ 2009-05-19 21:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: fweisbec, mingo, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche

hi,

After disassembling some of the tracepoints, I've noticed that arguments that
are passed as macros or that perform  dereferences, evaluate prior to the
tracepoint on/off check. This means that we are needlessly impacting the 
off case.

I am proposing to fix this by adding a macro that first checks for on/off and
then calls 'trace_##name', preserving type checking. Thus, callsites have to 
move from:

trace_block_bio_complete(md->queue, bio);

to:

tracepoint_call(block_bio_complete,  md->queue, bio);

I've tried '__always_inline', but that did not fix this issue. Obviously this
change will require changes to all the callsites. But, that shouldn't be
very hard, I've already included the scheduler and block changes with this
patch. I think its important to minimize code execution in the off case, and
thus going through all the callsites is well worth it. If we agree on this
change, I can change the rest in very short order.

Below I'm also showing the assembly in the 'dec_pending()' function before and
after this change to show the difference it makes. The arguments to the
tracepoint are as above, 'md->queue' and 'bio'. Notice the 2 extra instructions,
before the initial 'je', that could be moved after the 'je'.

before:

ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
ffffffff8137b2aa:       49 8b 45 50             mov    0x50(%r13),%rax
ffffffff8137b2ae:       48 89 45 d0             mov    %rax,-0x30(%rbp)
ffffffff8137b2b2:       74 1f                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2b4:       48 8b 1d d5 90 4b 00    mov    0x4b90d5(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi

after:

ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
ffffffff8137b2aa:       74 27                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2ac:       49 8b 45 50             mov    0x50(%r13),%rax
ffffffff8137b2b0:       48 8b 1d d9 90 4b 00    mov    0x4b90d9(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
ffffffff8137b2b7:       48 89 45 d0             mov    %rax,-0x30(%rbp)
ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi


thanks,

-Jason



Jason Baron (3):
  -add wrapper so we don't have argument resolution overhead
  -add scheduler wrapper calls
  -add block layer trace wrappers

 block/blk-core.c           |   27 ++++++++++++++-------------
 block/elevator.c           |    6 +++---
 drivers/md/dm.c            |    7 ++++---
 fs/bio.c                   |    2 +-
 include/linux/tracepoint.h |   20 +++++++++++++++++++-
 kernel/exit.c              |    6 +++---
 kernel/fork.c              |    2 +-
 kernel/kthread.c           |    4 ++--
 kernel/sched.c             |   10 +++++-----
 kernel/signal.c            |    2 +-
 mm/bounce.c                |    2 +-
 11 files changed, 54 insertions(+), 34 deletions(-)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/3] tracepoints: convert scheduler tracepoints to 'tracepoint_call' api
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
  2009-05-19 21:03 ` [PATCH 1/3] tracepoints: add tracepoint_call() to optimize tracepoints disabled Jason Baron
@ 2009-05-19 21:03 ` Jason Baron
  2009-05-19 21:03 ` [PATCH 3/3] tracepoints: convert block " Jason Baron
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Jason Baron @ 2009-05-19 21:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: fweisbec, mingo, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche


Convert the scheduler tracepoints to the new 'tracepoint_call' api.


Signed-off-by: Jason Baron <jbaron@redhat.com>


---
 kernel/exit.c    |    6 +++---
 kernel/fork.c    |    2 +-
 kernel/kthread.c |    4 ++--
 kernel/sched.c   |   10 +++++-----
 kernel/signal.c  |    2 +-
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 700d110..5bd103f 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -157,7 +157,7 @@ static void delayed_put_task_struct(struct rcu_head *rhp)
 #ifdef CONFIG_PERF_COUNTERS
 	WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
 #endif
-	trace_sched_process_free(tsk);
+	tracepoint_call(sched_process_free, tsk);
 	put_task_struct(tsk);
 }
 
@@ -964,7 +964,7 @@ NORET_TYPE void do_exit(long code)
 
 	if (group_dead)
 		acct_process();
-	trace_sched_process_exit(tsk);
+	tracepoint_call(sched_process_exit, tsk);
 
 	exit_sem(tsk);
 	exit_files(tsk);
@@ -1580,7 +1580,7 @@ static long do_wait(enum pid_type type, struct pid *pid, int options,
 	struct task_struct *tsk;
 	int retval;
 
-	trace_sched_process_wait(pid);
+	tracepoint_call(sched_process_wait, pid);
 
 	add_wait_queue(&current->signal->wait_chldexit,&wait);
 repeat:
diff --git a/kernel/fork.c b/kernel/fork.c
index 60a473e..2f718ae 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1396,7 +1396,7 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;
 
-		trace_sched_process_fork(current, p);
+		tracepoint_call(sched_process_fork, current, p);
 
 		nr = task_pid_vnr(p);
 
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 41c88fe..28ecded 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -204,7 +204,7 @@ int kthread_stop(struct task_struct *k)
 	/* It could exit after stop_info.k set, but before wake_up_process. */
 	get_task_struct(k);
 
-	trace_sched_kthread_stop(k);
+	tracepoint_call(sched_kthread_stop, k);
 
 	/* Must init completion *before* thread sees kthread_stop_info.k */
 	init_completion(&kthread_stop_info.done);
@@ -221,7 +221,7 @@ int kthread_stop(struct task_struct *k)
 	ret = kthread_stop_info.err;
 	mutex_unlock(&kthread_stop_lock);
 
-	trace_sched_kthread_stop_ret(ret);
+	tracepoint_call(sched_kthread_stop_ret, ret);
 
 	return ret;
 }
diff --git a/kernel/sched.c b/kernel/sched.c
index c036590..28c0f16 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1962,7 +1962,7 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 
 	clock_offset = old_rq->clock - new_rq->clock;
 
-	trace_sched_migrate_task(p, new_cpu);
+	tracepoint_call(sched_migrate_task, p, new_cpu);
 
 #ifdef CONFIG_SCHEDSTATS
 	if (p->se.wait_start)
@@ -2119,7 +2119,7 @@ unsigned long wait_task_inactive(struct task_struct *p, long match_state)
 		 * just go back and repeat.
 		 */
 		rq = task_rq_lock(p, &flags);
-		trace_sched_wait_task(rq, p);
+		tracepoint_call(sched_wait_task, rq, p);
 		running = task_running(rq, p);
 		on_rq = p->se.on_rq;
 		ncsw = 0;
@@ -2515,7 +2515,7 @@ out_activate:
 	}
 
 out_running:
-	trace_sched_wakeup(rq, p, success);
+	tracepoint_call(sched_wakeup, rq, p, success);
 	check_preempt_curr(rq, p, sync);
 
 	p->state = TASK_RUNNING;
@@ -2662,7 +2662,7 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
 		p->sched_class->task_new(rq, p);
 		inc_nr_running(rq);
 	}
-	trace_sched_wakeup_new(rq, p, 1);
+	tracepoint_call(sched_wakeup_new, rq, p, 1);
 	check_preempt_curr(rq, p, 0);
 #ifdef CONFIG_SMP
 	if (p->sched_class->task_wake_up)
@@ -2842,7 +2842,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	struct mm_struct *mm, *oldmm;
 
 	prepare_task_switch(rq, prev, next);
-	trace_sched_switch(rq, prev, next);
+	tracepoint_call(sched_switch, rq, prev, next);
 	mm = next->mm;
 	oldmm = prev->active_mm;
 	/*
diff --git a/kernel/signal.c b/kernel/signal.c
index dba6ae9..19596c1 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -828,7 +828,7 @@ static int __send_signal(int sig, struct siginfo *info, struct task_struct *t,
 	struct sigpending *pending;
 	struct sigqueue *q;
 
-	trace_sched_signal_send(sig, t);
+	tracepoint_call(sched_signal_send, sig, t);
 
 	assert_spin_locked(&t->sighand->siglock);
 
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] tracepoints: convert block tracepoints to 'tracepoint_call' api
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
  2009-05-19 21:03 ` [PATCH 1/3] tracepoints: add tracepoint_call() to optimize tracepoints disabled Jason Baron
  2009-05-19 21:03 ` [PATCH 2/3] tracepoints: convert scheduler tracepoints to 'tracepoint_call' api Jason Baron
@ 2009-05-19 21:03 ` Jason Baron
  2009-05-19 21:17 ` [PATCH 0/3] tracepoints: delay argument evaluation Mathieu Desnoyers
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Jason Baron @ 2009-05-19 21:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: fweisbec, mingo, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche


Convert the block trace tracepoints to the new 'tracepoint_call' api.


Signed-off-by: Jason Baron <jbaron@redhat.com>


---
 block/blk-core.c |   27 ++++++++++++++-------------
 block/elevator.c |    6 +++---
 drivers/md/dm.c  |    7 ++++---
 fs/bio.c         |    2 +-
 mm/bounce.c      |    2 +-
 5 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1306de9..1e10e0b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -223,7 +223,7 @@ void blk_plug_device(struct request_queue *q)
 
 	if (!queue_flag_test_and_set(QUEUE_FLAG_PLUGGED, q)) {
 		mod_timer(&q->unplug_timer, jiffies + q->unplug_delay);
-		trace_block_plug(q);
+		tracepoint_call(block_plug, q);
 	}
 }
 EXPORT_SYMBOL(blk_plug_device);
@@ -309,7 +309,7 @@ void blk_unplug_work(struct work_struct *work)
 	struct request_queue *q =
 		container_of(work, struct request_queue, unplug_work);
 
-	trace_block_unplug_io(q);
+	tracepoint_call(block_unplug_io, q);
 	q->unplug_fn(q);
 }
 
@@ -317,7 +317,7 @@ void blk_unplug_timeout(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *)data;
 
-	trace_block_unplug_timer(q);
+	tracepoint_call(block_unplug_timer, q);
 	kblockd_schedule_work(q, &q->unplug_work);
 }
 
@@ -327,7 +327,7 @@ void blk_unplug(struct request_queue *q)
 	 * devices don't necessarily have an ->unplug_fn defined
 	 */
 	if (q->unplug_fn) {
-		trace_block_unplug_io(q);
+		tracepoint_call(block_unplug_io, q);
 		q->unplug_fn(q);
 	}
 }
@@ -831,7 +831,7 @@ rq_starved:
 	if (ioc_batching(q, ioc))
 		ioc->nr_batch_requests--;
 
-	trace_block_getrq(q, bio, rw_flags & 1);
+	tracepoint_call(block_getrq, q, bio, rw_flags & 1);
 out:
 	return rq;
 }
@@ -857,7 +857,7 @@ static struct request *get_request_wait(struct request_queue *q, int rw_flags,
 		prepare_to_wait_exclusive(&rl->wait[is_sync], &wait,
 				TASK_UNINTERRUPTIBLE);
 
-		trace_block_sleeprq(q, bio, rw_flags & 1);
+		tracepoint_call(block_sleeprq, q, bio, rw_flags & 1);
 
 		__generic_unplug_device(q);
 		spin_unlock_irq(q->queue_lock);
@@ -937,7 +937,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
 {
 	blk_delete_timer(rq);
 	blk_clear_rq_complete(rq);
-	trace_block_rq_requeue(q, rq);
+	tracepoint_call(block_rq_requeue, q, rq);
 
 	if (blk_rq_tagged(rq))
 		blk_queue_end_tag(q, rq);
@@ -1178,7 +1178,7 @@ static int __make_request(struct request_queue *q, struct bio *bio)
 		if (!ll_back_merge_fn(q, req, bio))
 			break;
 
-		trace_block_bio_backmerge(q, bio);
+		tracepoint_call(block_bio_backmerge, q, bio);
 
 		req->biotail->bi_next = bio;
 		req->biotail = bio;
@@ -1197,7 +1197,7 @@ static int __make_request(struct request_queue *q, struct bio *bio)
 		if (!ll_front_merge_fn(q, req, bio))
 			break;
 
-		trace_block_bio_frontmerge(q, bio);
+		tracepoint_call(block_bio_frontmerge, q, bio);
 
 		bio->bi_next = req->bio;
 		req->bio = bio;
@@ -1276,7 +1276,7 @@ static inline void blk_partition_remap(struct bio *bio)
 		bio->bi_sector += p->start_sect;
 		bio->bi_bdev = bdev->bd_contains;
 
-		trace_block_remap(bdev_get_queue(bio->bi_bdev), bio,
+		tracepoint_call(block_remap, bdev_get_queue(bio->bi_bdev), bio,
 				    bdev->bd_dev,
 				    bio->bi_sector - p->start_sect);
 	}
@@ -1446,9 +1446,10 @@ static inline void __generic_make_request(struct bio *bio)
 			goto end_io;
 
 		if (old_sector != -1)
-			trace_block_remap(q, bio, old_dev, old_sector);
+			tracepoint_call(block_remap, q, bio,
+						old_dev, old_sector);
 
-		trace_block_bio_queue(q, bio);
+		tracepoint_call(block_bio_queue, q, bio);
 
 		old_sector = bio->bi_sector;
 		old_dev = bio->bi_bdev->bd_dev;
@@ -1737,7 +1738,7 @@ static int __end_that_request_first(struct request *req, int error,
 	int total_bytes, bio_nbytes, next_idx = 0;
 	struct bio *bio;
 
-	trace_block_rq_complete(req->q, req);
+	tracepoint_call(block_rq_complete, req->q, req);
 
 	/*
 	 * for a REQ_TYPE_BLOCK_PC request, we want to carry any eventual
diff --git a/block/elevator.c b/block/elevator.c
index 7073a90..0aa6cdb 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -618,7 +618,7 @@ void elv_insert(struct request_queue *q, struct request *rq, int where)
 	unsigned ordseq;
 	int unplug_it = 1;
 
-	trace_block_rq_insert(q, rq);
+	tracepoint_call(block_rq_insert, q, rq);
 
 	rq->q = q;
 
@@ -796,7 +796,7 @@ struct request *elv_next_request(struct request_queue *q)
 			 * not be passed by new incoming requests
 			 */
 			rq->cmd_flags |= REQ_STARTED;
-			trace_block_rq_issue(q, rq);
+			tracepoint_call(block_rq_issue, q, rq);
 		}
 
 		if (!q->boundary_rq || q->boundary_rq == rq) {
@@ -938,7 +938,7 @@ void elv_abort_queue(struct request_queue *q)
 	while (!list_empty(&q->queue_head)) {
 		rq = list_entry_rq(q->queue_head.next);
 		rq->cmd_flags |= REQ_QUIET;
-		trace_block_rq_abort(q, rq);
+		tracepoint_call(block_rq_abort, q, rq);
 		__blk_end_request(rq, -EIO, blk_rq_bytes(rq));
 	}
 }
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e2ee4a7..106a73d 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -561,7 +561,8 @@ static void dec_pending(struct dm_io *io, int error)
 			end_io_acct(io);
 
 			if (io_error != DM_ENDIO_REQUEUE) {
-				trace_block_bio_complete(md->queue, bio);
+				tracepoint_call(block_bio_complete,
+							md->queue, bio);
 
 				bio_endio(bio, io_error);
 			}
@@ -655,8 +656,8 @@ static void __map_bio(struct dm_target *ti, struct bio *clone,
 	if (r == DM_MAPIO_REMAPPED) {
 		/* the bio has been remapped so dispatch it */
 
-		trace_block_remap(bdev_get_queue(clone->bi_bdev), clone,
-				    tio->io->bio->bi_bdev->bd_dev, sector);
+		tracepoint_call(block_remap, bdev_get_queue(clone->bi_bdev),
+				clone, tio->io->bio->bi_bdev->bd_dev, sector);
 
 		generic_make_request(clone);
 	} else if (r < 0 || r == DM_MAPIO_REQUEUE) {
diff --git a/fs/bio.c b/fs/bio.c
index 9871164..5cda18a 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1440,7 +1440,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
 	if (!bp)
 		return bp;
 
-	trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
+	tracepoint_call(block_split, bdev_get_queue(bi->bi_bdev), bi,
 				bi->bi_sector + first_sectors);
 
 	BUG_ON(bi->bi_vcnt != 1);
diff --git a/mm/bounce.c b/mm/bounce.c
index e590272..c160109 100644
--- a/mm/bounce.c
+++ b/mm/bounce.c
@@ -230,7 +230,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	if (!bio)
 		return;
 
-	trace_block_bio_bounce(q, *bio_orig);
+	tracepoint_call(block_bio_bounce, q, *bio_orig);
 
 	/*
 	 * at least one page was bounced, fill in possible non-highmem
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
                   ` (2 preceding siblings ...)
  2009-05-19 21:03 ` [PATCH 3/3] tracepoints: convert block " Jason Baron
@ 2009-05-19 21:17 ` Mathieu Desnoyers
  2009-05-19 22:16   ` Jason Baron
  2009-05-19 22:36   ` Steven Rostedt
  2009-05-20  7:12 ` Peter Zijlstra
  2009-05-20  7:33 ` Ingo Molnar
  5 siblings, 2 replies; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-05-19 21:17 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, fweisbec, mingo, laijs, rostedt, peterz, jiayingz,
	mbligh, roland, fche

* Jason Baron (jbaron@redhat.com) wrote:
> hi,
> 
> After disassembling some of the tracepoints, I've noticed that arguments that
> are passed as macros or that perform  dereferences, evaluate prior to the
> tracepoint on/off check. This means that we are needlessly impacting the 
> off case.
> 
> I am proposing to fix this by adding a macro that first checks for on/off and
> then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> move from:
> 
> trace_block_bio_complete(md->queue, bio);
> 
> to:
> 
> tracepoint_call(block_bio_complete,  md->queue, bio);
> 

I knew this limitation in the first place, but decided it was not worth
uglifying the tracepoint call site for it.

The expected use is to pass a pointer or a value as tracepoint argument
and dereference it in the callback attached to it.

Is there any _real_ added value for going through this API change pain ?

Mathieu

> I've tried '__always_inline', but that did not fix this issue. Obviously this
> change will require changes to all the callsites. But, that shouldn't be
> very hard, I've already included the scheduler and block changes with this
> patch. I think its important to minimize code execution in the off case, and
> thus going through all the callsites is well worth it. If we agree on this
> change, I can change the rest in very short order.
> 
> Below I'm also showing the assembly in the 'dec_pending()' function before and
> after this change to show the difference it makes. The arguments to the
> tracepoint are as above, 'md->queue' and 'bio'. Notice the 2 extra instructions,
> before the initial 'je', that could be moved after the 'je'.
> 
> before:
> 
> ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
> ffffffff8137b2aa:       49 8b 45 50             mov    0x50(%r13),%rax
> ffffffff8137b2ae:       48 89 45 d0             mov    %rax,-0x30(%rbp)
> ffffffff8137b2b2:       74 1f                   je     ffffffff8137b2d3 <dec_pending+0x101>
> ffffffff8137b2b4:       48 8b 1d d5 90 4b 00    mov    0x4b90d5(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
> ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
> ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
> ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
> ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
> ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
> ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
> ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
> ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
> ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi
> 
> after:
> 
> ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
> ffffffff8137b2aa:       74 27                   je     ffffffff8137b2d3 <dec_pending+0x101>
> ffffffff8137b2ac:       49 8b 45 50             mov    0x50(%r13),%rax
> ffffffff8137b2b0:       48 8b 1d d9 90 4b 00    mov    0x4b90d9(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
> ffffffff8137b2b7:       48 89 45 d0             mov    %rax,-0x30(%rbp)
> ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
> ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
> ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
> ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
> ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
> ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
> ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
> ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
> ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi
> 
> 
> thanks,
> 
> -Jason
> 
> 
> 
> Jason Baron (3):
>   -add wrapper so we don't have argument resolution overhead
>   -add scheduler wrapper calls
>   -add block layer trace wrappers
> 
>  block/blk-core.c           |   27 ++++++++++++++-------------
>  block/elevator.c           |    6 +++---
>  drivers/md/dm.c            |    7 ++++---
>  fs/bio.c                   |    2 +-
>  include/linux/tracepoint.h |   20 +++++++++++++++++++-
>  kernel/exit.c              |    6 +++---
>  kernel/fork.c              |    2 +-
>  kernel/kthread.c           |    4 ++--
>  kernel/sched.c             |   10 +++++-----
>  kernel/signal.c            |    2 +-
>  mm/bounce.c                |    2 +-
>  11 files changed, 54 insertions(+), 34 deletions(-)
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 21:17 ` [PATCH 0/3] tracepoints: delay argument evaluation Mathieu Desnoyers
@ 2009-05-19 22:16   ` Jason Baron
  2009-05-19 22:25     ` Roland McGrath
  2009-05-19 22:36   ` Steven Rostedt
  1 sibling, 1 reply; 22+ messages in thread
From: Jason Baron @ 2009-05-19 22:16 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, fweisbec, mingo, laijs, rostedt, peterz, jiayingz,
	mbligh, roland, fche

On Tue, May 19, 2009 at 05:17:04PM -0400, Mathieu Desnoyers wrote:
> * Jason Baron (jbaron@redhat.com) wrote:
> > hi,
> > 
> > After disassembling some of the tracepoints, I've noticed that arguments that
> > are passed as macros or that perform  dereferences, evaluate prior to the
> > tracepoint on/off check. This means that we are needlessly impacting the 
> > off case.
> > 
> > I am proposing to fix this by adding a macro that first checks for on/off and
> > then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> > move from:
> > 
> > trace_block_bio_complete(md->queue, bio);
> > 
> > to:
> > 
> > tracepoint_call(block_bio_complete,  md->queue, bio);
> > 
> 
> I knew this limitation in the first place, but decided it was not worth
> uglifying the tracepoint call site for it.

hmmm...i don't think its terribly ugly.

> 
> The expected use is to pass a pointer or a value as tracepoint argument
> and dereference it in the callback attached to it.
> 

there are a lot of tracepoints that pass more than just a pointer...do
we want to be slowing up i/o paths, when tracepoints are disabled?


> Is there any _real_ added value for going through this API change pain ?
>

the value is in being able to define the tracepoint api more easily to
one's liking and not have additional instructions in the off case, where
they can easily be avoided.

thanks,

-Jason




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 22:16   ` Jason Baron
@ 2009-05-19 22:25     ` Roland McGrath
  2009-05-19 22:31       ` Steven Rostedt
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-19 22:25 UTC (permalink / raw)
  To: Jason Baron
  Cc: Mathieu Desnoyers, linux-kernel, fweisbec, mingo, laijs, rostedt,
	peterz, jiayingz, mbligh, fche

Wasn't the plan to switch everything to the new TRACE_EVENT macro anyway?
That will be an API change to convert existing tracepoints.  Then the
various details of the actual call site code are easy to change inside that
new macro.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 22:25     ` Roland McGrath
@ 2009-05-19 22:31       ` Steven Rostedt
  0 siblings, 0 replies; 22+ messages in thread
From: Steven Rostedt @ 2009-05-19 22:31 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Jason Baron, Mathieu Desnoyers, linux-kernel, fweisbec, mingo,
	laijs, peterz, jiayingz, mbligh, fche


On Tue, 19 May 2009, Roland McGrath wrote:

> Wasn't the plan to switch everything to the new TRACE_EVENT macro anyway?
> That will be an API change to convert existing tracepoints.  Then the
> various details of the actual call site code are easy to change inside that
> new macro.

Unfortunately that is not the case. The call sites are the issues. They 
are same for both DEFINE_TRACE and TRACE_EVENT.

-- Steve


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 21:17 ` [PATCH 0/3] tracepoints: delay argument evaluation Mathieu Desnoyers
  2009-05-19 22:16   ` Jason Baron
@ 2009-05-19 22:36   ` Steven Rostedt
  2009-05-19 23:52     ` Frederic Weisbecker
  2009-05-20  0:33     ` Mathieu Desnoyers
  1 sibling, 2 replies; 22+ messages in thread
From: Steven Rostedt @ 2009-05-19 22:36 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jason Baron, LKML, Frederic Weisbecker, Ingo Molnar,
	Lai Jiangshan, Peter Zijlstra, jiayingz, mbligh, roland,
	Frank Ch. Eigler, Christoph Hellwig


[ added Christoph ]

On Tue, 19 May 2009, Mathieu Desnoyers wrote:

> * Jason Baron (jbaron@redhat.com) wrote:
> > hi,
> > 
> > After disassembling some of the tracepoints, I've noticed that arguments that
> > are passed as macros or that perform  dereferences, evaluate prior to the
> > tracepoint on/off check. This means that we are needlessly impacting the 
> > off case.
> > 
> > I am proposing to fix this by adding a macro that first checks for on/off and
> > then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> > move from:
> > 
> > trace_block_bio_complete(md->queue, bio);
> > 
> > to:
> > 
> > tracepoint_call(block_bio_complete,  md->queue, bio);
> > 
> 
> I knew this limitation in the first place, but decided it was not worth
> uglifying the tracepoint call site for it.
> 
> The expected use is to pass a pointer or a value as tracepoint argument
> and dereference it in the callback attached to it.
> 
> Is there any _real_ added value for going through this API change pain ?
> 

I agree with Mathieu that I don't think we want to "uglify" the callers. 
But I also agree with Jason that we must not add any overhead to the "off" 
state when we can avoid it.

If it comes down to the two, I would lean towards the "uglify" if it shows 
performance benefits in the "off" case.

Perhaps I'll try to see if I can fool CPP to getting both worlds. But this 
will be tricky :-/

When are we going to get our own C pre-processor?

-- Steve


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 22:36   ` Steven Rostedt
@ 2009-05-19 23:52     ` Frederic Weisbecker
  2009-05-20  0:33     ` Mathieu Desnoyers
  1 sibling, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2009-05-19 23:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, Jason Baron, LKML, Ingo Molnar, Lai Jiangshan,
	Peter Zijlstra, jiayingz, mbligh, roland, Frank Ch. Eigler,
	Christoph Hellwig

On Tue, May 19, 2009 at 06:36:56PM -0400, Steven Rostedt wrote:
> 
> [ added Christoph ]
> 
> On Tue, 19 May 2009, Mathieu Desnoyers wrote:
> 
> > * Jason Baron (jbaron@redhat.com) wrote:
> > > hi,
> > > 
> > > After disassembling some of the tracepoints, I've noticed that arguments that
> > > are passed as macros or that perform  dereferences, evaluate prior to the
> > > tracepoint on/off check. This means that we are needlessly impacting the 
> > > off case.
> > > 
> > > I am proposing to fix this by adding a macro that first checks for on/off and
> > > then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> > > move from:
> > > 
> > > trace_block_bio_complete(md->queue, bio);
> > > 
> > > to:
> > > 
> > > tracepoint_call(block_bio_complete,  md->queue, bio);
> > > 
> > 
> > I knew this limitation in the first place, but decided it was not worth
> > uglifying the tracepoint call site for it.
> > 
> > The expected use is to pass a pointer or a value as tracepoint argument
> > and dereference it in the callback attached to it.
> > 
> > Is there any _real_ added value for going through this API change pain ?
> > 
> 
> I agree with Mathieu that I don't think we want to "uglify" the callers. 
> But I also agree with Jason that we must not add any overhead to the "off" 
> state when we can avoid it.
> 
> If it comes down to the two, I would lean towards the "uglify" if it shows 
> performance benefits in the "off" case.



Yeah, I agree with you, if we have no choice, the most important goal
is to drop any overhead in tracing off-case.


 
> Perhaps I'll try to see if I can fool CPP to getting both worlds. But this 
> will be tricky :-/
> 
> When are we going to get our own C pre-processor?


It starts to be really required....


> -- Steve
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 22:36   ` Steven Rostedt
  2009-05-19 23:52     ` Frederic Weisbecker
@ 2009-05-20  0:33     ` Mathieu Desnoyers
  2009-05-20  0:42       ` Steven Rostedt
  2009-05-20  7:01       ` Roland McGrath
  1 sibling, 2 replies; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-05-20  0:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jason Baron, LKML, Frederic Weisbecker, Ingo Molnar,
	Lai Jiangshan, Peter Zijlstra, jiayingz, mbligh, roland,
	Frank Ch. Eigler, Christoph Hellwig

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> [ added Christoph ]
> 
> On Tue, 19 May 2009, Mathieu Desnoyers wrote:
> 
> > * Jason Baron (jbaron@redhat.com) wrote:
> > > hi,
> > > 
> > > After disassembling some of the tracepoints, I've noticed that arguments that
> > > are passed as macros or that perform  dereferences, evaluate prior to the
> > > tracepoint on/off check. This means that we are needlessly impacting the 
> > > off case.
> > > 
> > > I am proposing to fix this by adding a macro that first checks for on/off and
> > > then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> > > move from:
> > > 
> > > trace_block_bio_complete(md->queue, bio);
> > > 
> > > to:
> > > 
> > > tracepoint_call(block_bio_complete,  md->queue, bio);
> > > 
> > 
> > I knew this limitation in the first place, but decided it was not worth
> > uglifying the tracepoint call site for it.
> > 
> > The expected use is to pass a pointer or a value as tracepoint argument
> > and dereference it in the callback attached to it.
> > 
> > Is there any _real_ added value for going through this API change pain ?
> > 
> 
> I agree with Mathieu that I don't think we want to "uglify" the callers. 
> But I also agree with Jason that we must not add any overhead to the "off" 
> state when we can avoid it.
> 
> If it comes down to the two, I would lean towards the "uglify" if it shows 
> performance benefits in the "off" case.
> 

Given the tradeoff is taste vs overhead, what do you think of the
following proposal ?

trace(block_bio_complete, md->queue, bio);

?

I'm thinking that it may just be the "tracepoint_call" name that's a bit
too verbose for its own good.

Mathieu

> Perhaps I'll try to see if I can fool CPP to getting both worlds. But this 
> will be tricky :-/
> 
> When are we going to get our own C pre-processor?
> 
> -- Steve
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-20  0:33     ` Mathieu Desnoyers
@ 2009-05-20  0:42       ` Steven Rostedt
  2009-05-20  7:01       ` Roland McGrath
  1 sibling, 0 replies; 22+ messages in thread
From: Steven Rostedt @ 2009-05-20  0:42 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jason Baron, LKML, Frederic Weisbecker, Ingo Molnar,
	Lai Jiangshan, Peter Zijlstra, jiayingz, mbligh, roland,
	Frank Ch. Eigler, Christoph Hellwig


On Tue, 19 May 2009, Mathieu Desnoyers wrote:
> 
> Given the tradeoff is taste vs overhead, what do you think of the
> following proposal ?
> 
> trace(block_bio_complete, md->queue, bio);

I'm fine with this. It is extremely important that the trace point 
overhead is negligible when disabled. Otherwise, they serve no purpose.

-- Steve

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-20  0:33     ` Mathieu Desnoyers
  2009-05-20  0:42       ` Steven Rostedt
@ 2009-05-20  7:01       ` Roland McGrath
  1 sibling, 0 replies; 22+ messages in thread
From: Roland McGrath @ 2009-05-20  7:01 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Jason Baron, LKML, Frederic Weisbecker,
	Ingo Molnar, Lai Jiangshan, Peter Zijlstra, jiayingz, mbligh,
	Frank Ch. Eigler, Christoph Hellwig

I concur about a shorter/prettier macro name.  Perhaps TRACE is better,
since I'm fairly sure 'trace' will have been used many times as a local
variable name and the like.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
                   ` (3 preceding siblings ...)
  2009-05-19 21:17 ` [PATCH 0/3] tracepoints: delay argument evaluation Mathieu Desnoyers
@ 2009-05-20  7:12 ` Peter Zijlstra
       [not found]   ` <20090520072750.DA9A0FC38D@magilla.sf.frob.com>
  2009-05-20  7:33 ` Ingo Molnar
  5 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2009-05-20  7:12 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, fweisbec, mingo, laijs, rostedt, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche

On Tue, 2009-05-19 at 17:03 -0400, Jason Baron wrote:
> hi,
> 
> After disassembling some of the tracepoints, I've noticed that arguments that
> are passed as macros or that perform  dereferences, evaluate prior to the
> tracepoint on/off check. This means that we are needlessly impacting the 
> off case.
> 
> I am proposing to fix this by adding a macro that first checks for on/off and
> then calls 'trace_##name', preserving type checking. Thus, callsites have to 
> move from:
> 
> trace_block_bio_complete(md->queue, bio);
> 
> to:
> 
> tracepoint_call(block_bio_complete,  md->queue, bio);
> 
> I've tried '__always_inline', but that did not fix this issue. Obviously this
> change will require changes to all the callsites. But, that shouldn't be
> very hard, I've already included the scheduler and block changes with this
> patch. I think its important to minimize code execution in the off case, and
> thus going through all the callsites is well worth it. If we agree on this
> change, I can change the rest in very short order.
> 
> Below I'm also showing the assembly in the 'dec_pending()' function before and
> after this change to show the difference it makes. The arguments to the
> tracepoint are as above, 'md->queue' and 'bio'. Notice the 2 extra instructions,
> before the initial 'je', that could be moved after the 'je'.

I really really hate this.

Why can't 

trace_block_bio_complete(arg..)
   if (__trace_block_bio_compete.state)
     __trace_block_bio_complete(arg)

work?

Surely its possible to wrap the whole stuff in yet another macro layer
that will do the conditional before evaluating the arguments?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
                   ` (4 preceding siblings ...)
  2009-05-20  7:12 ` Peter Zijlstra
@ 2009-05-20  7:33 ` Ingo Molnar
  2009-05-20 15:42   ` Jason Baron
  5 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2009-05-20  7:33 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, fweisbec, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche


* Jason Baron <jbaron@redhat.com> wrote:

> hi,
> 
> After disassembling some of the tracepoints, I've noticed that 
> arguments that are passed as macros or that perform dereferences, 
> evaluate prior to the tracepoint on/off check. This means that we 
> are needlessly impacting the off case.
> 
> I am proposing to fix this by adding a macro that first checks for 
> on/off and then calls 'trace_##name', preserving type checking. 
> Thus, callsites have to move from:
> 
> trace_block_bio_complete(md->queue, bio);
> 
> to:
> 
> tracepoint_call(block_bio_complete,  md->queue, bio);
>
> I've tried '__always_inline', but that did not fix this issue. 
> Obviously this change will require changes to all the callsites. 
> But, that shouldn't be very hard, I've already included the 
> scheduler and block changes with this patch. I think its important 
> to minimize code execution in the off case, and thus going through 
> all the callsites is well worth it. If we agree on this change, I 
> can change the rest in very short order.
> 
> Below I'm also showing the assembly in the 'dec_pending()' 
> function before and after this change to show the difference it 
> makes. The arguments to the tracepoint are as above, 'md->queue' 
> and 'bio'. Notice the 2 extra instructions, before the initial 
> 'je', that could be moved after the 'je'.

> 
> before:
> 
> ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
> ffffffff8137b2aa:       49 8b 45 50             mov    0x50(%r13),%rax
> ffffffff8137b2ae:       48 89 45 d0             mov    %rax,-0x30(%rbp)
> ffffffff8137b2b2:       74 1f                   je     ffffffff8137b2d3 <dec_pending+0x101>

> after:
> 
> ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
> ffffffff8137b2aa:       74 27                   je     ffffffff8137b2d3 <dec_pending+0x101>

hm, this is really a compiler bug in essence - the compiler should 
delay the construction of arguments into unlikely branches - if the 
arguments are only used there.

We'd basically open-code a clear-cut:

   trace_block_bio_complete(md->queue, bio);
 
into this form:

   trace(block_bio_complete, md->queue, bio);

.. and this latter form could become moot (and a nuisance) if the 
compiler is fixed.

Have you tried very latest GCC, does it still have this optimization 
problem?

Note that the compiler getting this right would help a _lot_ of 
other inline functions in the kernel as well. Arguments only used 
within unlikely() branches are quite common.

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
       [not found]   ` <20090520072750.DA9A0FC38D@magilla.sf.frob.com>
@ 2009-05-20  7:38     ` Peter Zijlstra
  2009-05-20  9:18       ` Roland McGrath
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2009-05-20  7:38 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Jason Baron, linux-kernel, fweisbec, mingo, laijs, rostedt,
	mathieu.desnoyers, jiayingz, mbligh, fche

On Wed, 2009-05-20 at 00:27 -0700, Roland McGrath wrote:
> > Why can't 
> > 
> > trace_block_bio_complete(arg..)
> >    if (__trace_block_bio_compete.state)
> >      __trace_block_bio_complete(arg)
> > 
> > work?
> > 
> > Surely its possible to wrap the whole stuff in yet another macro layer
> > that will do the conditional before evaluating the arguments?
> 
> That is exactly what Jason has implemented.  
> He called the new macro layer tracepoint_call.
> 
> You can't use a macro to generate a new macro definition, so it's not
> possible with cpp alone to have the new macro layer be one that defines a
> separate trace_foobar() macro for each defined tracepoint.

Right, just realized that.

But I think we should be asking GCC to get fixed. Since inline functions
don't act like a compiler barrier anyway, they might as well not force
argument evaluation either.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-20  7:38     ` Peter Zijlstra
@ 2009-05-20  9:18       ` Roland McGrath
  0 siblings, 0 replies; 22+ messages in thread
From: Roland McGrath @ 2009-05-20  9:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jason Baron, linux-kernel, fweisbec, mingo, laijs, rostedt,
	mathieu.desnoyers, jiayingz, mbligh, fche

> But I think we should be asking GCC to get fixed. Since inline functions
> don't act like a compiler barrier anyway, they might as well not force
> argument evaluation either.

I quite agree.  I'm all for better compilation.  However, usual practice
in the kernel has been to complain about introducing C code sequences
that add a few instructions when compiled by compilers over 5 years old,
let alone the very latest one.  I think it was the expectation of that
sort of sensitive reaction to the hot-path impact of unused tracepoints
that motivated Jason's attempt at microoptimization.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-20  7:33 ` Ingo Molnar
@ 2009-05-20 15:42   ` Jason Baron
  2009-05-21  1:49     ` Jiaying Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Baron @ 2009-05-20 15:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, fweisbec, laijs, rostedt, peterz, mathieu.desnoyers,
	jiayingz, mbligh, roland, fche

On Wed, May 20, 2009 at 09:33:48AM +0200, Ingo Molnar wrote:
> hm, this is really a compiler bug in essence - the compiler should 
> delay the construction of arguments into unlikely branches - if the 
> arguments are only used there.
> 
> We'd basically open-code a clear-cut:
> 
>    trace_block_bio_complete(md->queue, bio);
>  
> into this form:
> 
>    trace(block_bio_complete, md->queue, bio);
> 
> .. and this latter form could become moot (and a nuisance) if the 
> compiler is fixed.
> 
> Have you tried very latest GCC, does it still have this optimization 
> problem?
> 
> Note that the compiler getting this right would help a _lot_ of 
> other inline functions in the kernel as well. Arguments only used 
> within unlikely() branches are quite common.
> 
> 	Ingo

hi,

I e-mailed the gcc list, where they suggested using a macro, as I've
done. They also suggested filing an enhancement request for this, which
I've done: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207 It seems
like they agree with the suggestion. 

It still might make sense to make this requirement explicit (by adding
the extra macro), as the tracepoint off case should really be as optimized as
possible.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-20 15:42   ` Jason Baron
@ 2009-05-21  1:49     ` Jiaying Zhang
  2009-05-21  1:59       ` Li Zefan
  0 siblings, 1 reply; 22+ messages in thread
From: Jiaying Zhang @ 2009-05-21  1:49 UTC (permalink / raw)
  To: Jason Baron
  Cc: Ingo Molnar, linux-kernel, fweisbec, laijs, rostedt, peterz,
	mathieu.desnoyers, mbligh, roland, fche

Is it possible to convert blktrace to use event tracer? Then in this case we
can pass 'md' as the parameter to trace_block_bio_complete and dereference
md->queue during assignment.

Jiaying

On Wed, May 20, 2009 at 8:42 AM, Jason Baron <jbaron@redhat.com> wrote:
> On Wed, May 20, 2009 at 09:33:48AM +0200, Ingo Molnar wrote:
>> hm, this is really a compiler bug in essence - the compiler should
>> delay the construction of arguments into unlikely branches - if the
>> arguments are only used there.
>>
>> We'd basically open-code a clear-cut:
>>
>>    trace_block_bio_complete(md->queue, bio);
>>
>> into this form:
>>
>>    trace(block_bio_complete, md->queue, bio);
>>
>> .. and this latter form could become moot (and a nuisance) if the
>> compiler is fixed.
>>
>> Have you tried very latest GCC, does it still have this optimization
>> problem?
>>
>> Note that the compiler getting this right would help a _lot_ of
>> other inline functions in the kernel as well. Arguments only used
>> within unlikely() branches are quite common.
>>
>>       Ingo
>
> hi,
>
> I e-mailed the gcc list, where they suggested using a macro, as I've
> done. They also suggested filing an enhancement request for this, which
> I've done: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207 It seems
> like they agree with the suggestion.
>
> It still might make sense to make this requirement explicit (by adding
> the extra macro), as the tracepoint off case should really be as optimized as
> possible.
>
> thanks,
>
> -Jason
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-21  1:49     ` Jiaying Zhang
@ 2009-05-21  1:59       ` Li Zefan
  2009-05-21  2:15         ` Jiaying Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Li Zefan @ 2009-05-21  1:59 UTC (permalink / raw)
  To: Jiaying Zhang
  Cc: Jason Baron, Ingo Molnar, linux-kernel, fweisbec, laijs, rostedt,
	peterz, mathieu.desnoyers, mbligh, roland, fche

Jiaying Zhang wrote:
> Is it possible to convert blktrace to use event tracer? Then in this case we

Yes, I'm doing this, see:
	http://marc.info/?l=linux-kernel&m=124228198011297&w=2

> can pass 'md' as the parameter to trace_block_bio_complete and dereference
> md->queue during assignment.
> 

But the problem discussed here exists whether you use plain tracepoints
or TRACE_EVENT.

Though we can add a new tracepoint named trace_md_bio_complete, this is
not the way to solve it.

> Jiaying
> 
> On Wed, May 20, 2009 at 8:42 AM, Jason Baron <jbaron@redhat.com> wrote:
>> On Wed, May 20, 2009 at 09:33:48AM +0200, Ingo Molnar wrote:
>>> hm, this is really a compiler bug in essence - the compiler should
>>> delay the construction of arguments into unlikely branches - if the
>>> arguments are only used there.
>>>
>>> We'd basically open-code a clear-cut:
>>>
>>>    trace_block_bio_complete(md->queue, bio);
>>>
>>> into this form:
>>>
>>>    trace(block_bio_complete, md->queue, bio);
>>>
>>> .. and this latter form could become moot (and a nuisance) if the
>>> compiler is fixed.
>>>
>>> Have you tried very latest GCC, does it still have this optimization
>>> problem?
>>>
>>> Note that the compiler getting this right would help a _lot_ of
>>> other inline functions in the kernel as well. Arguments only used
>>> within unlikely() branches are quite common.
>>>
>>>       Ingo
>> hi,
>>
>> I e-mailed the gcc list, where they suggested using a macro, as I've
>> done. They also suggested filing an enhancement request for this, which
>> I've done: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207 It seems
>> like they agree with the suggestion.
>>
>> It still might make sense to make this requirement explicit (by adding
>> the extra macro), as the tracepoint off case should really be as optimized as
>> possible.
>>
>> thanks,
>>
>> -Jason
>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-21  1:59       ` Li Zefan
@ 2009-05-21  2:15         ` Jiaying Zhang
  2009-05-21  2:41           ` Li Zefan
  0 siblings, 1 reply; 22+ messages in thread
From: Jiaying Zhang @ 2009-05-21  2:15 UTC (permalink / raw)
  To: Li Zefan
  Cc: Jason Baron, Ingo Molnar, linux-kernel, fweisbec, laijs, rostedt,
	peterz, mathieu.desnoyers, mbligh, roland, fche

But if we convert blktrace to use event tracer interface, we can have:

trace_block_bio_complete(md, bio);

TRACE_EVENT(block_bio_complete,
        TP_PROTO(struct mapped_device *md, struct bio *bio),
...
        TP_fast_assign(
                __entry->queue            = md->queue;
...
        ),
);

Jiaying

On Wed, May 20, 2009 at 6:59 PM, Li Zefan <lizf@cn.fujitsu.com> wrote:
> Jiaying Zhang wrote:
>> Is it possible to convert blktrace to use event tracer? Then in this case we
>
> Yes, I'm doing this, see:
>        http://marc.info/?l=linux-kernel&m=124228198011297&w=2
>
>> can pass 'md' as the parameter to trace_block_bio_complete and dereference
>> md->queue during assignment.
>>
>
> But the problem discussed here exists whether you use plain tracepoints
> or TRACE_EVENT.
>
> Though we can add a new tracepoint named trace_md_bio_complete, this is
> not the way to solve it.
>
>> Jiaying
>>
>> On Wed, May 20, 2009 at 8:42 AM, Jason Baron <jbaron@redhat.com> wrote:
>>> On Wed, May 20, 2009 at 09:33:48AM +0200, Ingo Molnar wrote:
>>>> hm, this is really a compiler bug in essence - the compiler should
>>>> delay the construction of arguments into unlikely branches - if the
>>>> arguments are only used there.
>>>>
>>>> We'd basically open-code a clear-cut:
>>>>
>>>>    trace_block_bio_complete(md->queue, bio);
>>>>
>>>> into this form:
>>>>
>>>>    trace(block_bio_complete, md->queue, bio);
>>>>
>>>> .. and this latter form could become moot (and a nuisance) if the
>>>> compiler is fixed.
>>>>
>>>> Have you tried very latest GCC, does it still have this optimization
>>>> problem?
>>>>
>>>> Note that the compiler getting this right would help a _lot_ of
>>>> other inline functions in the kernel as well. Arguments only used
>>>> within unlikely() branches are quite common.
>>>>
>>>>       Ingo
>>> hi,
>>>
>>> I e-mailed the gcc list, where they suggested using a macro, as I've
>>> done. They also suggested filing an enhancement request for this, which
>>> I've done: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207 It seems
>>> like they agree with the suggestion.
>>>
>>> It still might make sense to make this requirement explicit (by adding
>>> the extra macro), as the tracepoint off case should really be as optimized as
>>> possible.
>>>
>>> thanks,
>>>
>>> -Jason
>>>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] tracepoints: delay argument evaluation
  2009-05-21  2:15         ` Jiaying Zhang
@ 2009-05-21  2:41           ` Li Zefan
  0 siblings, 0 replies; 22+ messages in thread
From: Li Zefan @ 2009-05-21  2:41 UTC (permalink / raw)
  To: Jiaying Zhang
  Cc: Jason Baron, Ingo Molnar, linux-kernel, fweisbec, laijs, rostedt,
	peterz, mathieu.desnoyers, mbligh, roland, fche

Jiaying Zhang wrote:
> But if we convert blktrace to use event tracer interface, we can have:
> 

Actually we are converting block tracepoints to TRACE_EVENT, and we are
not going to remove relay+ioctl-based blktrace or ftrace-plugin blktrace.

> trace_block_bio_complete(md, bio);
> 
> TRACE_EVENT(block_bio_complete,
>         TP_PROTO(struct mapped_device *md, struct bio *bio),
> ...
>         TP_fast_assign(
>                 __entry->queue            = md->queue;
> ...
>         ),
> );
> 

I'm not sure if this piece of code can pass compile when !CONFIG_MD
if we put it in include/trace/events/block.h with other block tracepoints.

And what if we have another trace_block_bio_complete(rq, bio) in blk-core.c?

Even it works, still this is not the key to this issue.

We have some other ptr-deref in block tracepoints and other places:

  static inline void blk_partition_remap(struct bio *bio)
  {
	...
	trace_block_remap(bdev_get_queue(bio->bi_bdev), bio,
			    bdev->bd_dev,
			    bio->bi_sector - p->start_sect);
	...
  }

  static int __end_that_request_first(struct request *req, int error,
				    int nr_bytes)
  {
	...
	trace_block_rq_complete(req->q, req);
	...
  }

> Jiaying
> 
> On Wed, May 20, 2009 at 6:59 PM, Li Zefan <lizf@cn.fujitsu.com> wrote:
>> Jiaying Zhang wrote:
>>> Is it possible to convert blktrace to use event tracer? Then in this case we
>> Yes, I'm doing this, see:
>>        http://marc.info/?l=linux-kernel&m=124228198011297&w=2
>>
>>> can pass 'md' as the parameter to trace_block_bio_complete and dereference
>>> md->queue during assignment.
>>>
>> But the problem discussed here exists whether you use plain tracepoints
>> or TRACE_EVENT.
>>
>> Though we can add a new tracepoint named trace_md_bio_complete, this is
>> not the way to solve it.
>>
>>> Jiaying
>>>
>>> On Wed, May 20, 2009 at 8:42 AM, Jason Baron <jbaron@redhat.com> wrote:
>>>> On Wed, May 20, 2009 at 09:33:48AM +0200, Ingo Molnar wrote:
>>>>> hm, this is really a compiler bug in essence - the compiler should
>>>>> delay the construction of arguments into unlikely branches - if the
>>>>> arguments are only used there.
>>>>>
>>>>> We'd basically open-code a clear-cut:
>>>>>
>>>>>    trace_block_bio_complete(md->queue, bio);
>>>>>
>>>>> into this form:
>>>>>
>>>>>    trace(block_bio_complete, md->queue, bio);
>>>>>
>>>>> .. and this latter form could become moot (and a nuisance) if the
>>>>> compiler is fixed.
>>>>>
>>>>> Have you tried very latest GCC, does it still have this optimization
>>>>> problem?
>>>>>
>>>>> Note that the compiler getting this right would help a _lot_ of
>>>>> other inline functions in the kernel as well. Arguments only used
>>>>> within unlikely() branches are quite common.
>>>>>
>>>>>       Ingo
>>>> hi,
>>>>
>>>> I e-mailed the gcc list, where they suggested using a macro, as I've
>>>> done. They also suggested filing an enhancement request for this, which
>>>> I've done: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207 It seems
>>>> like they agree with the suggestion.
>>>>
>>>> It still might make sense to make this requirement explicit (by adding
>>>> the extra macro), as the tracepoint off case should really be as optimized as
>>>> possible.
>>>>
>>>> thanks,
>>>>
>>>> -Jason
>>>>
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-05-21  2:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-19 21:03 [PATCH 0/3] tracepoints: delay argument evaluation Jason Baron
2009-05-19 21:03 ` [PATCH 1/3] tracepoints: add tracepoint_call() to optimize tracepoints disabled Jason Baron
2009-05-19 21:03 ` [PATCH 2/3] tracepoints: convert scheduler tracepoints to 'tracepoint_call' api Jason Baron
2009-05-19 21:03 ` [PATCH 3/3] tracepoints: convert block " Jason Baron
2009-05-19 21:17 ` [PATCH 0/3] tracepoints: delay argument evaluation Mathieu Desnoyers
2009-05-19 22:16   ` Jason Baron
2009-05-19 22:25     ` Roland McGrath
2009-05-19 22:31       ` Steven Rostedt
2009-05-19 22:36   ` Steven Rostedt
2009-05-19 23:52     ` Frederic Weisbecker
2009-05-20  0:33     ` Mathieu Desnoyers
2009-05-20  0:42       ` Steven Rostedt
2009-05-20  7:01       ` Roland McGrath
2009-05-20  7:12 ` Peter Zijlstra
     [not found]   ` <20090520072750.DA9A0FC38D@magilla.sf.frob.com>
2009-05-20  7:38     ` Peter Zijlstra
2009-05-20  9:18       ` Roland McGrath
2009-05-20  7:33 ` Ingo Molnar
2009-05-20 15:42   ` Jason Baron
2009-05-21  1:49     ` Jiaying Zhang
2009-05-21  1:59       ` Li Zefan
2009-05-21  2:15         ` Jiaying Zhang
2009-05-21  2:41           ` Li Zefan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).