public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	Masami Hiramatsu <mhiramat@redhat.com>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Hideo AOKI <haoki@redhat.com>,
	Takashi Nishiie <t-nishiie@np.css.fujitsu.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>,
	Paul E McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: [patch 01/15] Kernel Tracepoints
Date: Tue, 15 Jul 2008 11:22:24 -0400	[thread overview]
Message-ID: <20080715152224.GE20037@Krystal> (raw)
In-Reply-To: <1216132928.12595.201.camel@twins>

* Peter Zijlstra (peterz@infradead.org) wrote:
> On Tue, 2008-07-15 at 10:27 -0400, Mathieu Desnoyers wrote:
> > * Peter Zijlstra (peterz@infradead.org) wrote:
> > > On Tue, 2008-07-15 at 09:25 -0400, Mathieu Desnoyers wrote:
> > > > * Peter Zijlstra (peterz@infradead.org) wrote:
> > > > > On Wed, 2008-07-09 at 10:59 -0400, Mathieu Desnoyers wrote:
> > > 
> > > > > > +#define __DO_TRACE(tp, proto, args)					\
> > > > > > +	do {								\
> > > > > > +		int i;							\
> > > > > > +		void **funcs;						\
> > > > > > +		preempt_disable();					\
> > > > > > +		funcs = (tp)->funcs;					\
> > > > > > +		smp_read_barrier_depends();				\
> > > > > > +		if (funcs) {						\
> > > > > > +			for (i = 0; funcs[i]; i++) {			\
> > > > > 
> > > > > can't you get rid of 'i' and write:
> > > > > 
> > > > >   void **func;
> > > > > 
> > > > >   preempt_disable();
> > > > >   func = (tp)->funcs;
> > > > >   smp_read_barrier_depends();
> > > > >   for (; func; func++)
> > > > >     ((void (*)(proto))func)(args);
> > > > >   preempt_enable();
> > > > > 
> > > > 
> > > > Yes, I though there would be an optimization to do here, I'll use your
> > > > proposal. This code snippet is especially important since it will
> > > > generate instructions near every tracepoint side. Saving a few bytes
> > > > becomes important.
> > > > 
> > > > Given that (tp)->funcs references an array of function pointers and that
> > > > it can be NULL, the if (funcs) test must still be there and we must use
> > > > 
> > > > #define __DO_TRACE(tp, proto, args)					\
> > > > 	do {								\
> > > > 		void *func;						\
> > > > 									\
> > > > 		preempt_disable();					\
> > > > 		if ((tp)->funcs) {					\
> > > > 			func = rcu_dereference((tp)->funcs);		\
> > > > 			for (; func; func++) {				\
> > > > 				((void(*)(proto))(func))(args);		\
> > > > 			}						\
> > > > 		}							\
> > > > 		preempt_enable();					\
> > > > 	} while (0)
> > > > 
> > > > 
> > > > The resulting assembly is a bit more dense than my previous
> > > > implementation, which is good :
> > > 
> > > My version also has that if ((tp)->funcs), but its hidden in the 
> > > for (; func; func++) loop. The only thing your version does is an extra
> > > test of tp->funcs but without read depends barrier - not sure if that is
> > > ok.
> > > 
> > 
> > Hrm, you are right, the implementation I just proposed is bogus. (but so
> > was yours) ;)
> > 
> > func is an iterator on the funcs array. My typing of func is thus wrong,
> > it should be void **. Otherwise I'm just incrementing the function
> > address which is plain wrong.
> > 
> > The read barrier is included in rcu_dereference() now. But given that we
> > have to take a pointer to the array as an iterator, we would have to
> > rcu_dereference() our iterator multiple times and then have many read
> > barrier depends, which we don't need. This is why I would go back to a
> > smp_read_barrier_depends().
> > 
> > Also, I use a NULL entry at the end of the funcs array as an end of
> > array identifier. However, I cannot use this in the for loop both as a
> > check for NULL array and check for NULL array element. This is why a if
> > () test is needed in addition to the for loop test. (this is actually
> > what is wrong in the implementation you proposed : you treat func both
> > as a pointer to the function pointer array and as a function pointer)
> 
> Ah, D'0h! Indeed.
> 
> > Something like this seems better :
> > 
> > #define __DO_TRACE(tp, proto, args)                                     \
> >         do {                                                            \
> >                 void **it_func;                                         \
> >                                                                         \
> >                 preempt_disable();                                      \
> >                 it_func = (tp)->funcs;                                  \
> >                 if (it_func) {                                          \
> >                         smp_read_barrier_depends();                     \
> >                         for (; *it_func; it_func++)                     \
> >                                 ((void(*)(proto))(*it_func))(args);     \
> >                 }                                                       \
> >                 preempt_enable();                                       \
> >         } while (0)
> > 
> > What do you think ?
> 
> I'm confused by the barrier games here.
> 
> Why not:
> 
>   void **it_func;
> 
>   preempt_disable();
>   it_func = rcu_dereference((tp)->funcs);
>   if (it_func) {
>     for (; *it_func; it_func++)
>       ((void(*)(proto))(*it_func))(args);
>   }
>   preempt_enable();
> 
> That is, why can we skip the barrier when !it_func? is that because at
> that time we don't actually dereference it_func and therefore cannot
> observe stale data?
> 

Exactly. I used the implementation of rcu_assign_pointer as a hint that
we did not need barriers when setting the pointer to NULL, and thus we
should not need the read barrier when reading the NULL pointer, because
it references no data.

#define rcu_assign_pointer(p, v) \
        ({ \
                if (!__builtin_constant_p(v) || \
                    ((v) != NULL)) \
                        smp_wmb(); \
                (p) = (v); \
        })

#define rcu_dereference(p)     ({ \
                                typeof(p) _________p1 = ACCESS_ONCE(p); \
                                smp_read_barrier_depends(); \
                                (_________p1); \
                                })

But I think you are right, since we are already in unlikely code, using
rcu_dereference as you do is better than my use of read barrier depends.
It should not change anything in the assembly result except on alpha,
where the read_barrier_depends() is not a nop.

I wonder if there would be a way to add this kind of NULL pointer case
check without overhead in rcu_dereference() on alpha. I guess not, since
the pointer is almost never known at compile-time. And I guess Paul must
already have thought about it. The only case where we could add this
test is when we know that we have a if (ptr != NULL) test following the
rcu_dereference(); we could then assume the compiler will merge the two
branches since they depend on the same condition.

> If so, does this really matter since we're already in an unlikely
> section? Again, if so, this deserves a comment ;-)
> 
> [ still think those preempt_* calls should be called
>   rcu_read_sched_lock() or such. ]
> 
> Anyway, does this still generate better code?
> 

On x86_64 :

 820:   bf 01 00 00 00          mov    $0x1,%edi
 825:   e8 00 00 00 00          callq  82a <thread_return+0x136>
 82a:   48 8b 1d 00 00 00 00    mov    0x0(%rip),%rbx        # 831 <thread_return+0x13d>
 831:   48 85 db                test   %rbx,%rbx
 834:   75 21                   jne    857 <thread_return+0x163>
 836:   eb 27                   jmp    85f <thread_return+0x16b>
 838:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
 83f:   00 
 840:   48 8b 95 68 ff ff ff    mov    -0x98(%rbp),%rdx
 847:   48 8b b5 60 ff ff ff    mov    -0xa0(%rbp),%rsi
 84e:   4c 89 e7                mov    %r12,%rdi
 851:   48 83 c3 08             add    $0x8,%rbx
 855:   ff d0                   callq  *%rax
 857:   48 8b 03                mov    (%rbx),%rax
 85a:   48 85 c0                test   %rax,%rax
 85d:   75 e1                   jne    840 <thread_return+0x14c>
 85f:   bf 01 00 00 00          mov    $0x1,%edi
 864:

for 68 bytes.

My original implementation was 77 bytes, so yes, we have a win.

Mathieu


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2008-07-15 15:22 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-09 14:59 [patch 00/15] Tracepoints v3 for linux-next Mathieu Desnoyers
2008-07-09 14:59 ` [patch 01/15] Kernel Tracepoints Mathieu Desnoyers
2008-07-15  7:50   ` Peter Zijlstra
2008-07-15 13:25     ` Mathieu Desnoyers
2008-07-15 13:59       ` Peter Zijlstra
2008-07-15 14:27         ` Mathieu Desnoyers
2008-07-15 14:42           ` Peter Zijlstra
2008-07-15 15:22             ` Mathieu Desnoyers [this message]
2008-07-15 15:31               ` Peter Zijlstra
2008-07-15 15:50                 ` Mathieu Desnoyers
2008-08-01 21:10                   ` Paul E. McKenney
2008-07-15 16:08                 ` Mathieu Desnoyers
2008-07-15 16:25                   ` Peter Zijlstra
2008-07-15 16:51                     ` Mathieu Desnoyers
2008-08-01 21:10                       ` Paul E. McKenney
2008-08-02  0:03                         ` Peter Zijlstra
2008-08-02  0:17                           ` Paul E. McKenney
2008-08-01 21:10                     ` Paul E. McKenney
2008-07-15 16:26                   ` Mathieu Desnoyers
2008-08-01 21:10                   ` Paul E. McKenney
2008-07-15 17:50                 ` Mathieu Desnoyers
2008-07-15 14:03       ` Peter Zijlstra
2008-07-15 14:46         ` Mathieu Desnoyers
2008-07-15 15:13           ` Peter Zijlstra
2008-07-15 18:22             ` Mathieu Desnoyers
2008-07-15 18:33               ` Steven Rostedt
2008-07-15 18:52             ` Masami Hiramatsu
2008-07-15 19:08               ` Mathieu Desnoyers
2008-07-15 19:02         ` Mathieu Desnoyers
2008-07-15 19:52           ` Peter Zijlstra
2008-07-09 14:59 ` [patch 02/15] Tracepoints Documentation Mathieu Desnoyers
2008-07-09 14:59 ` [patch 03/15] Tracepoints Samples Mathieu Desnoyers
2008-07-09 14:59 ` [patch 04/15] LTTng instrumentation - irq Mathieu Desnoyers
2008-07-09 16:39   ` Masami Hiramatsu
2008-07-09 17:05     ` [patch 04/15] LTTng instrumentation - irq (update) Mathieu Desnoyers
2008-07-09 14:59 ` [patch 05/15] LTTng instrumentation - scheduler Mathieu Desnoyers
2008-07-09 15:34   ` [patch 05/15] LTTng instrumentation - scheduler (repost) Mathieu Desnoyers
2008-07-09 15:39     ` Ingo Molnar
2008-07-09 16:00       ` Mathieu Desnoyers
2008-07-09 16:21     ` [patch 05/15] LTTng instrumentation - scheduler (merge ftrace markers) Mathieu Desnoyers
2008-07-09 19:09       ` [PATCH] ftrace port to tracepoints (linux-next) Mathieu Desnoyers
2008-07-10  3:14         ` Takashi Nishiie
2008-07-10  3:57           ` [PATCH] ftrace port to tracepoints (linux-next) (nitpick update) Mathieu Desnoyers
     [not found]         ` <20080711143709.GB11500@Krystal>
     [not found]           ` <Pine.LNX.4.58.0807141112540.30484@gandalf.stny.rr.com>
     [not found]             ` <20080714153334.GA651@Krystal>
     [not found]               ` <Pine.LNX.4.58.0807141153250.29493@gandalf.stny.rr.com>
2008-07-14 16:25                 ` [PATCH] ftrace memory barriers Mathieu Desnoyers
2008-07-14 16:35                   ` Steven Rostedt
2008-07-09 14:59 ` [patch 06/15] LTTng instrumentation - timer Mathieu Desnoyers
2008-07-09 14:59 ` [patch 07/15] LTTng instrumentation - kernel Mathieu Desnoyers
2008-07-09 14:59 ` [patch 08/15] LTTng instrumentation - filemap Mathieu Desnoyers
2008-07-09 14:59 ` [patch 09/15] LTTng instrumentation - swap Mathieu Desnoyers
2008-07-09 14:59 ` [patch 10/15] LTTng instrumentation - memory page faults Mathieu Desnoyers
2008-07-09 14:59 ` [patch 11/15] LTTng instrumentation - page Mathieu Desnoyers
2008-07-09 14:59 ` [patch 12/15] LTTng instrumentation - hugetlb Mathieu Desnoyers
2008-07-11 14:30   ` [patch 12/15] LTTng instrumentation - hugetlb (update) Mathieu Desnoyers
2008-07-09 14:59 ` [patch 13/15] LTTng instrumentation - net Mathieu Desnoyers
2008-07-09 14:59 ` [patch 14/15] LTTng instrumentation - ipv4 Mathieu Desnoyers
2008-07-09 14:59 ` Mathieu Desnoyers
2008-07-09 17:01 ` [patch 00/15] Tracepoints v3 for linux-next Masami Hiramatsu
2008-07-09 17:11   ` [patch 15/15] LTTng instrumentation - ipv6 Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080715152224.GE20037@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=akpm@linux-foundation.org \
    --cc=eduard.munteanu@linux360.ro \
    --cc=fche@redhat.com \
    --cc=haoki@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=t-nishiie@np.css.fujitsu.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox