netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	fbl@redhat.com, nhorman@redhat.com, davem@redhat.com,
	htejun@gmail.com, jarkao2@gmail.com, oleg@redhat.com,
	davidel@xmailserver.org
Subject: Re: [PATCHv5 2/2] memory barrier: adding smp_mb__after_lock
Date: Tue, 7 Jul 2009 10:01:35 -0400	[thread overview]
Message-ID: <20090707140135.GA5506@Krystal> (raw)
In-Reply-To: <20090707134601.GB6619@jolsa.lab.eng.brq.redhat.com>

* Jiri Olsa (jolsa@redhat.com) wrote:
> On Tue, Jul 07, 2009 at 12:18:16PM +0200, Jiri Olsa wrote:
> > On Fri, Jul 03, 2009 at 01:18:48PM +0200, Jiri Olsa wrote:
> > > On Fri, Jul 03, 2009 at 12:25:30PM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Jiri Olsa <jolsa@redhat.com> wrote:
> > > > 
> > > > > On Fri, Jul 03, 2009 at 11:24:38AM +0200, Ingo Molnar wrote:
> > > > > > 
> > > > > > * Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > > > > 
> > > > > > > Ingo Molnar a écrit :
> > > > > > > > * Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > 
> > > > > > > >> +++ b/arch/x86/include/asm/spinlock.h
> > > > > > > >> @@ -302,4 +302,7 @@ static inline void __raw_write_unlock(raw_rwlock_t *rw)
> > > > > > > >>  #define _raw_read_relax(lock)	cpu_relax()
> > > > > > > >>  #define _raw_write_relax(lock)	cpu_relax()
> > > > > > > >>  
> > > > > > > >> +/* The {read|write|spin}_lock() on x86 are full memory barriers. */
> > > > > > > >> +#define smp_mb__after_lock() do { } while (0)
> > > > > > > > 
> > > > > > > > Two small stylistic comments, please make this an inline function:
> > > > > > > > 
> > > > > > > > static inline void smp_mb__after_lock(void) { }
> > > > > > > > #define smp_mb__after_lock
> > > > > > > > 
> > > > > > > > (untested)
> > > > > > > > 
> > > > > > > >> +/* The lock does not imply full memory barrier. */
> > > > > > > >> +#ifndef smp_mb__after_lock
> > > > > > > >> +#define smp_mb__after_lock() smp_mb()
> > > > > > > >> +#endif
> > > > > > > > 
> > > > > > > > ditto.
> > > > > > > > 
> > > > > > > > 	Ingo
> > > > > > > 
> > > > > > > This was following existing implementations of various smp_mb__??? helpers :
> > > > > > > 
> > > > > > > # grep -4 smp_mb__before_clear_bit include/asm-generic/bitops.h
> > > > > > > 
> > > > > > > /*
> > > > > > >  * clear_bit may not imply a memory barrier
> > > > > > >  */
> > > > > > > #ifndef smp_mb__before_clear_bit
> > > > > > > #define smp_mb__before_clear_bit()      smp_mb()
> > > > > > > #define smp_mb__after_clear_bit()       smp_mb()
> > > > > > > #endif
> > > > > > 
> > > > > > Did i mention that those should be fixed too? :-)
> > > > > > 
> > > > > > 	Ingo
> > > > > 
> > > > > ok, could I include it in the 2/2 or you prefer separate patch?
> > > > 
> > > > depends on whether it will regress ;-)
> > > > 
> > > > If it regresses, it's better to have it separate. If it wont, it can 
> > > > be included. If unsure, default to the more conservative option.
> > > > 
> > > > 	Ingo
> > > 
> > > 
> > > how about this.. 
> > > and similar change for smp_mb__before_clear_bit in a separate patch
> > > 
> > > 
> > > diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
> > > index b7e5db8..4e77853 100644
> > > --- a/arch/x86/include/asm/spinlock.h
> > > +++ b/arch/x86/include/asm/spinlock.h
> > > @@ -302,4 +302,8 @@ static inline void __raw_write_unlock(raw_rwlock_t *rw)
> > >  #define _raw_read_relax(lock)	cpu_relax()
> > >  #define _raw_write_relax(lock)	cpu_relax()
> > >  
> > > +/* The {read|write|spin}_lock() on x86 are full memory barriers. */
> > > +static inline void smp_mb__after_lock(void) { }
> > > +#define ARCH_HAS_SMP_MB_AFTER_LOCK
> > > +
> > >  #endif /* _ASM_X86_SPINLOCK_H */
> > > diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
> > > index 252b245..4be57ab 100644
> > > --- a/include/linux/spinlock.h
> > > +++ b/include/linux/spinlock.h
> > > @@ -132,6 +132,11 @@ do {								\
> > >  #endif /*__raw_spin_is_contended*/
> > >  #endif
> > >  
> > > +/* The lock does not imply full memory barrier. */
> > > +#ifndef ARCH_HAS_SMP_MB_AFTER_LOCK
> > > +static inline void smp_mb__after_lock(void) { smp_mb(); }
> > > +#endif
> > > +
> > >  /**
> > >   * spin_unlock_wait - wait until the spinlock gets unlocked
> > >   * @lock: the spinlock in question.
> > > diff --git a/include/net/sock.h b/include/net/sock.h
> > > index 4eb8409..98afcd9 100644
> > > --- a/include/net/sock.h
> > > +++ b/include/net/sock.h
> > > @@ -1271,6 +1271,9 @@ static inline int sk_has_allocations(const struct sock *sk)
> > >   * in its cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1
> > >   * could then endup calling schedule and sleep forever if there are no more
> > >   * data on the socket.
> > > + *
> > > + * The sk_has_helper is always called right after a call to read_lock, so we
> > > + * can use smp_mb__after_lock barrier.
> > >   */
> > >  static inline int sk_has_sleeper(struct sock *sk)
> > >  {
> > > @@ -1280,7 +1283,7 @@ static inline int sk_has_sleeper(struct sock *sk)
> > >  	 *
> > >  	 * This memory barrier is paired in the sock_poll_wait.
> > >  	 */
> > > -	smp_mb();
> > > +	smp_mb__after_lock();
> > >  	return sk->sk_sleep && waitqueue_active(sk->sk_sleep);
> > >  }
> > >  
> > 
> > any feedback on this? 
> > I'd send v6 if this way is acceptable..
> > 
> > thanks,
> > jirka
> 
> also I checked the smp_mb__before_clear_bit/smp_mb__after_clear_bit and
> it is used quite extensivelly.
> 
> I'd prefer to send it in a separate patch, so we can move on with the 
> changes I've sent so far..
> 

As with any optimization (and this is one that adds a semantic that will
just grow the memory barrier/locking rule complexity), it should come
with performance benchmarks showing real-life improvements.

Otherwise I'd recommend sticking to smp_mb() if this execution path is
not that critical, or to move to RCU if it's _that_ critical.

A valid argument would be if the data structures protected are so
complex that RCU is out of question but still the few cycles saved by
removing a memory barrier are really significant. And even then, the
proper solution would be more something like a
__read_lock()+smp_mb+smp_mb+__read_unlock(), so we get the performance
improvements on architectures other than x86 as well.

So in all cases, I don't think the smp_mb__after_lock() is the
appropriate solution.

Mathieu

> regards,
> jirka

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2009-07-07 14:02 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-03  8:12 [PATCHv5 0/2] net: fix race in the receive/select Jiri Olsa
2009-07-03  8:13 ` [PATCHv5 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa
2009-07-07 15:56   ` Eric Dumazet
2009-07-03  8:14 ` [PATCHv5 2/2] memory barrier: adding smp_mb__after_lock Jiri Olsa
2009-07-03  9:06   ` Ingo Molnar
2009-07-03  9:20     ` Eric Dumazet
2009-07-03  9:24       ` Ingo Molnar
2009-07-03  9:56         ` Jiri Olsa
2009-07-03 10:25           ` Ingo Molnar
2009-07-03 11:18             ` Jiri Olsa
2009-07-03 11:30               ` Jarek Poplawski
2009-07-03 11:43                 ` Jiri Olsa
2009-07-07 10:18               ` Jiri Olsa
2009-07-07 13:46                 ` Jiri Olsa
2009-07-07 14:01                   ` Mathieu Desnoyers [this message]
2009-07-07 14:34                     ` Oleg Nesterov
2009-07-07 15:04                       ` Mathieu Desnoyers
2009-07-07 15:44                         ` Oleg Nesterov
2009-07-07 15:50                           ` Peter Zijlstra
2009-07-07 19:45                             ` Mathieu Desnoyers
2009-07-07 22:44                               ` Eric Dumazet
2009-07-07 23:28                                 ` Mathieu Desnoyers
2009-07-07 23:51                                   ` Oleg Nesterov
2009-07-08  4:34                                     ` Mathieu Desnoyers
2009-07-08  7:18                                       ` Jarek Poplawski
2009-07-07 14:34                     ` Jiri Olsa
2009-07-07 14:42                     ` Eric Dumazet
2009-07-07 14:57                       ` Mathieu Desnoyers
2009-07-07 15:23                         ` Eric Dumazet
2009-07-08 17:47                           ` Jiri Olsa
2009-07-08 18:07                             ` David Miller
2009-07-08 18:16                               ` Jiri Olsa
2009-07-03 14:04     ` Mathieu Desnoyers
2009-07-03 15:29       ` Herbert Xu
2009-07-03 15:37         ` Eric Dumazet
2009-07-03 15:47           ` Mathieu Desnoyers
2009-07-03 17:06             ` Paul E. McKenney
2009-07-03 17:31               ` Mathieu Desnoyers
2009-07-03 15:40         ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090707140135.GA5506@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=a.p.zijlstra@chello.nl \
    --cc=davem@redhat.com \
    --cc=davidel@xmailserver.org \
    --cc=eric.dumazet@gmail.com \
    --cc=fbl@redhat.com \
    --cc=htejun@gmail.com \
    --cc=jarkao2@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@redhat.com \
    --cc=oleg@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).