Re: [PATCH v2 1/2] spinlock: New spinlock_refcount.h for lockless update of refcount

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Waiman Long <waiman.long@hp.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@redhat.com>,
	Miklos Szeredi <mszeredi@suse.cz>, Ingo Molnar <mingo@redhat.com>,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH v2 1/2] spinlock: New spinlock_refcount.h for lockless update of refcount
Date: Sat, 29 Jun 2013 17:03:59 -0400	[thread overview]
Message-ID: <51CF4BBF.30608@hp.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1306271501220.4013@ionos.tec.linutronix.de>

On 06/27/2013 10:44 AM, Thomas Gleixner wrote:
> On Wed, 26 Jun 2013, Waiman Long wrote:
>> On 06/26/2013 07:06 PM, Thomas Gleixner wrote:
>>> On Wed, 26 Jun 2013, Waiman Long wrote:
>>> This is a complete design disaster. You are violating every single
>>> rule of proper layering. The differentiation of spinlock, raw_spinlock
>>> and arch_spinlock is there for a reason and you just take the current
>>> implementation and map your desired function to it. If we take this,
>>> then we fundamentally ruled out a change to the mappings of spinlock,
>>> raw_spinlock and arch_spinlock. This layering is on purpose and it
>>> took a lot of effort to make it as clean as it is now. Your proposed
>>> change undoes that and completely wreckages preempt-rt.
>> Would you mind enlighten me how this change will break preempt-rt?
> The whole spinlock layering was done for RT. In mainline spinlock is
> mapped to raw_spinlock. On RT spinlock is mapped to a PI aware
> rtmutex.
>
> So your mixing of the various layers and the assumption of lock plus
> count being adjacent, does not work on RT at all.

Thank for the explanation. I had downloaded and looked at the RT patch. 
My code won't work for the full RT kernel. I guess that is no way to 
work around this and only logical choice is to disable it for the full 
RT kernel.

>> The only architecture that will break, according to data in the
>> respectively arch/*/include/asm/spinlock_types.h files is PA-RISC
>> 1.x (it is OK in PA-RISC 2.0) whose arch_spinlock type has a size of
>> 16 bytes. I am not sure if 32-bit PA-RISC 1.x is still supported or
>> not, but we can always disable the optimization for this case.
> You have to do that right from the beginning with a proper data
> structure and proper accessor functions. Introducing wreckage and then
> retroactivly fixing it, is not a really good idea.

For architecture that needs a larger than 32-bit arch_spin_lock type, 
the optimization code will be disabled just like the non-SMP and full RT 
cases.

>>> So what you really want is a new data structure, e.g. something like
>>> struct spinlock_refcount() and a proper set of new accessor functions
>>> instead of an adhoc interface which is designed solely for the needs
>>> of dcache improvements.
>> I had thought about that. The only downside is we need more code changes to
>> make existing code to use the new infrastructure. One of the drivers in my
> That's not a downside. It makes the usage of the infrastructure more
> obvious and not hidden behind macro magic. And the changes are trivial
> to script.

Making the code from d_lock to lockref.lock, for example, is trivial. 
However, there are about 40 different files that need to be changed with 
different maintainers. Do I need to get an ACK from all of them or can I 
just go ahead with such trivial changes without their ACK? You know it 
can be hard to get responses from so many maintainers in a timely 
manner. This is actually my main concern.

>> design is to minimize change to existing code. Personally, I have no
>> objection of doing that if others think this is the right way to go.
> Definitely. Something like this would be ok:
>
> struct lock_refcnt {
>         int                 refcnt;
>         struct spinlock     lock;
> };
>
> It does not require a gazillion of ifdefs and works for
> UP,SMP,DEBUG....
>
> extern bool lock_refcnt_mod(struct lock_refcnt *lr, int mod, int cond);
>
> You also want something like:
>
> extern void lock_refcnt_disable(void);
>
> So we can runtime disable it e.g. when lock elision is detected and
> active. So you can force lock_refcnt_mod() to return false.
>
> static inline bool lock_refcnt_inc(struct lock_refcnt *sr)
> {
> #ifdef CONFIG_HAVE_LOCK_REFCNT
> 	return lock_refcnt_mod(sr, 1, INTMAX);
> #else
> 	return false;
> #endif
> }
>
> That does not break any code as long as CONFIG_HAVE_SPINLOCK_REFCNT=n.
>
> So we can enable it per architecture and make it depend on SMP. For RT
> we simply can force this switch to n.
>
> The other question is the semantic of these refcount functions. From
> your description the usage pattern is:
>
>       if (refcnt_xxx())
> 	   return;
>       /* Slow path */
>       spin_lock();
>       ...
>       spin_unlock();
>
> So it might be sensible to make this explicit:
>
> static inline bool refcnt_inc_or_lock(struct lock_refcnt *lr))
> {
> #ifdef CONFIG_HAVE_SPINLOCK_REFCNT
> 	if (lock_refcnt_mod(lr, 1, INTMAX))
> 	   return true;
> #endif
> 	spin_lock(&lr->lock);
> 	return false;
> }

Yes, it is a good idea to have a config variable to enable/disable it as 
long as the default is "y". Of course, an full RT kernel or an non-SMP 
kernel will have it disabled.

Regards,
Longman

next prev parent reply	other threads:[~2013-06-29 21:04 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-26 17:43 [PATCH v2 0/2] Lockless update of reference count protected by spinlock Waiman Long
2013-06-26 17:43 ` [PATCH v2 1/2] spinlock: New spinlock_refcount.h for lockless update of refcount Waiman Long
2013-06-26 20:17   ` Andi Kleen
2013-06-26 21:07     ` Waiman Long
2013-06-26 21:22       ` Andi Kleen
2013-06-26 23:26         ` Waiman Long
2013-06-27  1:06           ` Andi Kleen
2013-06-27  1:15             ` Waiman Long
2013-06-27  1:24               ` Waiman Long
2013-06-27  1:37                 ` Andi Kleen
2013-06-27 14:56                   ` Waiman Long
2013-06-28 13:46                     ` Thomas Gleixner
2013-06-29 20:30                       ` Waiman Long
2013-06-26 23:27         ` Thomas Gleixner
2013-06-26 23:06   ` Thomas Gleixner
2013-06-27  0:16     ` Waiman Long
2013-06-27 14:44       ` Thomas Gleixner
2013-06-29 21:03         ` Waiman Long [this message]
2013-06-27  0:26     ` Waiman Long
2013-06-29 17:45   ` Linus Torvalds
2013-06-29 20:23     ` Waiman Long
2013-06-29 21:34       ` Waiman Long
2013-06-29 22:11         ` Linus Torvalds
2013-06-29 22:34           ` Waiman Long
2013-06-29 21:58       ` Linus Torvalds
2013-06-29 22:47         ` Linus Torvalds
2013-07-01 13:40           ` Waiman Long
2013-06-26 17:43 ` [PATCH v2 2/2] dcache: Locklessly update d_count whenever possible Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51CF4BBF.30608@hp.com \
    --to=waiman.long@hp.com \
    --cc=andi@firstfloor.org \
    --cc=aswin@hp.com \
    --cc=benh@kernel.crashing.org \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mszeredi@suse.cz \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox