Re: [PATCH v5 01/12] spinlock: A new lockref structure for lockless update of refcount

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Waiman Long <waiman.long@hp.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@redhat.com>,
	Miklos Szeredi <mszeredi@suse.cz>, Ingo Molnar <mingo@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH v5 01/12] spinlock: A new lockref structure for lockless update of refcount
Date: Mon, 15 Jul 2013 21:40:33 -0400	[thread overview]
Message-ID: <51E4A491.8030201@hp.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1307160109490.11918@ionos.tec.linutronix.de>

On 07/15/2013 07:47 PM, Thomas Gleixner wrote:
> On Mon, 15 Jul 2013, Waiman Long wrote:
>> On 07/15/2013 10:41 AM, Thomas Gleixner wrote:
>>> On Mon, 8 Jul 2013, Waiman Long wrote:
>>> Sigh. GENERIC means, that you use the generic implementation, ARCH
>>> means the architecture has a private implementation of that code.
>>>
>>> The generic implementation can use arch specific includes and if there
>>> is none we simply fallback to the asm-generic one.
>> I used the ARCH+GENERIC to mean using the generic code but with arch specific
>> include.
> And what's the point of that? I just explained it to you that you do
> not need the ARCH=y and GENERIC=y at all.

As I said in my previous mail, I can remove the ARCH+GENERIC option.

>>>    >   Let's start with a simple version because it IS simple and easy
>>>>> to analyze and debug and then incrementally build improvements on it
>>>>> instead of creating an heuristics monster in the first place, i.e.:
>>>>>
>>>>>       if (!spin_is_locked(&lr->lock)&&    try_cmpxchg_once(lr))
>>>>>          return 0;
>>>>>       return 1;
>>>>>
>>>>> Take numbers for this on a zoo of different machines: Intel and AMD,
>>>>> old and new.
>>>>>
>>>>> Then you can add an incremental patch on that, which add loops and
>>>>> hoops. Along with numbers on the same zoo of machines.
>>>> I originally tried to do a cmpxchg without waiting and there was
>>>> practically no performance gain. I believe that as soon as
>>> Well, you did not see a difference on your particular machine. Still
>>> we don't have an idea how all of this works on a set of different
>>> machines. There is a world beside 8 socket servers.
>> I understand that. I can live with try_cmpxchg_once, though doing it
>> twice will give a slightly better performance. However, without
> I asked you several times now to explain and document the whole thing
> with numbers instead of your handwaving "slightly better performance"
> arguments.

I will provide performance data for 1 and 2 retries in my next patch 
version.

>
>> waiting for the lock to be free, this patch won't do much good. To
>> keep it simple, I can remove the ability to do customization while
>> doing cmpxchg once and wait until the lock is free. Please let me
>> know if this is acceptable to you.
> No, it's not acceptable at all if you are not able to provide data for
> 1,2,4,8 socket machines (from single core to your precious big
> boxes). It's that simple. We are not accepting patches which optimize
> for a single use case and might negatively affect 99,9999% of the
> existing users which have no access to this kind of hardware unless
> proven otherwise.

I did provide performance data for 1,2,4 and 8 socket configurations in 
my commit message. I used numactl to simulate different socket 
configuration by forcing the code to use only a subset of total number 
of sockets. I know that is not ideal, but I think it should be close 
enough. I will provide performance data on a more common 2 socket test 
machine that I have.

Yes, I don't provide data for single-thread use case. I will also 
provide that data in my next version by measuring the average time for 
doing low-level reference count update using lock and lockless update 
like what I had done for the qrwlock patch. For single thread case, I 
don't believe any real workload will show any appreciable difference in 
performance due to the differing reference count update mechanisms.

>>> Also what's the approach to tune that? Running some random testbench
>>> and monitor the results for various settings?
>>>
>>> If that's the case we better have a that as variables with generic
>>> initial values in the code, which can be modified by sysctl.
>> As I said above, I can remove the customization. I may reintroduce user
>> customization using sysctl as you suggested in the a follow up patch after
>> this one is merged.
> And I asked for a step by step approach in the first review, but you
> decided to ignore that. And now you think that it's accetable for you
> as long as you get what you want. That's not the way it works, really.

I am trying to provide what you are asking for while at the same time 
meet my own need.

>>>>>> +		getnstimeofday(&tv2);
>>>>>> +		ns = (tv2.tv_sec - tv1.tv_sec) * NSEC_PER_SEC +
>>>>>> +		     (tv2.tv_nsec - tv1.tv_nsec);
>>>>>> +		pr_info("lockref wait loop time = %lu ns\n", ns);
>>>>> Using getnstimeofday() for timestamping and spamming the logs with
>>>>> printouts? You can't be serious about that?
>>>>>
> q>  >  >  >  Thats what tracepoints are for. Tracing is the only way to get proper
>>>>> numbers which take preemption, interrupts etc. into account without
>>>>> hurting runtime performace.
>>>> The _SHOW_WAIT_LOOP_TIME is for debugging and instrumentation purpose only
>>>> during development cycle. It is not supposed to be turned on at production
>>>> system. I will document that in the code.
>>> No, no, no! Again: That's what tracepoints are for.
>>>
>>> This kind of debugging is completely pointless. Tracepoints are
>>> designed to be low overhead and can be enabled on production
>>> systems.
>>>
>>> Your debugging depends on slow timestamps against CLOCK_REALTIME and
>>> an even slower output via printk. How useful is that if you want to
>>> really instrument the behaviour of this code?
>> This code is not critical and I can certainly remove it.
> Did you even try to understand what I wrote? I did not ask you to
> remove instrumentation. I asked you to use useful instrumentation
> instead of some totally useless crap.

I am not that familiar with using the tracepoints instrumentation for 
timing measurement. I will try to use that in the code for that purpose.

Regards,
Longman

next prev parent reply	other threads:[~2013-07-16  1:40 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-05 14:47 [PATCH v5 00/12] Lockless update of reference count protected by spinlock Waiman Long
2013-07-05 14:47 ` [PATCH v5 01/12] spinlock: A new lockref structure for lockless update of refcount Waiman Long
2013-07-05 18:59   ` Thomas Gleixner
2013-07-08 14:21     ` Waiman Long
2013-07-15 14:41       ` Thomas Gleixner
2013-07-15 21:24         ` Waiman Long
2013-07-15 23:47           ` Thomas Gleixner
2013-07-16  1:40             ` Waiman Long [this message]
2013-07-05 14:47 ` [PATCH v5 02/12] spinlock: Enable x86 architecture to do lockless refcount update Waiman Long
2013-07-05 20:19   ` Thomas Gleixner
2013-07-05 14:47 ` [PATCH v5 03/12] dcache: rename d_count field of dentry to d_refcount Waiman Long
2013-07-05 15:02 ` [PATCH v5 00/12] Lockless update of reference count protected by spinlock Al Viro
2013-07-05 15:29   ` Waiman Long
2013-07-05 15:31     ` Waiman Long
2013-07-05 17:54     ` Al Viro
2013-07-05 18:56       ` Waiman Long
2013-07-05 20:33 ` Thomas Gleixner
2013-07-08 14:22   ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51E4A491.8030201@hp.com \
    --to=waiman.long@hp.com \
    --cc=andi@firstfloor.org \
    --cc=aswin@hp.com \
    --cc=benh@kernel.crashing.org \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mszeredi@suse.cz \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.