From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: Rik van Riel <riel@redhat.com>
Cc: jeremy@goop.org, kvm@vger.kernel.org, peterz@infradead.org,
virtualization@lists.linux-foundation.org,
paul.gortmaker@windriver.com, hpa@zytor.com, ak@linux.intel.com,
gleb@redhat.com, x86@kernel.org, mingo@redhat.com,
xen-devel@lists.xenproject.org, paulmck@linux.vnet.ibm.com,
konrad.wilk@oracle.com, oleg@redhat.com, davej@redhat.com,
tglx@linutronix.de, fernando_b1@lab.ntt.co.jp,
chegu_vinod@hp.com, waiman.long@hp.com,
linux-kernel@vger.kernel.org, pbonzini@redhat.com,
torvalds@linux-foundation.org
Subject: Re: [RFC] Implement Batched (group) ticket lock
Date: Thu, 29 May 2014 15:14:40 +0530
Message-ID: <53870188.5060209@linux.vnet.ibm.com>
In-Reply-To: <53865B53.7050809@redhat.com>
On 05/29/2014 03:25 AM, Rik van Riel wrote:
> On 05/28/2014 08:16 AM, Raghavendra K T wrote:
>
> This patch looks very promising.
Thank you Rik.
[...]
>>
>> - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to
>> show the impact of extra cmpxchg. but there should be effect of extra cmpxchg.
>
> Canceled out by better NUMA locality?
Yes, perhaps. It was even slightly better.
[...]
>> - we can further add dynamically changing batch_size implementation (inspiration and
>> hint by Paul McKenney) as necessary.
>
> I could see a larger batch size being beneficial.
>
> Currently the maximum wait time for a spinlock on a system
> with N CPUs is N times the length of the largest critical
> section.
>
> Having the batch size set equal to the number of CPUs would only
> double that, and better locality (CPUs local to the current
> lock holder winning the spinlock operation) might speed things
> up enough to cancel that part of that out again...
Having batch size = number of CPUs would definitely help contended cases,
especially on larger machines (in my experience from testing on a 4-node,
32-core machine). The +HT case should make it even more beneficial.
My only concern was the overhead in undercommit cases because of the extra
cmpxchg.
So maybe batch_size = total CPUs / number of NUMA nodes (i.e. CPUs per
node) would be optimal?
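
To make the per-node suggestion concrete, here is a minimal sketch of how such
a batch size could be derived. It assumes the batch must stay a power of two
(so that it can be used as a mask) and simply rounds CPUs per node down; both
helper names are hypothetical and not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper: round v down to the nearest power of two (v >= 1). */
static unsigned int round_down_pow2(unsigned int v)
{
    unsigned int p = 1;

    assert(v >= 1);
    while (p <= v / 2)
        p *= 2;
    return p;
}

/*
 * Hypothetical helper: candidate batch size = CPUs per NUMA node,
 * rounded down to a power of two so it can still be used as a mask.
 */
static unsigned int batch_size_per_node(unsigned int total_cpus,
                                        unsigned int numa_nodes)
{
    unsigned int per_node;

    assert(numa_nodes >= 1 && total_cpus >= numa_nodes);
    per_node = total_cpus / numa_nodes;
    return round_down_pow2(per_node);
}
```

For the 4-node, 32-core machine mentioned above this gives a batch of 8, or 16
if the hyperthreads are counted as CPUs.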
[...]
>> +#define TICKET_LOCK_INC_SHIFT 1
>> +#define __TICKET_LOCK_TAIL_INC (1<<TICKET_LOCK_INC_SHIFT)
>> +
>> #ifdef CONFIG_PARAVIRT_SPINLOCKS
>> -#define __TICKET_LOCK_INC 2
>> #define TICKET_SLOWPATH_FLAG ((__ticket_t)1)
>> #else
>> -#define __TICKET_LOCK_INC 1
>> #define TICKET_SLOWPATH_FLAG ((__ticket_t)0)
>> #endif
>
> For the !CONFIG_PARAVIRT case, TICKET_LOCK_INC_SHIFT used to be 0,
> now you are making it one. Probably not an issue, since even people
> who compile with 128 < CONFIG_NR_CPUS <= 256 will likely have their
> spinlocks padded out to 32 or 64 bits anyway in most data structures.
Yes..
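
The arithmetic behind that remark can be sketched as follows. With the tail
increment at 1 << 1 = 2, an 8-bit ticket can only tell apart 256 / 2 = 128
waiters, so larger configurations need 16-bit tickets and a 32-bit lock. The
exact boundary rule (8-bit tickets while nr_cpus < 256 / increment) is an
assumption here, and the helper names are hypothetical.

```c
/* Hypothetical helper: bits needed for one ticket (head or tail). */
static unsigned int ticket_bits(unsigned int nr_cpus, unsigned int inc_shift)
{
    unsigned int inc = 1u << inc_shift;

    /* Assumed rule: 8-bit tickets only while nr_cpus < 256 / inc. */
    return (nr_cpus < 256u / inc) ? 8 : 16;
}

/* A lock holds a head and a tail ticket, so it is twice as wide. */
static unsigned int lock_bits(unsigned int nr_cpus, unsigned int inc_shift)
{
    return 2 * ticket_bits(nr_cpus, inc_shift);
}
```

With 200 CPUs, for example, the lock grows from 16 bits (shift 0) to 32 bits
(shift 1), which is the case expected to be absorbed by existing padding.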
[...]
>> +#define TICKET_BATCH 0x4 /* 4 waiters can contend simultaneously */
>> +#define TICKET_LOCK_BATCH_MASK (~(TICKET_BATCH<<TICKET_LOCK_INC_SHIFT) + \
>> + TICKET_LOCK_TAIL_INC - 1)
>
> I do not see the value in having TICKET_BATCH declared with a
> hexadecimal number,
Yes. It only helped make the idea readable to me; I can get rid of it
if needed.
> and it may be worth making sure the code
> does not compile if someone tried a TICKET_BATCH value that
> is not a power of 2.
I agree. I will add a BUILD_BUG for non-power-of-2 values in the next version.
That also reminds me that I wanted to have TICKET_BATCH = 1 for
!CONFIG_PARAVIRT so that we keep the original fair lock version there.
Does that make sense? I dropped the idea after considering that the same
kernel runs on both host and guest, and would have CONFIG_PARAVIRT on anyway.
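
A userspace sketch of the power-of-2 check and of what the batch mask
evaluates to. It uses C11 static_assert as a stand-in for the kernel's
BUILD_BUG_ON(), and assumes that TICKET_LOCK_TAIL_INC in the mask refers to
the __TICKET_LOCK_TAIL_INC definition from the quoted hunk. The batch_start()
helper is hypothetical; how the patch actually applies the mask is not shown
in this message.

```c
#include <assert.h>

#define TICKET_LOCK_INC_SHIFT   1
#define TICKET_LOCK_TAIL_INC    (1 << TICKET_LOCK_INC_SHIFT)

#ifndef TICKET_BATCH
#define TICKET_BATCH            4   /* 1 would give the original fair lock */
#endif

/* Stand-in for a kernel BUILD_BUG_ON(): reject non-power-of-2 batches. */
static_assert(TICKET_BATCH > 0 && (TICKET_BATCH & (TICKET_BATCH - 1)) == 0,
              "TICKET_BATCH must be a power of 2");

/* Same expression as in the quoted patch. */
#define TICKET_LOCK_BATCH_MASK \
    (~(TICKET_BATCH << TICKET_LOCK_INC_SHIFT) + TICKET_LOCK_TAIL_INC - 1)

/* Hypothetical helper: first ticket value of the batch containing t. */
static unsigned int batch_start(unsigned int t)
{
    return t & (unsigned int)TICKET_LOCK_BATCH_MASK;
}
```

With TICKET_BATCH = 4 the mask clears the low three bits (the slowpath flag
bit plus two batch bits), so ticket values 8, 10, 12 and 14 fall into the
same batch. With TICKET_BATCH = 1 only the flag bit is cleared and every
ticket is its own batch.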