From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <57227791.2020908@hpe.com>
Date: Thu, 28 Apr 2016 16:50:25 -0400
From: Waiman Long
To: Boqun Feng
CC: Peter Zijlstra, Ingo Molnar, Pan Xinhui, Scott J Norton, Douglas Hatch
Subject: Re: [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait()
References: <1461250925-39599-1-git-send-email-Waiman.Long@hpe.com> <20160428133213.GC25392@insomnia>
In-Reply-To: <20160428133213.GC25392@insomnia>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/28/2016 09:32 AM, Boqun Feng wrote:
> Hi Waiman,
>
> On Thu, Apr 21, 2016 at 11:02:05AM -0400, Waiman Long wrote:
>> Pan Xinhui was asking for a lock holder cpu argument in pv_wait()
>> to help the porting of pvqspinlock to PPC. The new argument can
>> help the hypervisor expedite the execution of the critical section
>> by the lock holder, if its vCPU isn't running, so that it can
>> release the lock sooner.
>>
>> The pv_wait() function is now extended to have a third argument
>> that holds the vCPU number of the lock holder, if this is
>> known. Architecture specific hypervisor code can then make use of
>> that information to make a better decision of which vCPU should be
>> running next.
>>
>> This patch also adds a new field in the pv_node structure to hold
>> the vCPU number of the previous queue entry.
>> For the queue head, the prev_cpu entry is likely to be that of the
>> lock holder, though lock stealing may render this information
>> inaccurate.
>>
>> In pv_wait_head_or_lock(), pv_wait() will now be called with that
>> vCPU number as it is likely to be the lock holder.
>>
>> In pv_wait_node(), the newly added pv_lookup_hash() function will
>> be called to look up the queue head and pass in the lock holder vCPU
>> number stored there, if found.
>>
>> Suggested-by: Pan Xinhui
>> Signed-off-by: Waiman Long
>> ---
>>
>> v2->v3:
>>  - Rephrase the changelog and some of the comments to make the
>>    intention of this patch more clear.
>>  - Make pv_lookup_hash() architecture dependent to eliminate its cost
>>    if an architecture doesn't need it. Also limit the number of
>>    cachelines searched in pv_lookup_hash() to between 1 and 4.
>>  - [RESEND] Fix typo error.
>>
>> v1->v2:
>>  - In pv_wait_node(), look up the hash table for the queue head and
>>    pass the lock holder cpu number to pv_wait().
>>
> [snip]
>
>> @@ -282,7 +328,8 @@ static void pv_init_node(struct mcs_spinlock *node)
>>   * pv_kick_node() is used to set _Q_SLOW_VAL and fill in hash table on its
>>   * behalf.
>>   */
>> -static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>> +static void pv_wait_node(struct qspinlock *lock, struct mcs_spinlock *node,
>> +			 struct mcs_spinlock *prev)
>>  {
>>  	struct pv_node *pn = (struct pv_node *)node;
>>  	struct pv_node *pp = (struct pv_node *)prev;
>> @@ -290,6 +337,8 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>  	int loop;
>>  	bool wait_early;
>>
>> +	pn->prev_cpu = pp->cpu;	/* Save vCPU # of previous queue entry */
>> +
>>  	/* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
>>  	for (;; waitcnt++) {
>>  		for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
>> @@ -314,10 +363,23 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>  		smp_store_mb(pn->state, vcpu_halted);
>>
>>  		if (!READ_ONCE(node->locked)) {
>> +			struct pv_node *ph;
>> +
>>  			qstat_inc(qstat_pv_wait_node, true);
>>  			qstat_inc(qstat_pv_wait_again, waitcnt);
>>  			qstat_inc(qstat_pv_wait_early, wait_early);
>> -			pv_wait(&pn->state, vcpu_halted);
>> +
>> +			/*
>> +			 * If the current queue head is in the hash table,
>> +			 * the prev_cpu field of its pv_node may contain the
>> +			 * vCPU # of the lock holder. However, lock stealing
>> +			 * may make that information inaccurate. Anyway, we
>> +			 * look up the hash table to try to get the lock
>> +			 * holder vCPU number.
>> +			 */
>> +			ph = pv_lookup_hash(lock);
>
> I am thinking, could we save the hash lookup here if wait_early == true?
> Because in that case, the previous node is likely to get the lock soon,
> and it makes sense that we yield to that node rather than to the holder,
> so maybe we can do something like:
>
> 	if (wait_early)
> 		pv_wait(&pn->state, vcpu_halted, pn->prev_cpu);
> 	else {
> 		ph = pv_lookup_hash(lock);
> 		pv_wait(&pn->state, vcpu_halted,
> 			ph ? ph->prev_cpu : -1);
> 	}
>
> Thoughts?
>
> Regards,
> Boqun
>

One reason why I don't want to just wake up the previous node's vCPU is
that I am concerned a vCPU may go into a suspend-wakeup loop. So I will
wait until we get experimental evidence that this really helps
performance before we commit to this kind of change.

Cheers,
Longman