From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755019AbcEZSqe (ORCPT ); Thu, 26 May 2016 14:46:34 -0400 Received: from mail-bl2on0107.outbound.protection.outlook.com ([65.55.169.107]:12029 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753400AbcEZSqc (ORCPT ); Thu, 26 May 2016 14:46:32 -0400 Authentication-Results: linux.vnet.ibm.com; dkim=none (message not signed) header.d=none;linux.vnet.ibm.com; dmarc=none action=none header.from=hpe.com; Message-ID: <57474114.1030506@hpe.com> Date: Thu, 26 May 2016 14:31:48 -0400 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Pan Xinhui CC: , , Subject: Re: [PATCH] pv-qspinlock: Try to re-hash the lock after spurious_wakeup References: <1464156575-4732-1-git-send-email-xinhui.pan@linux.vnet.ibm.com> In-Reply-To: <1464156575-4732-1-git-send-email-xinhui.pan@linux.vnet.ibm.com> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [71.168.64.34] X-ClientProxiedBy: CY1PR19CA0035.namprd19.prod.outlook.com (10.162.38.173) To DF4PR84MB0315.NAMPRD84.PROD.OUTLOOK.COM (10.162.193.29) X-MS-Office365-Filtering-Correlation-Id: e46c1ff8-96d9-4cca-dc33-08d385940307 X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0315;2:dpZMttHkoHlJuzB2PGi/KszfwNA4g1t3+YwnUaTPSLTSaa9pMvKQK0xKYP4qAwFqnvpgjaEmVJ30F0+PsJBu9B0wBKyTVOaslDlG/e8dhhS8xtImP+AkOV9942uR/FrgPJaTKTeqMbnI5/W04v81F98s62APDWHyx2CF8/cNP66CYRyFv6fBx4bbah2Re4Oa;3:/HQiaM1Haf5UhF/2D7D0qYlqBSUjOuy7pX1ihFpFq8yCmTkJlSLOreRuRWoKXdMFTCsiquFLSmHL+gWv5gtcPE0xdpzEqx/53j3CaZZEv9WYCPkNgZ+BJ0Y/jZPfNgaQ X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:DF4PR84MB0315; X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0315;25:KWSXRM5cJwSE1CrG8byufSmiRRR3WDVlcFVcSZesJ8S3MPIpK5tZ+Af46aUKnAM3YnW3xZd0fpwEqGo1Nw6fVVqrv9ZYpCYXj9Z7exMG1egifB2Xi8u7Zx392yokuB8b0KnJ72ze4mp/N6+WiNKmDH/Dcr9RJOgyulO0HiqJwhyWmkg1G/b0IF75atszHl6wg2VagAX89UsL2AU/EHR2K8laGKP7mAP79uRPPDZPJghLXkANK5SMxi3dVHft7E9l/yE3LLszX+LOPPFiJSqgSRBfsETP+eg2KZ83ZPqDdGVxzC/s/KcnlYny667LvOnZcIXn63V5SVljrOTLGr309bD2gJkYjMA79d+0KQJcOP/OsbCeD5S3xbBg5o2+kaK8XoBSX6ixvyV+/oMesnoDjrJ7hOmigIduRU3sesYZTtHdSlisyyia3xSEj//xPCbik4WaIJd5MoNxnnCXaEyuLgwyeQz9OqHXSGcD0PejQP314TA5ipO1pU0wxHcJ/NXzUh1LsjZKGcR4gzwsjYWwMVWDg0uouG826QB2nM6hkm9cZwn5m1lQD1e7NRNFWrdOz+1OA6jYKdBnyTlXrIyUW6sSv+k0FwSR5khN6uxh2I+v8MylCwGMrkF6wk3PDN+USEJL0qcWxE+Dmweu9f4v34N6JTGv5I9umDJgHrot0haIAbA3UoVW59V8cOqHId/pLGwC/rhsyP9F06oRa2XO8/7oECuZC7tIXx90BNhcS3Y= X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0315;20:bs+4EFlR50/R2hpDGq58DUv7XMJpXhMjWKC12eiF/b/64wA2bL1u4Do9p0jBx5kGsy3LbwHFcnL9vlHzEqIlt9n7tO0H+rqZNASYENqR+MrIKtxyWdL9REnMYVhxcclWGMQt9KWIKH1czmZVrqogZ773PbCAXYqFy59zQHlO6kz4khz3I3kxT+pft8YpJT3GumV8UV3vFMwFqPUqCd0BqO9wCQMM38caKLI7NF+qhfhQbB1tb5DqWYujWQjMQU7uRIgDVCRMJYdX4IClwBI7t1H/WAjdqJRESHq3cc2oyUyepewX4Cr9nnuPanOIPqowjoPex9jYqP5ohVuQcw7NuQ==;4:NIQjMfgqgDN9B7f5oq2gjHhb2C7Zv+5TSImfGHrPwXHjtc3Y8RF/hcTgcmt04/WxqeN480c+FrJnAcUEgsBNI9+DflJ88ezDLgeM0kizdcfUExkZn7iUvKJhdMtQmTqqMfBkJ5rbNQT6nz9LXT49HcUzQWYQjvtVDVXdSswr5tzniTRAQjKa4Y+uFZMgNVYu1FSCEdBzQycRX5k8fCKgNFrKV1OGQwsyB5SnYyyuiwgTIYtvDLfvvVUGw2QUeyfPJviQUaXeAbGIS6asaxyt43QlJE1QRxJrx38yIwqABCB0L9SlqYiR1h8tqILOsx6Y3ruRMFhbr+n9yOcYUgLnYUbj0vHLioTTerYTZ9X9XZ+Y+GjLCyHaBb6KPtJ+hmWL4Nuy+SOkNRMu7JPc7Qf6Qo4tASmDkBUpm0Kjf3X/U5o= X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:(104084551191319); X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(5005006)(8121501046)(10201501046)(3002001);SRVR:DF4PR84MB0315;BCL:0;PCL:0;RULEID:;SRVR:DF4PR84MB0315; X-Forefront-PRVS: 0954EE4910 X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6049001)(6009001)(24454002)(377454003)(50466002)(83506001)(2950100001)(77096005)(117156001)(19580405001)(36756003)(110136002)(19580395003)(122286003)(189998001)(92566002)(2906002)(4326007)(64126003)(5008740100001)(23756003)(117636001)(59896002)(4001350100001)(230700001)(66066001)(3846002)(76176999)(47776003)(65956001)(87266999)(65806001)(54356999)(586003)(5004730100002)(42186005)(86362001)(81166006)(50986999)(8676002)(65816999)(62816006);DIR:OUT;SFP:1102;SCL:1;SRVR:DF4PR84MB0315;H:[192.168.142.140];FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Exchange-Diagnostics: =?iso-8859-1?Q?1;DF4PR84MB0315;23:+g1cH5oA1kQgWN8azz7u8GAzyTjlGYVutxtt9bh?= =?iso-8859-1?Q?uF9Q1wR4cKCERqQOQ5j8Q0ZdYHfcZe5bC56p0ohFkqEYKDRxzQUdPkzjXe?= =?iso-8859-1?Q?2AUjcLYFWx8N10w+JDUDx5g7OwZlgaD0wScKVTG5m4VgTLc6fo5ay2uYFm?= =?iso-8859-1?Q?aH4XeT06+gPn8g4zKVBqyRUXkpZ01aKCflQAy3wyjJARF1gFAzFIo0tnVN?= =?iso-8859-1?Q?tD0y83serkV/rfJtjHqyNv9CpD8K448JJmxbUZPAo9RefZpdf5Rr7pJWv7?= =?iso-8859-1?Q?Cix0QNHyhkSMLxP/SIHATCNj6K8fweLy0pFCenh1tWKV2LZT07/y/2plVz?= =?iso-8859-1?Q?3k10QHGD5fragDwiu32djRYWW1J8fcdKDS+KakJvpbClvTVNKlo+hx1Czi?= =?iso-8859-1?Q?QvbzAFkfofQBj1pyP0d8F2EyHWcICuAAfNIQXRje0pubVpRl9K88NXV+ML?= =?iso-8859-1?Q?NE92Djr+HdwFmN9zTFYNxQCuCqHZBv3oRlsxkwTyr4rYkjgqJBmFGwlQFC?= =?iso-8859-1?Q?HSJ6426YYnJDKd1xNbZJrrJp4rBa3+Evz+9ab7TqfJtqjBAdNa49q3wKvR?= =?iso-8859-1?Q?dY3xUECEY8I4ocfQfu25Vop1vILQ2yBRN24Txb7qRyfV09+cVyCZux0SYw?= =?iso-8859-1?Q?71Y5nYwy8QZlW5Du7revVi4L5HHkc0zoUSlaF1zSNM/WTxx8sEvz5x6Mws?= =?iso-8859-1?Q?jprV7twDqhPP3VTtKrXw3Wkc3JHjSLrIJWSuJoO8QZxwxcUODGro0/E8af?= =?iso-8859-1?Q?Trp4+G3OBEFS/dBF6TFiZdCBsSL2Jj/1KIsKIOwiZV3ix+dEkoN0Edf4kD?= =?iso-8859-1?Q?4gqls2u3o/L0bP5UkTyjqE59Mearr8bXdOeA/LySqVNBMnOUJSel3Lgl4i?= =?iso-8859-1?Q?S1eaKCsA7/Srf0ZnjE2otZBu0b//GDS0eg2jkxC+NO6SoVuWF4+0NzXq8F?= =?iso-8859-1?Q?VdPQeJW6n6H7XZ1kVRyhFWzl92iz2dRZoEkrCPJHGDRRX7L49KKC6HEijB?= =?iso-8859-1?Q?h+Zg0DmeD4X3Y0nPtfeNRTXcNSHKp4rLQ4DdA/iWrLjL4aSEP+GPQhsEb8?= =?iso-8859-1?Q?E+JTa9kXJY9ofzZEqb9EGZZFeyIWB6c/bNv+OfvkAQZgIQuUABRewHdPA3?= =?iso-8859-1?Q?XH57st5BXC54AgbMcVmQsnIT3nDygbNjzpg3rx6wtTTls2ew9Ned55UgbM?= =?iso-8859-1?Q?uqVN9kQVfFz?= X-Microsoft-Exchange-Diagnostics: 1;DF4PR84MB0315;5:6YjfGqyYY7KSNkWRid0cZToEGYG0nMmgRAcVx/wZggQme4gdt0D1EwkL3jZEWIlNtjUM0EkcvCtsmtgOAKOSUuLX/Isyufa/YcfjENkuZP/DXfhtXy/022FHijE2KYQR6N5H10QcUuFy+XKQIEtfmg==;24:V91GEdYUBtKje2bSzHunCnC+uyyZGWKou2cm8nmtux3KsR8pb1KdvkhxDhYJYtjd4DOtwZinqJdMfnCwxjChr2eCkgB/toka7jE+zWJoK3M=;7:rvB2Ga7mZ/mJfafG2Zxl+2Mi/VBckezKymVwNbbyZlrSs1uXjAhMPt5LFT5yo5eWi5BKSPgdGxSCHA+XfmxFq4iiHZEKb6QDd2T5Rz5TOshLdJrc6edqfXNMNPzBEpfqBvVDrxhIabd1m9mi/f3RChZELq/uiE6EAPoFDqEUsM3qoucpNKZh5nDNQJb4GVQP SpamDiagnosticOutput: 1:23 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: hpe.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 May 2016 18:31:53.5622 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: DF4PR84MB0315 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/25/2016 02:09 AM, Pan Xinhui wrote: > In pv_wait_head_or_lock, if there is a spurious_wakeup, and it fails to > get the lock as there is lock stealing, then after a short spin, we need > hash the lock again and enter pv_wait to yield. > > Currently after a spurious_wakeup, as l->locked is not _Q_SLOW_VAL, > pv_wait might do nothing and return directly, that is not > paravirt-friendly because pv_wait_head_or_lock will just spin on the > lock then. > > Signed-off-by: Pan Xinhui > --- > kernel/locking/qspinlock_paravirt.h | 39 +++++++++++++++++++++++++++++-------- > 1 file changed, 31 insertions(+), 8 deletions(-) Is this a problem you can easily reproduce on PPC? I have not observed this issue when testing on x86. > > diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h > index 2bbffe4..3482ce9 100644 > --- a/kernel/locking/qspinlock_paravirt.h > +++ b/kernel/locking/qspinlock_paravirt.h > @@ -429,14 +429,15 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node) > return; > > /* > - * Put the lock into the hash table and set the _Q_SLOW_VAL. > - * > - * As this is the same vCPU that will check the _Q_SLOW_VAL value and > - * the hash table later on at unlock time, no atomic instruction is > - * needed. > + * Put the lock into the hash table and set the _Q_SLOW_VAL later > */ > - WRITE_ONCE(l->locked, _Q_SLOW_VAL); > (void)pv_hash(lock, pn); > + > + /* > + * Match the smp_load_acquire in pv_wait_head_or_lock() > + * We mush set the _Q_SLOW_VAL after hash. > + */ > + smp_store_release(&l->locked, _Q_SLOW_VAL); > } > > /* > @@ -454,6 +455,7 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) > struct qspinlock **lp = NULL; > int waitcnt = 0; > int loop; > + u8 lock_val; > > /* > * If pv_kick_node() already advanced our state, we don't need to > @@ -487,7 +489,7 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) > clear_pending(lock); > > > - if (!lp) { /* ONCE */ > + if (!lp) { > lp = pv_hash(lock, pn); > > /* > @@ -517,6 +519,13 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) > qstat_inc(qstat_pv_wait_again, waitcnt); > > /* > + * make sure pv_kick_node has hashed the lock, so after pv_wait > + * if ->locked is not _Q_SLOW_VAL, we can hash the lock again. > + */ > + if (lp == (struct qspinlock **)1 > + && smp_load_acquire(&l->locked) == _Q_SLOW_VAL) > + lp = (struct qspinlock **)2; > + /* > * Pass in the previous node vCPU nmber which is likely to be > * the lock holder vCPU. This additional information may help > * the hypervisor to give more resource to that vCPU so that > @@ -525,13 +534,27 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) > */ > pv_wait(&l->locked, _Q_SLOW_VAL, pn->prev_cpu); > > + lock_val = READ_ONCE(l->locked); > + > + /* if ->locked is zero, then lock owner has unhashed the lock > + * if ->locked is _Q_LOCKED_VAL, > + * 1) pv_kick_node didnot hash the lock, and lp != 0x1 > + * 2) lock stealing, lock owner has unhashed the lock too > + * 3) race with pv_kick_node, if lp == 2, we know it has hashed > + * the lock and the lock is unhashed in unlock() > + * if ->lock is _Q_SLOW_VAL, spurious_wakeup? > + */ > + if (lock_val != _Q_SLOW_VAL) { > + if (lock_val == 0 || lp != (struct qspinlock **)1) > + lp = 0; > + } It is a bit hard to verify the correctness of these checks as many variables are involved. Instead, I posted an alternative way to solve this problem by focusing just on the atomic setting of _Q_SLOW_VAL which should be easier to understand and verified. Would you mind testing that patch to see if it can fix the problem that you see? Cheers, Longman