From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5789424D.7020908@hpe.com>
Date: Fri, 15 Jul 2016 16:06:37 -0400
From: Waiman Long
To: Peter Zijlstra
CC: Pan Xinhui, Ingo Molnar, Boqun Feng, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v2 2/5] locking/pvqspinlock: Fix missed PV wakeup problem
References: <1464713631-1066-1-git-send-email-Waiman.Long@hpe.com> <1464713631-1066-3-git-send-email-Waiman.Long@hpe.com> <20160715084732.GF30921@twins.programming.kicks-ass.net> <3c5d5c29-7956-572f-2638-b85299c72432@linux.vnet.ibm.com> <20160715100703.GQ30154@twins.programming.kicks-ass.net>
In-Reply-To: <20160715100703.GQ30154@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/15/2016 06:07 AM, Peter Zijlstra wrote:
> On Fri, Jul 15, 2016 at 05:39:46PM +0800, Pan Xinhui wrote:
>>> I'm thinking you're trying to say this:
>>>
>>>
>>> CPU0                            CPU1                            CPU2
>>>
>>> __pv_queued_spin_unlock_slowpath()
>>>   ...
>>>   smp_store_release(&l->locked, 0);
>>>                                 __pv_queued_spin_lock_slowpath()
>>>                                   ...
>>>                                   pv_queued_spin_steal_lock()
>>>                                     cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0
>>>
>>>
>>>                                                                 pv_wait_head_or_lock()
>>>
>>>   pv_kick(node->cpu);  ----------------------------------------> pv_wait(&l->locked, _Q_SLOW_VAL);
>>>
>>>                                 __pv_queued_spin_unlock()
>>>                                   cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL
>>>
>>>                                                                   for () {
>>>                                                                     trylock_clear_pending();
>>>                                                                     cpu_relax();
>>>                                                                   }
>>>
>>>                                                                   pv_wait(&l->locked, _Q_SLOW_VAL);
>>>
>>>
>>> Which is indeed 'bad', but not fatal, note that the later pv_wait() will
>>> not in fact go wait, since l->locked will _not_ be _Q_SLOW_VAL.
>> the problem is that "this later pv_wait will do nothing as l->locked
>> is not _Q_SLOW_VAL", So it is not paravirt friendly then. we will go
>> into the trylock loop again and again until the lock is unlocked.
> Agreed, which is 'bad'. But the patch spoke about a missing wakeup,
> which is worse, as that would completely inhibit progress.

Sorry, it is my mistake. There is no missing pv_wait().

>> So if we are kicked by the unlock_slowpath, and the lock is stealed by
>> someone else, we need hash its node again and set l->locked to
>> _Q_SLOW_VAL, then enter pv_wait.
> Right, let me go think about this a bit.

Yes, the purpose of this patch is to do exactly that. Let the queue head
vCPU sleep until the lock holder releases the lock and wakes the queue
head vCPU up.

>
>> but I am worried about lock stealing. could the node in the queue
>> starve for a long time? I notice the latency of pv_wait on an
>> over-commited guest can be bigger than 300us. I have not seen such
>> starving case, but I think it is possible to happen.
> I share that worry, which is why we limit the steal attempt to one.
> But yes, theoretically its possible to starve things AFAICT.
>
> We've not come up with sensible way to completely avoid starvation.

If you guys are worried about the lock constantly getting stolen between
the pv_kick() of the queue head vCPU and the time it is ready to take the
lock, we can keep the pending bit set across pv_wait() if it is the 2nd
or later time that pv_wait() is called. That will ensure that no lock
stealing can happen and cap the maximum wait time to about
2x (spin + pv_wait). I will add that patch to my patch series.

Cheers,
Longman
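
To make the pending-bit idea above concrete, here is a rough single-threaded
sketch in plain C. It is only a toy model, not the kernel code: the struct,
the toy_* helpers and the waitcnt parameter are all made up for illustration,
and the real lock word is updated with atomics. The values 1 and 3 for the
locked and slow states are assumed here to mirror _Q_LOCKED_VAL and
_Q_SLOW_VAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy values; assumed to mirror _Q_LOCKED_VAL and _Q_SLOW_VAL. */
enum { Q_UNLOCKED = 0, Q_LOCKED_VAL = 1, Q_SLOW_VAL = 3 };

/* Hypothetical, non-atomic stand-in for the qspinlock word. */
struct toy_lock {
	uint8_t locked;   /* Q_UNLOCKED, Q_LOCKED_VAL or Q_SLOW_VAL */
	bool    pending;  /* while set, lock stealers must back off */
};

/* A stealer gets the lock only if it is free and pending is clear. */
static bool toy_steal_lock(struct toy_lock *l)
{
	if (l->locked != Q_UNLOCKED || l->pending)
		return false;
	l->locked = Q_LOCKED_VAL;
	return true;
}

/* pv_wait(&l->locked, _Q_SLOW_VAL) only sleeps if the value still matches. */
static bool toy_pv_wait_would_block(const struct toy_lock *l)
{
	return l->locked == Q_SLOW_VAL;
}

/*
 * Queue head gets ready to wait; waitcnt is the number of earlier waits.
 * It re-arms the slow value so the unlocker will kick it, and on the 2nd
 * or later wait it leaves the pending bit set to shut out stealers.
 */
static void toy_head_prepare_wait(struct toy_lock *l, int waitcnt)
{
	if (l->locked == Q_LOCKED_VAL)
		l->locked = Q_SLOW_VAL;
	l->pending = (waitcnt >= 1);
}

/* Lock holder releases the lock (the kick itself is not modelled). */
static void toy_unlock(struct toy_lock *l)
{
	assert(l->locked != Q_UNLOCKED);
	l->locked = Q_UNLOCKED;
}

/* Queue head tries to take the lock after being kicked. */
static bool toy_head_trylock(struct toy_lock *l)
{
	if (l->locked != Q_UNLOCKED)
		return false;
	l->locked = Q_LOCKED_VAL;
	l->pending = false;
	return true;
}
```

In this model the first wait still leaves a window in which a stealer can
grab the lock between the unlock and the queue head's trylock. From the
second wait on, the pending bit closes that window, so the queue head gets
the lock on the next release.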