From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755475AbbIBQLL (ORCPT ); Wed, 2 Sep 2015 12:11:11 -0400
Received: from mail-db3on0075.outbound.protection.outlook.com ([157.55.234.75]:29178
	"EHLO emea01-db3-obe.outbound.protection.outlook.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1754771AbbIBQLJ (ORCPT ); Wed, 2 Sep 2015 12:11:09 -0400
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=cmetcalf@ezchip.com;
Subject: Re: futex atomic vs ordering constraints
To: Peter Zijlstra , Thomas Gleixner
References: <20150826181659.GW16853@twins.programming.kicks-ass.net>
	<20150901163140.GK1612@arm.com>
	<20150901164247.GO16853@twins.programming.kicks-ass.net>
	<20150902125555.GT16853@twins.programming.kicks-ass.net>
CC: Will Deacon , Linus Torvalds , Oleg Nesterov , Paul McKenney ,
	Ingo Molnar , "mtk.manpages@gmail.com" , "dvhart@infradead.org" ,
	"dave@stgolabs.net" , "Vineet.Gupta1@synopsys.com" ,
	"ralf@linux-mips.org" , "ddaney@caviumnetworks.com" ,
	"linux-kernel@vger.kernel.org" , ,
From: Chris Metcalf
Message-ID: <55E71F92.4000001@ezchip.com>
Date: Wed, 2 Sep 2015 12:10:58 -0400
User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <20150902125555.GT16853@twins.programming.kicks-ass.net>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [12.216.194.146]
X-OriginatorOrg: ezchip.com
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/02/2015 08:55 AM, Peter Zijlstra wrote:
> So here goes..
>
> Chris, I'm awfully sorry, but I seem to be Tile challenged.
>
> TileGX seems to define:
>
> #define smp_mb__before_atomic()	smp_mb()
> #define smp_mb__after_atomic()	smp_mb()
>
> However, its atomic_add_return() implementation looks like:
>
> static inline int atomic_add_return(int i, atomic_t *v)
> {
> 	int val;
> 	smp_mb();  /* barrier for proper semantics */
> 	val = __insn_fetchadd4((void *)&v->counter, i) + i;
> 	barrier();  /* the "+ i" above will wait on memory */
> 	return val;
> }
>
> Which leaves me confused on smp_mb__after_atomic().

Are you concerned about whether it has proper memory barrier
semantics already, i.e. full barriers before and after? In fact we do
have a full barrier before, but then because of the "+ i" / "barrier()",
we know that the only other operation since the previous mb(), namely
the read of v->counter, has completed after the atomic operation. As a
result we can omit explicitly having a second barrier.

It does seem like all the current memory-order semantics are correct,
unless I'm missing something!

> That said, your futex ops seem to lack any memory barrier, so naively
> I'd add both, its just that your add_return() confuses me.

So something like this?

diff --git a/arch/tile/include/asm/futex.h b/arch/tile/include/asm/futex.h
index 1a6ef1b69cb1..0a5501b11d02 100644
--- a/arch/tile/include/asm/futex.h
+++ b/arch/tile/include/asm/futex.h
@@ -39,6 +39,7 @@
 #ifdef __tilegx__
 
 #define __futex_asm(OP) \
+	smp_mb();						\
 	asm("1: {" #OP " %1, %3, %4; movei %0, 0 }\n"		\
 	    ".pushsection .fixup,\"ax\"\n"			\
 	    "0: { movei %0, %5; j 9f }\n"			\
@@ -48,7 +49,8 @@
 	    ".popsection\n"					\
 	    "9:"						\
 	    : "=r" (ret), "=r" (val), "+m" (*(uaddr))		\
-	    : "r" (uaddr), "r" (oparg), "i" (-EFAULT))
+	    : "r" (uaddr), "r" (oparg), "i" (-EFAULT));		\
+	smp_mb()
 
 #define __futex_set() __futex_asm(exch4)
 #define __futex_add() __futex_asm(fetchadd4)
@@ -75,7 +77,10 @@
 
 #define __futex_call(FN)						\
 	{								\
-		struct __get_user gu = FN((u32 __force *)uaddr, lock, oparg); \
+		struct __get_user gu;					\
+		smp_mb();						\
+		gu = FN((u32 __force *)uaddr, lock, oparg);		\
+		/* See smp_mb__after_atomic() */			\
 		val = gu.val;						\
 		ret = gu.err;						\
 	}

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
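
For readers without a tile tree handy, the barrier placement discussed
above can be sketched in portable userspace C11. This is only a rough
structural model, not the tilegx code: the model_* names are
hypothetical, atomic_thread_fence() stands in for smp_mb(),
atomic_signal_fence() stands in for the compiler-only barrier(), and the
C11 memory model does not capture the tilegx property that consuming the
fetchadd result ("+ i") waits on memory.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's smp_mb(): a full fence. */
static inline void model_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Mirrors the shape of the quoted atomic_add_return(): a full fence
 * before the operation, then a relaxed fetch-and-add whose result is
 * consumed, then a compiler-only barrier. On tilegx the argument is
 * that consuming the result makes a trailing hardware fence
 * unnecessary; that guarantee is NOT provided by C11 here.
 */
static inline int model_add_return(atomic_int *v, int i)
{
	int val;

	model_smp_mb();
	val = atomic_fetch_add_explicit(v, i, memory_order_relaxed) + i;
	atomic_signal_fence(memory_order_seq_cst);	/* compiler barrier only */
	return val;
}

/*
 * Mirrors the proposed futex change: the atomic op is bracketed by a
 * full fence on both sides. Returns the old value, like fetchadd4.
 */
static inline int model_futex_add(atomic_int *uaddr, int oparg)
{
	int old;

	model_smp_mb();
	old = atomic_fetch_add_explicit(uaddr, oparg, memory_order_relaxed);
	model_smp_mb();
	return old;
}

/* Small single-threaded check of the return-value conventions. */
static inline void model_selfcheck(void)
{
	atomic_int word;

	atomic_init(&word, 5);
	assert(model_add_return(&word, 3) == 8);	/* returns the new value */
	assert(model_futex_add(&word, 2) == 8);		/* returns the old value */
	assert(atomic_load(&word) == 10);
}
```

Calling model_selfcheck() from a test program only exercises the
return-value conventions on one thread; it says nothing about ordering,
which would need a litmus-style test on real hardware.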