From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752783AbeDMWrX (ORCPT ); Fri, 13 Apr 2018 18:47:23 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:33562 "EHLO
	mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752666AbeDMWrV (ORCPT ); Fri, 13 Apr 2018 18:47:21 -0400
Date: Fri, 13 Apr 2018 15:48:17 -0700
From: "Paul E. McKenney" 
To: Dan Williams 
Cc: Jan Kara , linux-nvdimm , Jeff Moyer , Dave Chinner , Matthew Wilcox ,
	Alexander Viro , "Darrick J. Wong" , Ross Zwisler , Dave Hansen ,
	Andrew Morton , Christoph Hellwig , linux-fsdevel , linux-xfs ,
	Linux Kernel Mailing List , Mike Snitzer , Josh Triplett 
Subject: Re: [PATCH v8 15/18] mm, fs, dax: handle layout changes to pinned dax mappings
Reply-To: paulmck@linux.vnet.ibm.com
References: <152246892890.36038.18436540150980653229.stgit@dwillia2-desk3.amr.corp.intel.com>
	<152246901060.36038.4487158506830998280.stgit@dwillia2-desk3.amr.corp.intel.com>
	<20180404094656.dssixqvvdcp5jff2@quack2.suse.cz>
	<20180409164944.6u7i4wgbp6yihvin@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-GCONF: 00
x-cbid: 18041322-0052-0000-0000-000002DABBFA
X-IBM-SpamModules-Scores: 
X-IBM-SpamModules-Versions: BY=3.00008852; HX=3.00000241; KW=3.00000007;
	PH=3.00000004; SC=3.00000257; SDB=6.01017497; UDB=6.00518944;
	IPR=6.00796734; MB=3.00020565; MTD=3.00000008; XFM=3.00000015;
	UTC=2018-04-13 22:47:17
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18041322-0053-0000-0000-00005C527DEF
Message-Id: <20180413224817.GK3948@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,,
	definitions=2018-04-13_12:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0
	bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0
	impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx
	scancount=1 engine=8.0.1-1709140000 definitions=main-1804130210
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 13, 2018 at 03:03:51PM -0700, Dan Williams wrote:
> On Mon, Apr 9, 2018 at 9:51 AM, Dan Williams wrote:
> > On Mon, Apr 9, 2018 at 9:49 AM, Jan Kara wrote:
> >> On Sat 07-04-18 12:38:24, Dan Williams wrote:
> > [..]
> >>> I wonder if this can be trivially solved by using srcu. I.e. we don't
> >>> need to wait for a global quiescent state, just a
> >>> get_user_pages_fast() quiescent state. ...or is that an abuse of the
> >>> srcu api?
> >>
> >> Well, I'd rather use the percpu rwsemaphore (linux/percpu-rwsem.h) than
> >> SRCU. It is a more-or-less standard locking mechanism rather than relying
> >> on implementation properties of SRCU which is a data structure protection
> >> method. And the overhead of percpu rwsemaphore for your use case should be
> >> about the same as that of SRCU.
> >
> > I was just about to ask that. Yes, it seems they would share similar
> > properties and it would be better to use the explicit implementation
> > rather than a side effect of srcu.
>
> ...unfortunately:
>
>  BUG: sleeping function called from invalid context at
> ./include/linux/percpu-rwsem.h:34
>  [..]
>  Call Trace:
>   dump_stack+0x85/0xcb
>   ___might_sleep+0x15b/0x240
>   dax_layout_lock+0x18/0x80
>   get_user_pages_fast+0xf8/0x140
>
> ...and thinking about it more srcu is a better fit. We don't need the
> 100% exclusion provided by an rwsem we only need the guarantee that
> all cpus that might have been running get_user_pages_fast() have
> finished it at least once.
>
> In my tests synchronize_srcu is a bit slower than unpatched for the
> trivial 100 truncate test, but certainly not the 200x latency you were
> seeing with synchronize_rcu.
>
> Elapsed time:
> 0.006149178 unpatched
> 0.009426360 srcu

You might want to try synchronize_srcu_expedited(). Unlike plain RCU,
it does not send IPIs, so should be less controversial. And it might
well more than make up the performance difference you are seeing above.

							Thanx, Paul
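
The guarantee discussed in the thread (wait only for readers that were
already running when the update began, not for ones that start later) can
be modelled in userspace. What follows is a minimal single-threaded sketch
under stated assumptions: every toy_srcu name is hypothetical and invented
for illustration, none of it is the kernel's SRCU implementation, and
grace periods are assumed to run one at a time. The kernel interfaces it
loosely mirrors are srcu_read_lock(), srcu_read_unlock() and
synchronize_srcu() / synchronize_srcu_expedited().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, no atomics, no threads. */
struct toy_srcu {
	int active;       /* index (0 or 1) that new readers register under */
	long readers[2];  /* readers currently inside a section, per index  */
};

/* Reader entry: register under the current index and remember it. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = sp->active;

	sp->readers[idx]++;
	return idx;
}

/* Reader exit: drop the registration taken at entry. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(sp->readers[idx] > 0);
	sp->readers[idx]--;
}

/*
 * Updater, step 1: flip the index so that readers arriving from now on
 * are counted separately. Returns the old index, which holds exactly
 * the pre-existing readers. Assumes the previous grace period finished.
 */
static int toy_srcu_start_gp(struct toy_srcu *sp)
{
	int old = sp->active;

	sp->active = !old;
	return old;
}

/* Updater, step 2: the grace period is over once old readers drained. */
static bool toy_srcu_gp_done(const struct toy_srcu *sp, int old)
{
	return sp->readers[old] == 0;
}

/* Walk-through: reader A predates the update, reader B arrives late. */
static int toy_demo(void)
{
	struct toy_srcu s = {0};
	int a = toy_srcu_read_lock(&s);           /* reader A enters        */
	int old = toy_srcu_start_gp(&s);          /* updater starts waiting */
	int b = toy_srcu_read_lock(&s);           /* reader B enters late   */
	bool blocked = !toy_srcu_gp_done(&s, old); /* A is still inside     */

	toy_srcu_read_unlock(&s, a);              /* reader A leaves        */
	bool done = toy_srcu_gp_done(&s, old);    /* B does not delay it    */

	toy_srcu_read_unlock(&s, b);
	return blocked && done;
}
```

In this model the updater polls toy_srcu_gp_done(); real kernel code would
instead block in synchronize_srcu() or synchronize_srcu_expedited() after
making its change, with the get_user_pages_fast() path wrapped in the
read-side calls.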