Date: Wed, 15 Jan 2020 07:56:14 +0100
From: Christoph Hellwig
To: Jason Gunthorpe
Cc: Christoph Hellwig, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Will Deacon, Andrew Morton, linux-ext4@vger.kernel.org, cluster-devel@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: RFC: hold i_rwsem until aio completes
Message-ID: <20200115065614.GC21219@lst.de>
References: <20200114161225.309792-1-hch@lst.de> <20200114192700.GC22037@ziepe.ca>
In-Reply-To: <20200114192700.GC22037@ziepe.ca>

On Tue, Jan 14, 2020 at 03:27:00PM -0400, Jason Gunthorpe wrote:
> I've seen similar locking patterns quite a lot, enough I've thought
> about having a dedicated locking primitive to do it. It really wants
> to be a rwsem, but as here the rwsem rules don't allow it.
>
> The common pattern I'm looking at looks something like this:
>
>  'try begin read'() // aka down_read_trylock()
>
>   /* The lockdep release hackery you describe,
>      the rwsem remains read locked */
>  'exit reader'()
>
>  .. delegate unlock to work queue, timer, irq, etc ..
>
> in the new context:
>
>  're_enter reader'() // Get our lockdep tracking back
>
>  'end reader'() // aka up_read()
>
> vs a typical write side:
>
>  'begin write'() // aka down_write()
>
>  /* There is no reason to unlock it before kfree of the rwsem memory.
>     Somehow the user prevents any new down_read_trylock()'s */
>  'abandon writer'() // The object will be kfree'd with a locked writer
>  kfree()
>
> The typical goal is to provide an object destruction path that can
> serialize and fence all readers wherever they may be before proceeding
> to some synchronous destruction.
>
> Usually this gets open coded with some atomic/kref/refcount and a
> completion or wait queue. Often implemented wrongly, lacking the write
> favoring bias in the rwsem, and lacking any lockdep tracking on the
> naked completion.
>
> Not to discourage your patch, but to ask if we can make the solution
> more broadly applicable?

Your requirement seems a little different, and in fact in many ways
similar to the percpu_ref primitive.
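
To make the comparison concrete, here is a minimal userspace sketch of the
open-coded "refcount plus completion" pattern mentioned in the quoted text.
It is only an illustration under stated assumptions: C11 atomics and a
pthread condition variable stand in for the kernel's refcount and completion
types, and every name in it (struct obj, obj_init, obj_tryget, obj_put,
obj_kill_and_wait) is hypothetical rather than an existing kernel or
percpu_ref interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object guarded by a reference count plus a "completion". */
struct obj {
	atomic_int refs;          /* 1 initial ref held for the destroyer, +1 per reader */
	pthread_mutex_t lock;     /* protects 'done' */
	pthread_cond_t drained;   /* signalled when the last reference is dropped */
	bool done;
};

static void obj_init(struct obj *o)
{
	atomic_init(&o->refs, 1);
	pthread_mutex_init(&o->lock, NULL);
	pthread_cond_init(&o->drained, NULL);
	o->done = false;
}

/* Reader entry: behaves like a trylock and fails once the count reached zero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Reader exit: may run in a different thread than the matching tryget. */
static void obj_put(struct obj *o)
{
	int prev = atomic_fetch_sub(&o->refs, 1);

	assert(prev > 0);         /* catch unbalanced puts */
	if (prev == 1) {
		pthread_mutex_lock(&o->lock);
		o->done = true;
		pthread_cond_signal(&o->drained);
		pthread_mutex_unlock(&o->lock);
	}
}

/* Destruction path: drop the initial ref, then wait for all readers. */
static void obj_kill_and_wait(struct obj *o)
{
	obj_put(o);
	pthread_mutex_lock(&o->lock);
	while (!o->done)
		pthread_cond_wait(&o->drained, &o->lock);
	pthread_mutex_unlock(&o->lock);
}
```

Note that in this sketch new readers are only refused once the count has
actually reached zero, so a steady stream of readers can delay destruction.
That is the missing write-favoring bias pointed out in the quoted text, and
the sketch carries no lockdep-style tracking either.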