From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1593DC11D32 for ; Mon, 24 Feb 2020 17:56:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E058E20836 for ; Mon, 24 Feb 2020 17:56:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727775AbgBXR4G (ORCPT ); Mon, 24 Feb 2020 12:56:06 -0500 Received: from verein.lst.de ([213.95.11.211]:39459 "EHLO verein.lst.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727438AbgBXR4G (ORCPT ); Mon, 24 Feb 2020 12:56:06 -0500 Received: by verein.lst.de (Postfix, from userid 2407) id BF32268B20; Mon, 24 Feb 2020 18:56:03 +0100 (CET) Date: Mon, 24 Feb 2020 18:56:03 +0100 From: Christoph Hellwig To: Dave Chinner Cc: Christoph Hellwig , ira.weiny@intel.com, linux-kernel@vger.kernel.org, Alexander Viro , "Darrick J. Wong" , Dan Williams , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH V4 07/13] fs: Add locking for a dynamic address space operations state Message-ID: <20200224175603.GE7771@lst.de> References: <20200221004134.30599-1-ira.weiny@intel.com> <20200221004134.30599-8-ira.weiny@intel.com> <20200221174449.GB11378@lst.de> <20200221224419.GW10776@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20200221224419.GW10776@dread.disaster.area> User-Agent: Mutt/1.5.17 (2007-11-01) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Sat, Feb 22, 2020 at 09:44:19AM +1100, Dave Chinner wrote: > > I am very much against this. There is a reason why we don't support > > changes of ops vectors at runtime anywhere else, because it is > > horribly complicated and impossible to get right. IFF we ever want > > to change the DAX vs non-DAX mode (which I'm still not sold on) the > > right way is to just add a few simple conditionals and merge the > > aops, which is much easier to reason about, and less costly in overall > > overhead. > > *cough* > > That's exactly what the original code did. And it was broken > because page faults call multiple aops that are dependent on the > result of the previous aop calls setting up the state correctly for > the latter calls. And when S_DAX changes between those calls, shit > breaks. No, the original code was broken because it didn't serialize between DAX and buffer access. Take a step back and look where the problems are, and they are not mostly with the aops. In fact the only aop useful for DAX is is ->writepages, and how it uses ->writepages is actually a bit of an abuse of that interface. So what we really need it just a way to prevent switching the flag when a file is mapped, and in the normal read/write path ensure the flag can't be switch while I/O is going on, which could either be done by ensuring it is only switched under i_rwsem or equivalent protection, or by setting the DAX flag once in the iocb similar to IOCB_DIRECT. And they easiest way to get all this done is as a first step to just allowing switching the flag when no blocks are allocated at all, similar to how the rt flag works. Once that works properly and use cases show up we can relax the requirements, and we need to do that in a way without bloating the inode and various VFS code paths, as DAX is a complete fringe feature, and ѕwitching the flag and runtime is the fringe of a fringe. It just isn't worth bloating the inode and wasting tons of developer time on it.