Date: Tue, 31 Oct 2023 16:12:50 -0700
From: "Darrick J. Wong"
To: Amir Goldstein
Cc: Dave Chinner, Linus Torvalds, Jeff Layton, Kent Overstreet,
    Christian Brauner, Alexander Viro, John Stultz, Thomas Gleixner,
    Stephen Boyd, Chandan Babu R, Theodore Ts'o, Andreas Dilger,
    Chris Mason, Josef Bacik, David Sterba, Hugh Dickins, Andrew Morton,
    Jan Kara, David Howells, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
    linux-mm@kvack.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing
Message-ID: <20231031231250.GA1205221@frogsfrogsfrogs>
References: <2ef9ac6180e47bc9cc8edef20648a000367c4ed2.camel@kernel.org> <6df5ea54463526a3d898ed2bd8a005166caa9381.camel@kernel.org>

On Tue, Oct 31, 2023 at 09:03:57AM +0200, Amir Goldstein wrote:
> On Tue, Oct 31, 2023 at 3:42 AM Dave Chinner wrote:
> >
> [...]
> > .... and what is annoying is that the new i_version is just a
> > glorified ctime change counter. What we should be fixing is ctime -
> > integrating this change counting into ctime would allow us to make
> > i_version go away entirely. i.e. We don't need a persistent ctime
> > change counter if the ctime has sufficient resolution or persistent
> > encoding that it does not need an external persistent change
> > counter.
> >
> > That was the reasoning behind the multi-grain timestamps. While the mgts
> > implementation was flawed, the reasoning behind it certainly isn't.
> > We should be trying to get rid of i_version by integrating it into
> > ctime updates, not arguing how atime vs i_version should work.
> >
> > > So I don't think the issue here is "i_version" per se. I think in a
> > > vacuum, the best option of i_version is pretty obvious. But if you
> > > want i_version to track di_changecount, *then* you end up with that
> > > situation where the persistence of atime matters, and i_version needs
> > > to update whenever a (persistent) atime update happens.
> >
> > Yet I don't want i_version to track di_changecount.
> >
> > I want to *stop supporting i_version altogether* in XFS.
> >
> > I want i_version as filesystem internal metadata to die completely.
> >
> > I don't want to change the on disk format to add a new i_version
> > field because we'll be straight back in this same situation when the
> > next i_version bug is found and semantics get changed yet again.
> >
> > Hence if we can encode the necessary change attributes into ctime,
> > we can drop VFS i_version support altogether. Then the "atime bumps
> > i_version" problem also goes away because then we *don't use
> > i_version*.
> >
> > But if we can't get the VFS to do this with ctime, at least we have
> > the abstractions available to us (i.e. timestamp granularity and
> > statx change cookie) to allow XFS to implement this sort of
> > ctime-with-integrated-change-counter internally to the filesystem
> > and be able to drop i_version support....
> >
>
> I don't know if it was mentioned before in one of the many threads,
> but there is another benefit of ctime-with-integrated-change-counter
> approach - it is the ability to extend the solution with some adaptations
> also to mtime.
>
> The "change cookie" is used to know if inode metadata cache should
> be invalidated and mtime is often used to know if data cache should
> be invalidated, or if data comparison could be skipped (e.g. rsync).
>
> The difference is that mtime can be set by user, so using lower nsec
> bits for modification counter would require to truncate the user set
> time granularity to 100ns - that is probably acceptable, but only as
> an opt-in behavior.
>
> The special value 0 for mtime-change-counter could be reserved for
> mtime that was set by the user or for upgrade of existing inode,
> where 0 counter means that mtime cannot be trusted as an accurate
> data modification-cookie.

What about write faults on an mmap region? The first ro->rw transition
results in an mtime update, but not again until the page gets cleaned.

> This feature is going to be useful for the vfs HSM implementation [1]
> that I am working on and it actually rhymes with the XFS DMAPI
> patches that were never fully merged upstream.

Kudos, I cannot figure out a non-pejorative word that rhymes with
"**API". ;)

--D

> Speaking on behalf of my employer, we would love to see the data
> modification-cookie feature implemented, whether in vfs or in xfs.
>
> *IF* the result on this thread is that the chosen solution is
> ctime-with-change-counter in XFS
> *AND* if there is agreement among XFS developers to extend it with
> an opt-in mkfs/mount option to 100ns-mtime-with-change-counter in XFS
> *THEN* I think I will be able to allocate resources to drive this xfs work.
>
> Thanks,
> Amir.
>
> [1] https://github.com/amir73il/fsnotify-utils/wiki/Hierarchical-Storage-Management-API
>
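
For readers following the 100ns-mtime-with-change-counter idea, here is a
minimal userspace sketch of one way the encoding could work. It is not XFS
or VFS code and nothing in the thread specifies the layout. Everything
below is an assumption made for illustration: the counter lives in the two
decimal digits below 100ns (values 0..99), the names encode(), decode(),
set_by_user(), is_trusted() and bump() are hypothetical, and raising an
error when the counter saturates is a placeholder for whatever a real
implementation would do.

```python
NSEC_PER_SEC = 1_000_000_000
GRANULARITY_NS = 100   # assumed user-visible mtime granularity (100ns)
UNTRUSTED = 0          # counter value reserved for user-set or upgraded mtimes


def encode(sec, nsec, counter):
    """Truncate nsec to 100ns and store the counter in the digits below it."""
    if not 0 <= nsec < NSEC_PER_SEC:
        raise ValueError("nsec out of range")
    if not 0 <= counter < GRANULARITY_NS:
        raise ValueError("counter out of range")
    return sec, (nsec // GRANULARITY_NS) * GRANULARITY_NS + counter


def decode(sec, nsec):
    """Split a stored timestamp into (sec, truncated nsec, counter)."""
    return sec, (nsec // GRANULARITY_NS) * GRANULARITY_NS, nsec % GRANULARITY_NS


def set_by_user(sec, nsec):
    """A user-supplied mtime always gets the reserved counter value 0."""
    return encode(sec, nsec, UNTRUSTED)


def is_trusted(sec, nsec):
    """True if the stamp carries a nonzero, filesystem-assigned counter."""
    return nsec % GRANULARITY_NS != UNTRUSTED


def bump(prev, now):
    """Return a new stamp for a data change, guaranteed to differ from prev.

    prev is the stored (sec, nsec) pair, now is the current clock reading.
    Assumes the clock never goes backwards.
    """
    psec, pnsec, pcount = decode(*prev)
    now_sec, now_nsec, _ = decode(*now)
    if (now_sec, now_nsec) != (psec, pnsec):
        # The clock moved to a new 100ns slot: restart the counter at 1.
        return encode(now_sec, now_nsec, 1)
    if pcount + 1 >= GRANULARITY_NS:
        # Placeholder policy: a real design would need a saturation rule.
        raise OverflowError("change counter exhausted within one 100ns slot")
    return encode(psec, pnsec, pcount + 1)


if __name__ == "__main__":
    stamp = set_by_user(10, 123_456_789)
    print(stamp, is_trusted(*stamp))
    stamp = bump(stamp, (10, 123_456_789))
    print(stamp, is_trusted(*stamp))
```

Note that the sketch only counts changes that reach bump(). It does nothing
for the mmap case raised above, where stores through an already-writable
mapping do not trigger another mtime update until the page is cleaned.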