Message-ID: <6e6da8a875a0defec1a0f58314995a6a12dca74e.camel@kernel.org>
Subject: Re: [PATCH v7 12/13] ext4: switch to multigrain timestamps
From: Jeff Layton
To: Paul Eggert, Bruno Haible, Jan Kara, Xi Ruoyao, bug-gnulib@gnu.org
Cc: Alexander Viro, Christian Brauner, Eric Van Hensbergen,
 Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
 David Howells, Marc Dionne, Chris Mason, Josef Bacik, David Sterba,
 Xiubo Li, Ilya Dryomov, Jan Harkes, coda@cs.cmu.edu, Tyler Hicks,
 Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu, Namjae Jeon, Sungjong Seo,
 Jan Kara, Theodore Ts'o, Andreas Dilger, Jaegeuk Kim, OGAWA Hirofumi,
 Miklos Szeredi, Bob Peterson, Andreas Gruenbacher, Greg Kroah-Hartman,
 Tejun Heo, Trond Myklebust, Anna Schumaker, Konstantin Komarov,
 Mark Fasheh, Joel Becker, Joseph Qi, Mike Marshall, Martin Brandenburg,
 Luis Chamberlain, Kees Cook, Iurii Zaikin, Steve French,
 Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
 Sergey Senozhatsky, Richard Weinberger, Hans de Goede, Hugh Dickins,
 Andrew Morton, Amir Goldstein, "Darrick J. Wong", Benjamin Coddington,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 v9fs@lists.linux.dev, linux-afs@lists.infradead.org,
 linux-btrfs@vger.kernel.org, ceph-devel@vger.kernel.org,
 codalist@coda.cs.cmu.edu, ecryptfs@vger.kernel.org,
 linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
 linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
 linux-nfs@vger.kernel.org, ntfs3@lists.linux.dev,
 ocfs2-devel@lists.linux.dev, devel@lists.orangefs.org,
 linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
 linux-mtd@lists.infradead.org, linux-mm@kvack.org,
 linux-unionfs@vger.kernel.org, linux-xfs@vger.kernel.org
Date: Tue, 19 Sep 2023 16:46:25 -0400

On Tue, 2023-09-19 at 13:10 -0700, Paul Eggert wrote:
> On 2023-09-19 09:31, Jeff Layton wrote:
> > The typical case for make timestamp comparisons is comparing source
> > files vs. a build target.
> > If those are being written nearly simultaneously, then that could be an
> > issue, but is that a typical behavior?
>
> I vaguely remember running into problems with 'make' a while ago
> (perhaps with a BSDish system) when filesystem timestamps were
> arbitrarily truncated in some cases but not others. These files would
> look older than they really were, so 'make' would think they were
> up-to-date when they weren't, and 'make' would omit actions that it
> should have done, thus screwing up the build.
>
> File timestamps can be close together with 'make -j' on fast hosts.
> Sometimes a shell script (or 'make' itself) will run 'make', then modify
> a file F, then immediately run 'make' again; the latter 'make' won't
> work if F's timestamp is mistakenly older than targets that depend on it.
>
> Although 'make'-like apps are the biggest canaries in this coal mine,
> the issue also affects 'find -newer' (as Bruno mentioned), 'rsync -u',
> 'mv -u', 'tar -u', Emacs file-newer-than-file-p, and surely many other
> places. For example, any app that creates a timestamp file, then backs
> up all files newer than that file, would be at risk.
>
>
> > I wonder if it would be feasible to just advance the coarse-grained
> > current_time whenever we end up updating a ctime with a fine-grained
> > timestamp?
>
> Wouldn't this need to be done globally, that is, not just on a per-file
> or per-filesystem basis? If so, I don't see how we'd avoid locking
> performance issues.
>

Maybe. Another idea might be to introduce a new timekeeper for
multigrain filesystems, but all of those would likely have to share the
same coarse-grained clock source. So yeah, if you stat an inode and then
update it, any inode written on a multigrain filesystem within the same
jiffy-sized window would have to log an extra transaction to write out
the inode. That's what I meant when I was talking about write
amplification.

>
> PS.
> Although I'm no expert in the Linux inode code I hope you don't mind
> my asking a question about this part of inode_set_ctime_current:
>
>     /*
>      * If we've recently updated with a fine-grained timestamp,
>      * then the coarse-grained one may still be earlier than the
>      * existing ctime. Just keep the existing value if so.
>      */
>     ctime.tv_sec = inode->__i_ctime.tv_sec;
>     if (timespec64_compare(&ctime, &now) > 0)
>         return ctime;
>
> Suppose root used clock_settime to set the clock backwards. Won't this
> code incorrectly refuse to update the file's timestamp afterwards? That
> is, shouldn't the last line be "goto fine_grained;" rather than "return
> ctime;", with the comment changed from "keep the existing value" to "use
> a fine-grained value"?

It is a problem, and Linus pointed that out yesterday, which is why I
sent this earlier today:

https://lore.kernel.org/linux-fsdevel/20230919-ctime-v1-1-97b3da92f504@kernel.org/T/#u

Bear in mind that we're dealing with a situation where the value has
not been queried since its last update, so we don't need to use a
fine-grained timestamp there (and really, it's preferable not to do so).
A coarse one should be fine in this case.
--
Jeff Layton