From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ot1-f43.google.com (mail-ot1-f43.google.com [209.85.210.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A60337883C for ; Tue, 31 Mar 2026 18:47:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.43 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774982838; cv=none; b=X5b5g2hL1wVXOrFCUwbfV8WuSltBvfqTmx893n7hcm8PDm/bUibXNKcWbMgsFoBoFM0cPTL1gHhrrBddrUeVx/D8UmC79FXEUQaNPX/KfRB3ZMRzOM/lxMqo1y118Y7a43GNBxRp91foNGrKra6VOgb5U1Uto2mK1BmVrf0aSsc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774982838; c=relaxed/simple; bh=klca8zZnNvUVLhlOO2LRAQgkuJB8sDzaNxfUFUDM+aY=; h=From:Date:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=E9oc4rOAtPKEGJf7Vj4rl4RK6y/E+3tsSsJJKenG8xrHDewoKSnkcjB1XpkC57Wzo8hG/IdyyvBszUtXeUGPz95JIja8nm6Q01Q8cPatl1OLRHmTzRxG0BXa26M6hcngZMHHHnANpsyOzCGNVap9y6QyPk9kihxmxk1h2BkUcaE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=YAh4gGvr; arc=none smtp.client-ip=209.85.210.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YAh4gGvr" Received: by mail-ot1-f43.google.com with SMTP id 46e09a7af769-7d75ed779bfso6139086a34.2 for ; Tue, 31 Mar 2026 11:47:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774982836; x=1775587636; darn=vger.kernel.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:date:from:from:to :cc:subject:date:message-id:reply-to; bh=bV9BROuCXz7vmDIjNMuDEUixDsIrGdMikx5FHekafU0=; b=YAh4gGvrqWdqXOBCPmSS/99BQIp2XBmuRSwP2ER1MtXFrDrnlA5f2HX5BAmfHrhtAv 1Fy8LY/2nfwyYlRPuyd/Pa+Q6Z1i/HRs1XJgJB+/x1RLow4MAkPTgsxIW0rAPANMuFE4 W1YgipGF0wkxtHpiG8cLHM80BySoBw/faRpylzjq0r69z9bQIhblavxtdxDsEntyF3bx 9FrzOEF5kb2sxhDMbqT0I3YHEyhA8ZEHjgMkHiHlMtyYlRZSmkLNB9iXVbRhZI6P1bTQ VFdLcr7JRpYwrFQAhp87zYemzN36YhgGfl4fRpcZYV9h8jV/cylnxdiWFPEQ/UBO4Auh 6c0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774982836; x=1775587636; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:date:from:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=bV9BROuCXz7vmDIjNMuDEUixDsIrGdMikx5FHekafU0=; b=YqH/wwU3ybzAex6lFvyo+vcxODpyH+teF9BezyrBX/vqB74tiJbdJuV+3W0P061OzT WfIEGpdG5v5Gm+bRnPInxrJKA1HIWL+rEP6D4/3EiN7yMUL9LdRqtc2U4LNdKcemIUQD 7AyxF1kuV2yk7gRvxBCQbR71xXXocH3aKiLJ+9yfIMb0L0BPgJIMZ2JJ/X7qOqGgtTKz sjYe4XVLqKD3XMAsU17ACzTOksOFg+ytDA32Hb/R7bZ6LknE3mPpfh2HeyKNHxOK9Ftm a9XPAUlVdPRVhS06o+PpHuPFmf75NSiIXyMUFPK55H5aqXyPykFJQG5noe6/c8QvtQLI ixVQ== X-Forwarded-Encrypted: i=1; AJvYcCU+1nalEHvS0MHGh0LelRD/er9tUXuEnK7frNSxKjPwUUkYgnCfqjK8xFzWeMuvNQmRppVYysnNQeKbuRw=@vger.kernel.org X-Gm-Message-State: AOJu0YwOXwBGTdHqiLRLQf+Ki2koi7+StPJ7y6ZIIGEBiuv7w/XIW20Y 58AyzNbr/UZTWxJP3p40v8O1LvyZSI14XStjhP5KC7Ob4MeTavhMvteW X-Gm-Gg: ATEYQzxjTwmnmNvQN63aCI0yBlqIPd8LksXG086BooaCYVCCfuQRzLDIAj8FA82qhvG WsV5IM1NLwrStexU4ltshfD9eUvwn8f4ZJOrlUtjMC+1ffXonORKFGZ8x1O4bAKD1Y1tsgsYI/L ODH5yqL+05nS7o9/wru2Vgx1FBLPntj6k59v+A7MJkht7+UH10bboabcgX5etHsKrmve9TrQmPg Lk0mmNp/AS8GtWT5Te693g/uwIuTpMSSs16bsfFceAR8nqME2Z53+XfIqKLxdre9wWofyXemxAK wlaA9/nAbhHGrrN2s1wBuy852r+ewUsDweVBvvtESg8zYrxThDrAk1cn33AOgxK+LikKtAmq8md dtI8TDrgvCgplBzliNpzR9BqlmtmHMY6rX1+q3WjNjFs4lI3X01weeCfMSQOo6A+YCiQxngyavW ydRNCWTK0d2H+nK2W9k/DH7psZySImkyo= X-Received: by 2002:a05:6830:82b8:b0:7d7:d1e1:6987 with SMTP id 46e09a7af769-7db993469admr484012a34.21.1774982835911; Tue, 31 Mar 2026 11:47:15 -0700 (PDT) Received: from localhost ([12.117.181.174]) by smtp.gmail.com with ESMTPSA id 46e09a7af769-7da0a335421sm8591876a34.3.2026.03.31.11.47.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Mar 2026 11:47:15 -0700 (PDT) From: Thomas Haynes X-Google-Original-From: Thomas Haynes Date: Tue, 31 Mar 2026 11:47:14 -0700 To: Jeff Layton Cc: Thomas Haynes , Olga Kornievskaia , Trond Myklebust , Anna Schumaker , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 2/2] nfs: update inode ctime after removexattr operation Message-ID: References: <20260324-nfs-7-1-v2-0-d110da3c0036@kernel.org> <20260324-nfs-7-1-v2-2-d110da3c0036@kernel.org> <284ca17e74af8c4f5942b2952f2bf75490dd17c0.camel@kernel.org> <80b423c66dba84b46be1084307d2c66b935065bc.camel@kernel.org> <13f1fd90b75c73e8d5220dadb6eb9d9473bc96e8.camel@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <13f1fd90b75c73e8d5220dadb6eb9d9473bc96e8.camel@kernel.org> On Tue, Mar 31, 2026 at 07:42:56AM -0800, Jeff Layton wrote: > On Fri, 2026-03-27 at 11:22 -0700, Thomas Haynes wrote: > > On Fri, Mar 27, 2026 at 12:59:54PM -0800, Jeff Layton wrote: > > > On Fri, 2026-03-27 at 12:20 -0400, Olga Kornievskaia wrote: > > > > On Fri, Mar 27, 2026 at 11:50 AM Jeff Layton wrote: > > > > > > > > > > On Fri, 2026-03-27 at 11:11 -0400, Olga Kornievskaia wrote: > > > > > > On Tue, Mar 24, 2026 at 1:32 PM Jeff Layton wrote: > > > > > > > > > > > > > > xfstest generic/728 fails with delegated timestamps. The client does a > > > > > > > removexattr and then a stat to test the ctime, which doesn't change. The > > > > > > > stat() doesn't trigger a GETATTR because of the delegated timestamps, so > > > > > > > it relies on the cached ctime, which is wrong. > > > > > > > > > > > > > > The setxattr compound has a trailing GETATTR, which ensures that its > > > > > > > ctime gets updated. Follow the same strategy with removexattr. > > > > > > > > > > > > This approach relies on the fact that the server the serves delegated > > > > > > attributes would update change_attr on operations which might now > > > > > > necessarily happen (ie, linux server does not update change_attribute > > > > > > on writes or clone). I propose an alternative fix for the failing > > > > > > generic/728. > > > > > > > > > > > > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c > > > > > > index 7b3ca68fb4bb..ede1835a45b3 100644 > > > > > > --- a/fs/nfs/nfs42proc.c > > > > > > +++ b/fs/nfs/nfs42proc.c > > > > > > @@ -1389,7 +1389,13 @@ static int _nfs42_proc_removexattr(struct inode > > > > > > *inode, const char *name) > > > > > > &res.seq_res, 1); > > > > > > trace_nfs4_removexattr(inode, name, ret); > > > > > > if (!ret) > > > > > > - nfs4_update_changeattr(inode, &res.cinfo, timestamp, 0); > > > > > > + if (nfs_have_delegated_attributes(inode)) { > > > > > > + nfs_update_delegated_mtime(inode); > > > > > > + spin_lock(&inode->i_lock); > > > > > > + nfs_set_cache_invalid(inode, NFS_INO_INVALID_BLOCKS); > > > > > > + spin_unlock(&inode->i_lock); > > > > > > + } else > > > > > > + nfs4_update_changeattr(inode, &res.cinfo, timestamp, 0); > > > > > > > > > > > > return ret; > > > > > > } > > > > > > > > > > > > > > > > What's the advantage of doing it this way? > > > > > > > > > > You just sent a REMOVEXATTR operation to the server that will change > > > > > the mtime there. The server has the most up-to-date version of the > > > > > mtime and ctime at that point. > > > > > > > > In presence of delegated attributes, Is the server required to update > > > > its mtime/ctime on an operation? As I mentioned, the linux server does > > > > not update its ctime/mtime for WRITE, CLONE, COPY. > > > > > > > > Is possible that > > > > some implementations might be different and also do not update the > > > > ctime/mtime on REMOVEXATTR? > > > > > > > > Therefore I was suggesting that the patch > > > > relies on the fact that it would receive an updated value. Of course > > > > perhaps all implementations are done the same as the linux server and > > > > my point is moot. I didn't see anything in the spec that clarifies > > > > what the server supposed to do (and client rely on). > > > > > > > > > > (cc'ing Tom) > > > > > > That is a very good point. > > > > > > My interpretation was that delegated timestamps generally covered > > > writes, but SETATTR style operations that do anything beyond only > > > changing the mtime can't be cached. > > > > > > We probably need some delstid spec clarification: for what operations > > > is the server required to disable timestamp updates when a write > > > delegation is outstanding? > > > > > > In the case of nfsd, we disable timestamp updates for WRITE/COPY/CLONE > > > but not SETATTR/SETXATTR/REMOVEXATTR. > > > > > > How does the Hammerspace anvil behave? Does it disable c/mtime updates > > > for writes when there is an outstanding timestamp delegation like we're > > > doing in nfsd? If so, does it do the same for > > > SETATTR/SETXATTR/REMOVEXATTR operations as well? > > > > Jeff, > > > > I think the right way to look at this is closer to how size is > > handled under delegation in RFC8881, rather than as a per-op rule. > > > > In our implementation, because we are acting as an MDS and data I/O > > goes to DSes, we already treat size as effectively delegated when > > a write layout is outstanding. The MDS does not maintain authoritative > > size locally in that case. We may refresh size/timestamps internally > > (e.g., on GETATTR by querying DSes), but we don’t treat that as > > overriding the delegated authority. > > > > For timestamps, our behavior is effectively the same model. When > > the client holds the relevant delegation, the server does not > > consider itself authoritative for ctime/mtime. If current values > > are needed, we can obtain them from the client (e.g., via CB_GETATTR), > > and the client must present the delegation stateid to demonstrate > > that authority. So the authority follows the delegation, not the > > specific operation. > > > > That said, I don’t think we’ve fully resolved the semantics for all > > metadata-style ops either. WRITE and SETATTR are clear in our model, > > but for things like CLONE/COPY/SETXATTR/REMOVEXATTR, we’ve likely > > been relying on assumptions rather than a fully consistent rule. > > I.e., CLONE and COPY we just pass through to the DS and we don't > > implement SETXATTR/REMOVEXATTR. > > > > So the spec question, as I see it, is not whether REMOVEXATTR (or > > any particular op) should update ctime/mtime, but whether delegated > > timestamps are meant to follow the same attribute-authority model > > as delegated size in RFC8881. If so, then we expect that the server > > should query the client via CB_GETATTR to return updated ctime/mtime > > after such operations while the delegation is outstanding. > > > > The dilemma we have is: because we _do_ allow local processes to stat() > files that have an outstanding write delegation, we can never allow the > ctime in particular to roll backward (modulo clock jumps). I agree we do not want visible ctime rollback, but I do not see how that can be guaranteed from delegated timestamps alone when the authoritative timestamp may be generated on a different node with a different clock and the object may change during the CB_GETATTR window. That seems to require either monotonic clamping of exposed ctime, or treating change_attr rather than ctime as the real serialization signal. > > If we're dealing with changes that have been cached in the client and > are being lazily flushed out, then we can't update the timestamp when > that operation occurs. The time of the RPC to flush the changes will > almost certainly be later than the cached timestamps on the client that > will eventually be set, so when the client comes back we'd end up > violating the rollback rule. > > Our only option is to freeze timestamp updates on anything that might > represent such an operation. So far, we only do that on WRITE and COPY > operations -- in general, operations that require an open file, since > FMODE_NOCMTIME is attached to the file. > > Some SETATTRs that only update the mtime and atime can be cached on the > client by virtue of the fact that it's authoritative for timestamps. > There are some exceptions though: > > - atime-only updates can't be cached since the ctime won't change with > a timestamp update if the mtime didn't change > > - if you set the mtime to a time that is later than the time you got > the delegation from the server, but earlier than the current time, you > can't cache that. The ctime would be later than the mtime in that case, > and we don't have a mechanism to handle that in a delegated timestamp > SETATTR. > > I don't see how you could reasonably buffer a SETXATTR or REMOVEXATTR > operation to be sent later. These need to be done synchronously since > they could always fail for some reason and we don't have a mechanism at > the syscall layer to handle a deferred error. They also only update the > ctime and not the mtime, and we have no mechanism to do that with > delegated timestamps. > > Based on that, I think the client and server both need to ignore the > timestamp delegation on a SETXATTR or REMOVEXATTR. The server should > update the ctime and the client needs to send a trailing GETATTR on the > REMOVEXATTR compound in order to get it and the change attr. One concern I have with a per-op formulation is extensibility. If delegated timestamp behavior is defined by enumerating specific operations, then every new operation added to the protocol creates a fresh ambiguity until the spec is updated again. It seems better to define the behavior in terms of operation properties - e.g., whether the operation is synchronously visible, can be deferred/cached at the client, and whether it affects only ctime versus mtime/atime - so future operations can be classified without reopening the base rule. I.e., I can't tell if you want me to update the spec with guidance per-op or you are just documenting what you did. > > Exactly what this patch does, fwiw... > -- > Jeff Layton