[PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes

* [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes
@ 2024-12-13 13:50 David Howells
  2024-12-13 13:50 ` [PATCH 01/10] kheaders: Ignore silly-rename files David Howells
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel

Hi Christian,

Here are some miscellaneous fixes and changes for netfslib and the ceph and
nfs filesystems:

 (1) Ignore silly-rename files from afs and nfs when building the header
     archive in a kernel build.

 (2) netfs: Fix the way read result collection applies results to folios
     when each folio is being read by multiple subrequests and the results
     come out of order.

 (3) netfs: Fix ENOMEM handling in buffered reads.

 (4) nfs: Fix an oops in nfs_netfs_init_request() when copying to the cache.

 (5) cachefiles: Parse the "secctx" command immediately to get the correct
     error rather than leaving it to the "bind" command.

 (6) netfs: Remove a redundant smp_rmb().  This isn't a bug per se and
     could be deferred.

 (7) netfs: Fix missing barriers by using clear_and_wake_up_bit().

 (8) netfs: Work around recursion in read retry by failing and abandoning
     the retried subrequest if no I/O is performed.

     [!] NOTE: This only works around the recursion problem if the
     	 recursion keeps returning no data.  If the server manages, say, to
     	 repeatedly return a single byte of data faster than the retry
     	 algorithm can complete, it will still recurse and the stack
     	 overrun may still occur.  Actually fixing this requires quite an
     	 intrusive change which will hopefully make the next merge window.

 (9) netfs: Fix the clearance of a folio_queue when unlocking the page if
     we're going to want to subsequently send the queue for copying to the
     cache (if, for example, we're using ceph).

(10) netfs: Fix the lack of cancellation of copy-to-cache when the cache
     for a file is temporarily disabled (for example when a DIO write is
     done to the file).  This patch and (9) fix hangs with ceph.

With these patches, I can run xfstest -g quick to completion on ceph with a
local cache.

The patches can also be found here with a bonus cifs patch:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes

Thanks,
David

David Howells (8):
  kheaders: Ignore silly-rename files
  netfs: Fix non-contiguous donation between completed reads
  netfs: Fix enomem handling in buffered reads
  nfs: Fix oops in nfs_netfs_init_request() when copying to cache
  netfs: Fix missing barriers by using clear_and_wake_up_bit()
  netfs: Work around recursion by abandoning retry if nothing read
  netfs: Fix ceph copy to cache on write-begin
  netfs: Fix the (non-)cancellation of copy when cache is temporarily
    disabled

Max Kellermann (1):
  cachefiles: Parse the "secctx" immediately

Zilin Guan (1):
  netfs: Remove redundant use of smp_rmb()

 fs/9p/vfs_addr.c         |  6 +++++-
 fs/afs/write.c           |  5 ++++-
 fs/cachefiles/daemon.c   | 14 +++++++-------
 fs/cachefiles/internal.h |  3 ++-
 fs/cachefiles/security.c |  6 +++---
 fs/netfs/buffered_read.c | 28 ++++++++++++++++------------
 fs/netfs/direct_write.c  |  1 -
 fs/netfs/read_collect.c  | 33 +++++++++++++++++++--------------
 fs/netfs/read_pgpriv2.c  |  4 ++++
 fs/netfs/read_retry.c    |  6 ++++--
 fs/netfs/write_collect.c | 14 +++++---------
 fs/netfs/write_issue.c   |  2 ++
 fs/nfs/fscache.c         |  9 ++++++++-
 fs/smb/client/cifssmb.c  | 13 +++++++++----
 fs/smb/client/smb2pdu.c  |  9 ++++++---
 include/linux/netfs.h    |  6 +++---
 kernel/gen_kheaders.sh   |  1 +
 17 files changed, 98 insertions(+), 62 deletions(-)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 01/10] kheaders: Ignore silly-rename files
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-21  5:15   ` Masahiro Yamada
  2024-12-13 13:50 ` [PATCH 02/10] netfs: Fix non-contiguous donation between completed reads David Howells
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Masahiro Yamada,
	Marc Dionne

Tell tar to ignore silly-rename files (".__afs*" and ".nfs*") when building
the header archive.  These occur when a file that is open is unlinked
locally, but hasn't yet been closed.  Such files are visible to the user
via the getdents() syscall and so programs may want to do things with them.

During the kernel build, such files may be created while header files are
being processed.  Their cleanup may then be deferred by fput(), so tar may
see them when it reads the directory but find them gone by the time it
tries to open them, causing it to fail with an error.  Further, we don't
want to include them in the tarball if they still exist.

With CONFIG_HEADERS_INSTALL=y, something like the following may be seen:

   find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory
   tar: ./include/linux/greybus/.__afs3C95: File removed before we read it

The find warning doesn't seem to cause a problem.

Fix this by telling tar, when called from gen_kheaders.sh, to exclude such
files.  This only affects afs and nfs; cifs uses the Windows Hidden
attribute to prevent the file from being seen.
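
As an illustration only (not part of the patch), the name test implied by
the two exclude patterns can be sketched in userspace C with POSIX
fnmatch().  The helper name is_silly_rename() is hypothetical, and the
sketch assumes it is handed a bare file name rather than a full path; tar's
own --exclude matching has its own rules and is not reproduced here.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if NAME looks like an afs or nfs silly-rename
 * file, using the same glob patterns that the patch passes to tar.
 */
static bool is_silly_rename(const char *name)
{
	static const char *const patterns[] = { ".__afs*", ".nfs*" };
	size_t i;

	for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
		/* fnmatch() returns 0 when the name matches the pattern. */
		if (fnmatch(patterns[i], name, 0) == 0)
			return true;
	}
	return false;
}
```

With this, a name such as ".__afs3C95" from the tar error above would be
skipped, whereas an ordinary header such as "nfs.h" would not.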

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Masahiro Yamada <masahiroy@kernel.org>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-kernel@vger.kernel.org
---
 kernel/gen_kheaders.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/gen_kheaders.sh b/kernel/gen_kheaders.sh
index 383fd43ac612..7e1340da5aca 100755
--- a/kernel/gen_kheaders.sh
+++ b/kernel/gen_kheaders.sh
@@ -89,6 +89,7 @@ find $cpio_dir -type f -print0 |

 # Create archive and try to normalize metadata for reproducibility.
 tar "${KBUILD_BUILD_TIMESTAMP:+--mtime=$KBUILD_BUILD_TIMESTAMP}" \
+    --exclude=".__afs*" --exclude=".nfs*" \
     --owner=0 --group=0 --sort=name --numeric-owner --mode=u=rw,go=r,a+X \
     -I $XZ -cf $tarfile -C $cpio_dir/ . > /dev/null


* [PATCH 02/10] netfs: Fix non-contiguous donation between completed reads
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
  2024-12-13 13:50 ` [PATCH 01/10] kheaders: Ignore silly-rename files David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 03/10] netfs: Fix enomem handling in buffered reads David Howells
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Shyam Prasad N,
	Steve French, Paulo Alcantara

When a read subrequest finishes, if it doesn't have sufficient coverage to
complete the folio(s) covering either side of it, it will donate the excess
coverage to the adjacent subrequests on either side, offloading
responsibility for unlocking the folio(s) covered to them.

Now, preference is given to donating down to a lower file offset over
donating up because that check is done first - but there's no check that
the lower subreq is actually contiguous, and so we can end up donating
incorrectly.

The scenario seen[1] is that an 8MiB readahead request spanning four 2MiB
folios is split into eight 1MiB subreqs (numbered 1 through 8).  These
terminate in the order 1,6,2,5,3,7,4,8.  What happens is:

	- 1 donates to 2
	- 6 donates to 5
	- 2 completes, unlocking the first folio (with 1).
	- 5 completes, unlocking the third folio (with 6).
	- 3 donates to 4
	- 7 donates to 4 incorrectly
	- 4 completes, unlocking the second folio (with 3), but can't use
	  the excess from 7.
	- 8 donates to 4, also incorrectly.

Fix this by preventing downward donation if the subreqs are not contiguous
(in the example above, 7 donates to 4 across the gap left by 5 and 6).
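
The contiguity test can be sketched in isolation as follows.  This is a
userspace model, not the kernel code: is_contiguous_below() and MIB are
hypothetical names, and the arguments stand in for the ->start and ->len
fields of the previous subrequest and the ->start of the one completing.

```c
#include <stdbool.h>

#define MIB (1024ULL * 1024ULL)

/* Downward donation is only valid if the previous subrequest still on the
 * list ends exactly where the completing subrequest begins.
 */
static bool is_contiguous_below(unsigned long long prev_start,
				unsigned long long prev_len,
				unsigned long long start)
{
	return start == prev_start + prev_len;
}
```

In the scenario above, subreq 7 starts at 6MiB while its list predecessor,
subreq 4, covers 3MiB to 4MiB, so is_contiguous_below(3 * MIB, MIB, 6 * MIB)
is false and the donation has to go the other way.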

Reported-by: Shyam Prasad N <nspmangalore@gmail.com>
Closes: https://lore.kernel.org/r/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=TNfUtvyHeFQZBA@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/526707.1733224486@warthog.procyon.org.uk/ [1]
---
 fs/netfs/read_collect.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 3cbb289535a8..b415e3972336 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -247,16 +247,17 @@ static bool netfs_consume_read_data(struct netfs_io_subrequest *subreq, bool was
 
 	/* Deal with the trickiest case: that this subreq is in the middle of a
 	 * folio, not touching either edge, but finishes first.  In such a
-	 * case, we donate to the previous subreq, if there is one, so that the
-	 * donation is only handled when that completes - and remove this
-	 * subreq from the list.
+	 * case, we donate to the previous subreq, if there is one and if it is
+	 * contiguous, so that the donation is only handled when that completes
+	 * - and remove this subreq from the list.
 	 *
 	 * If the previous subreq finished first, we will have acquired their
 	 * donation and should be able to unlock folios and/or donate nextwards.
 	 */
 	if (!subreq->consumed &&
 	    !prev_donated &&
-	    !list_is_first(&subreq->rreq_link, &rreq->subrequests)) {
+	    !list_is_first(&subreq->rreq_link, &rreq->subrequests) &&
+	    subreq->start == prev->start + prev->len) {
 		prev = list_prev_entry(subreq, rreq_link);
 		WRITE_ONCE(prev->next_donated, prev->next_donated + subreq->len);
 		subreq->start += subreq->len;



* [PATCH 03/10] netfs: Fix enomem handling in buffered reads
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
  2024-12-13 13:50 ` [PATCH 01/10] kheaders: Ignore silly-rename files David Howells
  2024-12-13 13:50 ` [PATCH 02/10] netfs: Fix non-contiguous donation between completed reads David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 04/10] nfs: Fix oops in nfs_netfs_init_request() when copying to cache David Howells
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel,
	syzbot+404b4b745080b6210c6c, Dmitry Antipov

If netfs_read_to_pagecache() gets an error from either ->prepare_read() or
from netfs_prepare_read_iterator(), it needs to decrement ->nr_outstanding,
cancel the subrequest and break out of the issuing loop.  Currently, it
only does this for two of the cases, but there are two more that aren't
handled.

Fix this by moving the handling to a common place and jumping to it from
all four places.  This is in preference to inserting a wrapper around
netfs_prepare_read_iterator() as proposed by Dmitry Antipov[1].
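
The shape of the shared error path can be sketched in userspace C as below.
This is a simplified model under stated assumptions: struct req_model and
issue_one() are hypothetical, the prep_ret and slice arguments stand in for
the return values of ->prepare_read() and netfs_prepare_read_iterator(),
and the cancelled counter stands in for putting the subrequest.

```c
#include <errno.h>

/* Hypothetical stand-in for the bits of the request touched on failure. */
struct req_model {
	int nr_outstanding;	/* subrequests accounted but not yet finished */
	int cancelled;		/* subrequests dropped on the error path */
	int error;		/* last error recorded */
};

/* Model issuing one subrequest; returns 0 if issued or a negative errno. */
static int issue_one(struct req_model *rreq, int prep_ret, long slice)
{
	int ret;

	rreq->nr_outstanding++;	/* accounted before preparation starts */

	ret = prep_ret;		/* models ->prepare_read() */
	if (ret < 0)
		goto prep_failed;

	if (slice < 0)		/* models netfs_prepare_read_iterator() */
		goto prep_iter_failed;

	return 0;		/* subrequest issued */

prep_iter_failed:
	ret = (int)slice;
prep_failed:
	/* Single place that undoes the accounting and cancels. */
	rreq->error = ret;
	rreq->nr_outstanding--;
	rreq->cancelled++;
	return ret;
}
```

Whichever step fails, the accounting is undone exactly once, which is the
property the two unhandled cases were missing.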

Link: https://lore.kernel.org/r/20241202093943.227786-1-dmantipov@yandex.ru/ [1]

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: syzbot+404b4b745080b6210c6c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=404b4b745080b6210c6c
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dmitry Antipov <dmantipov@yandex.ru>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/buffered_read.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 7ac34550c403..4dc9b8286355 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -275,22 +275,14 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq)
 			netfs_stat(&netfs_n_rh_download);
 			if (rreq->netfs_ops->prepare_read) {
 				ret = rreq->netfs_ops->prepare_read(subreq);
-				if (ret < 0) {
-					atomic_dec(&rreq->nr_outstanding);
-					netfs_put_subrequest(subreq, false,
-							     netfs_sreq_trace_put_cancel);
-					break;
-				}
+				if (ret < 0)
+					goto prep_failed;
 				trace_netfs_sreq(subreq, netfs_sreq_trace_prepare);
 			}
 
 			slice = netfs_prepare_read_iterator(subreq);
-			if (slice < 0) {
-				atomic_dec(&rreq->nr_outstanding);
-				netfs_put_subrequest(subreq, false, netfs_sreq_trace_put_cancel);
-				ret = slice;
-				break;
-			}
+			if (slice < 0)
+				goto prep_iter_failed;
 
 			rreq->netfs_ops->issue_read(subreq);
 			goto done;
@@ -302,6 +294,8 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq)
 			trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
 			netfs_stat(&netfs_n_rh_zero);
 			slice = netfs_prepare_read_iterator(subreq);
+			if (slice < 0)
+				goto prep_iter_failed;
 			__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
 			netfs_read_subreq_terminated(subreq, 0, false);
 			goto done;
@@ -310,6 +304,8 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq)
 		if (source == NETFS_READ_FROM_CACHE) {
 			trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
 			slice = netfs_prepare_read_iterator(subreq);
+			if (slice < 0)
+				goto prep_iter_failed;
 			netfs_read_cache_to_pagecache(rreq, subreq);
 			goto done;
 		}
@@ -318,6 +314,14 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq)
 		WARN_ON_ONCE(1);
 		break;
 
+	prep_iter_failed:
+		ret = slice;
+	prep_failed:
+		subreq->error = ret;
+		atomic_dec(&rreq->nr_outstanding);
+		netfs_put_subrequest(subreq, false, netfs_sreq_trace_put_cancel);
+		break;
+
 	done:
 		size -= slice;
 		start += slice;



* [PATCH 04/10] nfs: Fix oops in nfs_netfs_init_request() when copying to cache
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (2 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 03/10] netfs: Fix enomem handling in buffered reads David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 05/10] cachefiles: Parse the "secctx" immediately David Howells
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Anna Schumaker,
	Dave Wysochanski

When netfslib wants to copy some data that has just been read on behalf of
nfs, it creates a new write request and calls nfs_netfs_init_request() to
initialise it, but with a NULL file pointer.  This causes
nfs_file_open_context() to oops - however, we don't actually need the nfs
context as we're only going to write to the cache.

Fix this by just returning if we aren't given a file pointer and emit a
warning if the request was for something other than copy-to-cache.

Further, fix nfs_netfs_free_request() so that it doesn't try to free the
context if the pointer is NULL.

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+9DyMbKLhyJb7aMLDTb=Fh0T8Teb9sjuf_pze+XWT1VaQ@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trondmy@kernel.org>
cc: Anna Schumaker <anna@kernel.org>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/nfs/fscache.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 810269ee0a50..d49e4ce27999 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -263,6 +263,12 @@ int nfs_netfs_readahead(struct readahead_control *ractl)
 static atomic_t nfs_netfs_debug_id;
 static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
 {
+	if (!file) {
+		if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
+			return -EIO;
+		return 0;
+	}
+
 	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
 	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
 	/* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */
@@ -274,7 +280,8 @@ static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi
 
 static void nfs_netfs_free_request(struct netfs_io_request *rreq)
 {
-	put_nfs_open_context(rreq->netfs_priv);
+	if (rreq->netfs_priv)
+		put_nfs_open_context(rreq->netfs_priv);
 }
 
 static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)



* [PATCH 05/10] cachefiles: Parse the "secctx" immediately
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (3 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 04/10] nfs: Fix oops in nfs_netfs_init_request() when copying to cache David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 06/10] netfs: Remove redundant use of smp_rmb() David Howells
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel

From: Max Kellermann <max.kellermann@ionos.com>

Instead of storing an opaque string, call security_secctx_to_secid()
right in the "secctx" command handler and store only the numeric
"secid".  This eliminates an unnecessary string allocation and allows
the daemon to receive errors when writing the "secctx" command instead
of postponing the error to the "bind" command handler.  For example,
if the kernel was built without `CONFIG_SECURITY`, "bind" will return
`EOPNOTSUPP`, but the daemon doesn't know why.  With this patch, the
"secctx" command will instead return `EOPNOTSUPP`, which is the right
context for this error.

This patch adds a boolean flag `have_secid` because I'm not sure if we
can safely assume that zero is the special secid value for "not set".
This appears to be true for SELinux, Smack and AppArmor, but since
this attribute is not documented, I'm unable to derive a stable
guarantee for that.
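
A minimal userspace sketch of why a separate flag is used rather than a
sentinel value: with `have_secid`, a secid of zero can still be stored and
a second attempt can still be rejected.  struct cache_model and
model_set_secid() are hypothetical names for illustration; the real handler
obtains the secid from security_secctx_to_secid().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields added to the cache struct. */
struct cache_model {
	uint32_t secid;		/* LSM security id */
	bool have_secid;	/* whether secid was set */
};

/* Record a secid once; reject a second one, as the command handler does. */
static int model_set_secid(struct cache_model *cache, uint32_t secid)
{
	if (cache->have_secid)
		return -EINVAL;	/* second security context specified */

	cache->secid = secid;
	cache->have_secid = true;
	return 0;
}
```

Because the flag, not the value, records "set", no assumption about zero
being special is needed.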

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241209141554.638708-1-max.kellermann@ionos.com/
---
 fs/cachefiles/daemon.c   | 14 +++++++-------
 fs/cachefiles/internal.h |  3 ++-
 fs/cachefiles/security.c |  6 +++---
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 89b11336a836..1806bff8e59b 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -15,6 +15,7 @@
 #include <linux/namei.h>
 #include <linux/poll.h>
 #include <linux/mount.h>
+#include <linux/security.h>
 #include <linux/statfs.h>
 #include <linux/ctype.h>
 #include <linux/string.h>
@@ -576,7 +577,7 @@ static int cachefiles_daemon_dir(struct cachefiles_cache *cache, char *args)
  */
 static int cachefiles_daemon_secctx(struct cachefiles_cache *cache, char *args)
 {
-	char *secctx;
+	int err;
 
 	_enter(",%s", args);
 
@@ -585,16 +586,16 @@ static int cachefiles_daemon_secctx(struct cachefiles_cache *cache, char *args)
 		return -EINVAL;
 	}
 
-	if (cache->secctx) {
+	if (cache->have_secid) {
 		pr_err("Second security context specified\n");
 		return -EINVAL;
 	}
 
-	secctx = kstrdup(args, GFP_KERNEL);
-	if (!secctx)
-		return -ENOMEM;
+	err = security_secctx_to_secid(args, strlen(args), &cache->secid);
+	if (err)
+		return err;
 
-	cache->secctx = secctx;
+	cache->have_secid = true;
 	return 0;
 }
 
@@ -820,7 +821,6 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
 	put_cred(cache->cache_cred);
 
 	kfree(cache->rootdirname);
-	kfree(cache->secctx);
 	kfree(cache->tag);
 
 	_leave("");
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 7b99bd98de75..38c236e38cef 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -122,7 +122,6 @@ struct cachefiles_cache {
 #define CACHEFILES_STATE_CHANGED	3	/* T if state changed (poll trigger) */
 #define CACHEFILES_ONDEMAND_MODE	4	/* T if in on-demand read mode */
 	char				*rootdirname;	/* name of cache root directory */
-	char				*secctx;	/* LSM security context */
 	char				*tag;		/* cache binding tag */
 	refcount_t			unbind_pincount;/* refcount to do daemon unbind */
 	struct xarray			reqs;		/* xarray of pending on-demand requests */
@@ -130,6 +129,8 @@ struct cachefiles_cache {
 	struct xarray			ondemand_ids;	/* xarray for ondemand_id allocation */
 	u32				ondemand_id_next;
 	u32				msg_id_next;
+	u32				secid;		/* LSM security id */
+	bool				have_secid;	/* whether "secid" was set */
 };
 
 static inline bool cachefiles_in_ondemand_mode(struct cachefiles_cache *cache)
diff --git a/fs/cachefiles/security.c b/fs/cachefiles/security.c
index fe777164f1d8..fc6611886b3b 100644
--- a/fs/cachefiles/security.c
+++ b/fs/cachefiles/security.c
@@ -18,7 +18,7 @@ int cachefiles_get_security_ID(struct cachefiles_cache *cache)
 	struct cred *new;
 	int ret;
 
-	_enter("{%s}", cache->secctx);
+	_enter("{%u}", cache->have_secid ? cache->secid : 0);
 
 	new = prepare_kernel_cred(current);
 	if (!new) {
@@ -26,8 +26,8 @@ int cachefiles_get_security_ID(struct cachefiles_cache *cache)
 		goto error;
 	}
 
-	if (cache->secctx) {
-		ret = set_security_override_from_ctx(new, cache->secctx);
+	if (cache->have_secid) {
+		ret = set_security_override(new, cache->secid);
 		if (ret < 0) {
 			put_cred(new);
 			pr_err("Security denies permission to nominate security context: error %d\n",



* [PATCH 06/10] netfs: Remove redundant use of smp_rmb()
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (4 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 05/10] cachefiles: Parse the "secctx" immediately David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-16 10:13   ` Akira Yokosawa
  2024-12-13 13:50 ` [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit() David Howells
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Zilin Guan, Akira Yokosawa

From: Zilin Guan <zilin@seu.edu.cn>

The function netfs_unbuffered_write_iter_locked() in
fs/netfs/direct_write.c contains an unnecessary smp_rmb() call after
wait_on_bit(). Since wait_on_bit() already incorporates a memory barrier
that ensures the flag update is visible before the function returns, the
smp_rmb() provides no additional benefit and incurs unnecessary overhead.

This patch removes the redundant barrier to simplify and optimize the code.

Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Akira Yokosawa <akiyks@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20241207021952.2978530-1-zilin@seu.edu.cn/
---
 fs/netfs/direct_write.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index 88f2adfab75e..173e8b5e6a93 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -104,7 +104,6 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
 		trace_netfs_rreq(wreq, netfs_rreq_trace_wait_ip);
 		wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS,
 			    TASK_UNINTERRUPTIBLE);
-		smp_rmb(); /* Read error/transferred after RIP flag */
 		ret = wreq->error;
 		if (ret == 0) {
 			ret = wreq->transferred;



* [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit()
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (5 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 06/10] netfs: Remove redundant use of smp_rmb() David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-14 10:16   ` Akira Yokosawa
  2024-12-14 13:44   ` David Howells
  2024-12-13 13:50 ` [PATCH 08/10] netfs: Work around recursion by abandoning retry if nothing read David Howells
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Zilin Guan, Akira Yokosawa

Use clear_and_wake_up_bit() rather than something like:

	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);

as a barrier needs to be inserted between the two calls, and
clear_and_wake_up_bit() provides it.
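
The ordering requirement can be modelled in userspace with C11 atomics, as
sketched below.  This is only an analogy under stated assumptions: struct
bit_waitq_model and model_clear_and_wake_up_bit() are hypothetical, the
waiters counter stands in for the kernel's wait queue, and the seq_cst
fence stands in for the barrier the kernel helper places between clearing
the bit and checking for sleepers.

```c
#include <stdatomic.h>

/* Hypothetical model of a flags word plus its bit wait queue. */
struct bit_waitq_model {
	atomic_ulong flags;	/* the flag word */
	atomic_int waiters;	/* sleepers registered on the bit */
	int wakeups;		/* wake-ups issued, for illustration */
};

static void model_clear_and_wake_up_bit(int bit, struct bit_waitq_model *wq)
{
	/* Release: earlier stores (e.g. error/transferred) are visible to
	 * anyone who observes the cleared bit.
	 */
	atomic_fetch_and_explicit(&wq->flags, ~(1UL << bit),
				  memory_order_release);

	/* Full barrier: the clear must be ordered before the check for
	 * waiters, otherwise a sleeper could be missed.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&wq->waiters, memory_order_relaxed) > 0)
		wq->wakeups++;	/* models waking the sleepers */
}
```

Open-coding the clear and the wake-up as two separate calls omits the
middle step, which is the gap this patch closes.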

Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Zilin Guan <zilin@seu.edu.cn>
cc: Akira Yokosawa <akiyks@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_collect.c  | 3 +--
 fs/netfs/write_collect.c | 9 +++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index b415e3972336..46ce3b7adf07 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -379,8 +379,7 @@ static void netfs_rreq_assess(struct netfs_io_request *rreq)
 	task_io_account_read(rreq->transferred);
 
 	trace_netfs_rreq(rreq, netfs_rreq_trace_wake_ip);
-	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
-	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
+	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
 
 	trace_netfs_rreq(rreq, netfs_rreq_trace_done);
 	netfs_clear_subrequests(rreq, false);
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 1d438be2e1b4..82290c92ba7a 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -501,8 +501,7 @@ static void netfs_collect_write_results(struct netfs_io_request *wreq)
 		goto need_retry;
 	if ((notes & MADE_PROGRESS) && test_bit(NETFS_RREQ_PAUSE, &wreq->flags)) {
 		trace_netfs_rreq(wreq, netfs_rreq_trace_unpause);
-		clear_bit_unlock(NETFS_RREQ_PAUSE, &wreq->flags);
-		wake_up_bit(&wreq->flags, NETFS_RREQ_PAUSE);
+		clear_and_wake_up_bit(NETFS_RREQ_PAUSE, &wreq->flags);
 	}
 
 	if (notes & NEED_REASSESS) {
@@ -605,8 +604,7 @@ void netfs_write_collection_worker(struct work_struct *work)
 
 	_debug("finished");
 	trace_netfs_rreq(wreq, netfs_rreq_trace_wake_ip);
-	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &wreq->flags);
-	wake_up_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS);
+	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags);
 
 	if (wreq->iocb) {
 		size_t written = min(wreq->transferred, wreq->len);
@@ -714,8 +712,7 @@ void netfs_write_subrequest_terminated(void *_op, ssize_t transferred_or_error,
 
 	trace_netfs_sreq(subreq, netfs_sreq_trace_terminated);
 
-	clear_bit_unlock(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
-	wake_up_bit(&subreq->flags, NETFS_SREQ_IN_PROGRESS);
+	clear_and_wake_up_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
 
 	/* If we are at the head of the queue, wake up the collector,
 	 * transferring a ref to it if we were the ones to do so.



* [PATCH 08/10] netfs: Work around recursion by abandoning retry if nothing read
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (6 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit() David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 09/10] netfs: Fix ceph copy to cache on write-begin David Howells
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Lizhi Xu,
	Dominique Martinet

syzkaller reported recursion with a loop of three calls (netfs_rreq_assess,
netfs_retry_reads and netfs_rreq_terminated) hitting the limit of the stack
during an unbuffered or direct I/O read.

There are a number of issues:

 (1) There is no limit on the number of retries.

 (2) A subrequest is supposed to be abandoned if it does not transfer
     anything (NETFS_SREQ_NO_PROGRESS), but that isn't checked under all
     circumstances.

 (3) The actual root cause, which is this:

	if (atomic_dec_and_test(&rreq->nr_outstanding))
		netfs_rreq_terminated(rreq, ...);

     When we do a retry, we bump the rreq->nr_outstanding counter to
     prevent the final cleanup phase running before we've finished
     dispatching the retries.  The problem is if we hit 0, we have to do
     the cleanup phase - but we're in the cleanup phase and end up
     repeating the retry cycle, hence the recursion.

Work around the problem by limiting the number of retries.  This is based
on Lizhi Xu's patch[1], and makes the following changes:

 (1) Replace NETFS_SREQ_NO_PROGRESS with NETFS_SREQ_MADE_PROGRESS and make
     the filesystem set it if it managed to read or write at least one byte
     of data.  Clear this bit before issuing a subrequest.

 (2) Add a ->retry_count member to the subrequest and increment it any time
     we do a retry.

 (3) Remove the NETFS_SREQ_RETRYING flag as it is superfluous with
     ->retry_count.  If the latter is non-zero, we're doing a retry.

 (4) Abandon a subrequest if retry_count is non-zero and we made no
     progress.

 (5) Use ->retry_count in both the write-side and the read-side.

[?] Question: Should I set a hard limit on retry_count in both read and
    write?  Say, if it hits 50, we always abandon the subrequest.  The
    problem is that these changes only mitigate the issue: as long as each
    retry makes at least one byte of progress, the recursion can still
    occur.  I have patches that fix the underlying cause, but that fix is
    intrusive and is currently pending for the next merge window.
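
The abandon rule in (4), together with the hard limit floated in the question,
can be sketched as a small userspace model.  The names struct toy_subreq,
toy_prepare_retry(), toy_should_abandon() and TOY_RETRY_LIMIT are hypothetical,
and the cap of 50 is only the figure suggested above; the patch itself does not
add a hard limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cap taken from the question above; not part of the patch. */
#define TOY_RETRY_LIMIT 50

struct toy_subreq {
	unsigned char retry_count;	/* 0 on the initial pass */
	bool made_progress;		/* set if at least one byte moved */
};

/* Before reissuing: clear the progress flag and count the retry. */
static void toy_prepare_retry(struct toy_subreq *s)
{
	assert(s->retry_count < 255);	/* the counter is only 8 bits wide */
	s->made_progress = false;
	s->retry_count++;
}

/* True if a short subrequest should be abandoned rather than retried. */
static bool toy_should_abandon(const struct toy_subreq *s)
{
	if (s->retry_count >= TOY_RETRY_LIMIT)
		return true;		/* hypothetical hard limit */
	return s->retry_count > 0 && !s->made_progress;
}
```

Without the cap, a subrequest that transfers a single byte on every pass is
never abandoned by this rule, which is the residual problem described above.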

The oops generated by KASAN looks something like:

   BUG: TASK stack guard page was hit at ffffc9000482ff48 (stack is ffffc90004830000..ffffc90004838000)
   Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN NOPTI
   ...
   RIP: 0010:mark_lock+0x25/0xc60 kernel/locking/lockdep.c:4686
    ...
    mark_usage kernel/locking/lockdep.c:4646 [inline]
    __lock_acquire+0x906/0x3ce0 kernel/locking/lockdep.c:5156
    lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5825
    local_lock_acquire include/linux/local_lock_internal.h:29 [inline]
    ___slab_alloc+0x123/0x1880 mm/slub.c:3695
    __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908
    __slab_alloc_node mm/slub.c:3961 [inline]
    slab_alloc_node mm/slub.c:4122 [inline]
    kmem_cache_alloc_noprof+0x2a7/0x2f0 mm/slub.c:4141
    radix_tree_node_alloc.constprop.0+0x1e8/0x350 lib/radix-tree.c:253
    idr_get_free+0x528/0xa40 lib/radix-tree.c:1506
    idr_alloc_u32+0x191/0x2f0 lib/idr.c:46
    idr_alloc+0xc1/0x130 lib/idr.c:87
    p9_tag_alloc+0x394/0x870 net/9p/client.c:321
    p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644
    p9_client_zc_rpc.constprop.0+0x105/0x880 net/9p/client.c:793
    p9_client_read_once+0x443/0x820 net/9p/client.c:1570
    p9_client_read+0x13f/0x1b0 net/9p/client.c:1534
    v9fs_issue_read+0x115/0x310 fs/9p/vfs_addr.c:74
    netfs_retry_read_subrequests fs/netfs/read_retry.c:60 [inline]
    netfs_retry_reads+0x153a/0x1d00 fs/netfs/read_retry.c:232
    netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371
    netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407
    netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
    netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371
    netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407
    netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
    netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371
    ...
    netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407
    netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
    netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371
    netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407
    netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
    netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371
    netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407
    netfs_dispatch_unbuffered_reads fs/netfs/direct_read.c:103 [inline]
    netfs_unbuffered_read fs/netfs/direct_read.c:127 [inline]
    netfs_unbuffered_read_iter_locked+0x12f6/0x19b0 fs/netfs/direct_read.c:221
    netfs_unbuffered_read_iter+0xc5/0x100 fs/netfs/direct_read.c:256
    v9fs_file_read_iter+0xbf/0x100 fs/9p/vfs_file.c:361
    do_iter_readv_writev+0x614/0x7f0 fs/read_write.c:832
    vfs_readv+0x4cf/0x890 fs/read_write.c:1025
    do_preadv fs/read_write.c:1142 [inline]
    __do_sys_preadv fs/read_write.c:1192 [inline]
    __se_sys_preadv fs/read_write.c:1187 [inline]
    __x64_sys_preadv+0x22d/0x310 fs/read_write.c:1187
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Closes: https://syzkaller.appspot.com/bug?extid=1fc6f64c40a9d143cfb6
Signed-off-by: David Howells <dhowells@redhat.com>
Suggested-by: Lizhi Xu <lizhi.xu@windriver.com>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: v9fs@lists.linux.dev
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20241108034020.3695718-1-lizhi.xu@windriver.com/ [1]
---
 fs/9p/vfs_addr.c         |  6 +++++-
 fs/afs/write.c           |  5 ++++-
 fs/netfs/read_collect.c  | 15 +++++++++------
 fs/netfs/read_retry.c    |  6 ++++--
 fs/netfs/write_collect.c |  5 ++---
 fs/netfs/write_issue.c   |  2 ++
 fs/smb/client/cifssmb.c  | 13 +++++++++----
 fs/smb/client/smb2pdu.c  |  9 ++++++---
 include/linux/netfs.h    |  6 +++---
 9 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 819c75233235..3bc9ce6c575e 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -57,6 +57,8 @@ static void v9fs_issue_write(struct netfs_io_subrequest *subreq)
 	int err, len;
 
 	len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err);
+	if (len > 0)
+		__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
 	netfs_write_subrequest_terminated(subreq, len ?: err, false);
 }
 
@@ -80,8 +82,10 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq)
 	if (pos + total >= i_size_read(rreq->inode))
 		__set_bit(NETFS_SREQ_HIT_EOF, &subreq->flags);
 
-	if (!err)
+	if (!err) {
 		subreq->transferred += total;
+		__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
+	}
 
 	netfs_read_subreq_terminated(subreq, err, false);
 }
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 34107b55f834..ccb6aa8027c5 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -122,7 +122,7 @@ static void afs_issue_write_worker(struct work_struct *work)
 	if (subreq->debug_index == 3)
 		return netfs_write_subrequest_terminated(subreq, -ENOANO, false);
 
-	if (!test_bit(NETFS_SREQ_RETRYING, &subreq->flags)) {
+	if (!subreq->retry_count) {
 		set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
 		return netfs_write_subrequest_terminated(subreq, -EAGAIN, false);
 	}
@@ -149,6 +149,9 @@ static void afs_issue_write_worker(struct work_struct *work)
 	afs_wait_for_operation(op);
 	ret = afs_put_operation(op);
 	switch (ret) {
+	case 0:
+		__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
+		break;
 	case -EACCES:
 	case -EPERM:
 	case -ENOKEY:
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 46ce3b7adf07..47ed3a5044e2 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -438,7 +438,7 @@ void netfs_read_subreq_progress(struct netfs_io_subrequest *subreq,
 	     rreq->origin == NETFS_READPAGE ||
 	     rreq->origin == NETFS_READ_FOR_WRITE)) {
 		netfs_consume_read_data(subreq, was_async);
-		__clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags);
+		__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
 	}
 }
 EXPORT_SYMBOL(netfs_read_subreq_progress);
@@ -497,7 +497,7 @@ void netfs_read_subreq_terminated(struct netfs_io_subrequest *subreq,
 		     rreq->origin == NETFS_READPAGE ||
 		     rreq->origin == NETFS_READ_FOR_WRITE)) {
 			netfs_consume_read_data(subreq, was_async);
-			__clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags);
+			__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
 		}
 		rreq->transferred += subreq->transferred;
 	}
@@ -511,10 +511,13 @@ void netfs_read_subreq_terminated(struct netfs_io_subrequest *subreq,
 		} else {
 			trace_netfs_sreq(subreq, netfs_sreq_trace_short);
 			if (subreq->transferred > subreq->consumed) {
-				__set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
-				__clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags);
-				set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags);
-			} else if (!__test_and_set_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags)) {
+				/* If we didn't read new data, abandon retry. */
+				if (subreq->retry_count &&
+				    test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags)) {
+					__set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
+					set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags);
+				}
+			} else if (test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags)) {
 				__set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
 				set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags);
 			} else {
diff --git a/fs/netfs/read_retry.c b/fs/netfs/read_retry.c
index 0350592ea804..0e72e9226fc8 100644
--- a/fs/netfs/read_retry.c
+++ b/fs/netfs/read_retry.c
@@ -56,6 +56,8 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq)
 			if (test_bit(NETFS_SREQ_FAILED, &subreq->flags))
 				break;
 			if (__test_and_clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags)) {
+				__clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
+				subreq->retry_count++;
 				netfs_reset_iter(subreq);
 				netfs_reissue_read(rreq, subreq);
 			}
@@ -137,7 +139,8 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq)
 			stream0->sreq_max_len = subreq->len;
 
 			__clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
-			__set_bit(NETFS_SREQ_RETRYING, &subreq->flags);
+			__clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
+			subreq->retry_count++;
 
 			spin_lock_bh(&rreq->lock);
 			list_add_tail(&subreq->rreq_link, &rreq->subrequests);
@@ -213,7 +216,6 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq)
 			subreq->error = -ENOMEM;
 		__clear_bit(NETFS_SREQ_FAILED, &subreq->flags);
 		__clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
-		__clear_bit(NETFS_SREQ_RETRYING, &subreq->flags);
 	}
 	spin_lock_bh(&rreq->lock);
 	list_splice_tail_init(&queue, &rreq->subrequests);
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 82290c92ba7a..ca3a11ed9b54 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -179,7 +179,6 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq,
 				struct iov_iter source = subreq->io_iter;
 
 				iov_iter_revert(&source, subreq->len - source.count);
-				__set_bit(NETFS_SREQ_RETRYING, &subreq->flags);
 				netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit);
 				netfs_reissue_write(stream, subreq, &source);
 			}
@@ -234,7 +233,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq,
 			/* Renegotiate max_len (wsize) */
 			trace_netfs_sreq(subreq, netfs_sreq_trace_retry);
 			__clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
-			__set_bit(NETFS_SREQ_RETRYING, &subreq->flags);
+			subreq->retry_count++;
 			stream->prepare_write(subreq);
 
 			part = min(len, stream->sreq_max_len);
@@ -279,7 +278,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq,
 			subreq->start		= start;
 			subreq->debug_index	= atomic_inc_return(&wreq->subreq_counter);
 			subreq->stream_nr	= to->stream_nr;
-			__set_bit(NETFS_SREQ_RETRYING, &subreq->flags);
+			subreq->retry_count	= 1;
 
 			trace_netfs_sreq_ref(wreq->debug_id, subreq->debug_index,
 					     refcount_read(&subreq->ref),
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index bf6d507578e5..ff0e82505a0b 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -244,6 +244,8 @@ void netfs_reissue_write(struct netfs_io_stream *stream,
 	iov_iter_advance(source, size);
 	iov_iter_truncate(&subreq->io_iter, size);
 
+	subreq->retry_count++;
+	__clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
 	__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
 	netfs_do_issue_write(stream, subreq);
 }
diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
index bd42a419458e..6cb1e81993f8 100644
--- a/fs/smb/client/cifssmb.c
+++ b/fs/smb/client/cifssmb.c
@@ -1319,14 +1319,16 @@ cifs_readv_callback(struct mid_q_entry *mid)
 	}
 
 	if (rdata->result == -ENODATA) {
-		__set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags);
 		rdata->result = 0;
+		__set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags);
 	} else {
 		size_t trans = rdata->subreq.transferred + rdata->got_bytes;
 		if (trans < rdata->subreq.len &&
 		    rdata->subreq.start + trans == ictx->remote_i_size) {
-			__set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags);
 			rdata->result = 0;
+			__set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags);
+		} else if (rdata->got_bytes > 0) {
+			__set_bit(NETFS_SREQ_MADE_PROGRESS, &rdata->subreq.flags);
 		}
 	}
 
@@ -1670,10 +1672,13 @@ cifs_writev_callback(struct mid_q_entry *mid)
 		if (written > wdata->subreq.len)
 			written &= 0xFFFF;
 
-		if (written < wdata->subreq.len)
+		if (written < wdata->subreq.len) {
 			result = -ENOSPC;
-		else
+		} else {
 			result = written;
+			if (written > 0)
+				__set_bit(NETFS_SREQ_MADE_PROGRESS, &wdata->subreq.flags);
+		}
 		break;
 	case MID_REQUEST_SUBMITTED:
 	case MID_RETRY_NEEDED:
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 010eae9d6c47..458b53d1f9cb 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -4615,6 +4615,7 @@ smb2_readv_callback(struct mid_q_entry *mid)
 			__set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags);
 			rdata->result = 0;
 		}
+		__set_bit(NETFS_SREQ_MADE_PROGRESS, &rdata->subreq.flags);
 	}
 	trace_smb3_rw_credits(rreq_debug_id, subreq_debug_index, rdata->credits.value,
 			      server->credits, server->in_flight,
@@ -4840,10 +4841,12 @@ smb2_writev_callback(struct mid_q_entry *mid)
 		if (written > wdata->subreq.len)
 			written &= 0xFFFF;
 
-		if (written < wdata->subreq.len)
+		if (written < wdata->subreq.len) {
 			wdata->result = -ENOSPC;
-		else
+		} else if (written > 0) {
 			wdata->subreq.len = written;
+			__set_bit(NETFS_SREQ_MADE_PROGRESS, &wdata->subreq.flags);
+		}
 		break;
 	case MID_REQUEST_SUBMITTED:
 	case MID_RETRY_NEEDED:
@@ -5012,7 +5015,7 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)
 	}
 #endif
 
-	if (test_bit(NETFS_SREQ_RETRYING, &wdata->subreq.flags))
+	if (wdata->subreq.retry_count > 0)
 		smb2_set_replay(server, &rqst);
 
 	cifs_dbg(FYI, "async write at %llu %u bytes iter=%zx\n",
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 5eaceef41e6c..4083d77e3f39 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -185,6 +185,7 @@ struct netfs_io_subrequest {
 	short			error;		/* 0 or error that occurred */
 	unsigned short		debug_index;	/* Index in list (for debugging output) */
 	unsigned int		nr_segs;	/* Number of segs in io_iter */
+	u8			retry_count;	/* The number of retries (0 on initial pass) */
 	enum netfs_io_source	source;		/* Where to read from/write to */
 	unsigned char		stream_nr;	/* I/O stream this belongs to */
 	unsigned char		curr_folioq_slot; /* Folio currently being read */
@@ -194,14 +195,13 @@ struct netfs_io_subrequest {
 #define NETFS_SREQ_COPY_TO_CACHE	0	/* Set if should copy the data to the cache */
 #define NETFS_SREQ_CLEAR_TAIL		1	/* Set if the rest of the read should be cleared */
 #define NETFS_SREQ_SEEK_DATA_READ	3	/* Set if ->read() should SEEK_DATA first */
-#define NETFS_SREQ_NO_PROGRESS		4	/* Set if we didn't manage to read any data */
+#define NETFS_SREQ_MADE_PROGRESS	4	/* Set if we transferred at least some data */
 #define NETFS_SREQ_ONDEMAND		5	/* Set if it's from on-demand read mode */
 #define NETFS_SREQ_BOUNDARY		6	/* Set if ends on hard boundary (eg. ceph object) */
 #define NETFS_SREQ_HIT_EOF		7	/* Set if short due to EOF */
 #define NETFS_SREQ_IN_PROGRESS		8	/* Unlocked when the subrequest completes */
 #define NETFS_SREQ_NEED_RETRY		9	/* Set if the filesystem requests a retry */
-#define NETFS_SREQ_RETRYING		10	/* Set if we're retrying */
-#define NETFS_SREQ_FAILED		11	/* Set if the subreq failed unretryably */
+#define NETFS_SREQ_FAILED		10	/* Set if the subreq failed unretryably */
 };
 
 enum netfs_io_origin {



* [PATCH 09/10] netfs: Fix ceph copy to cache on write-begin
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (7 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 08/10] netfs: Work around recursion by abandoning retry if nothing read David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 13:50 ` [PATCH 10/10] netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled David Howells
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel

At the end of netfs_unlock_read_folio() in which folios are marked
appropriately for copying to the cache (either by being marked dirty
and having their private data set or by having PG_private_2 set) and then
unlocked, the folio_queue struct has the entry pointing to the folio
cleared.  This presents a problem for netfs_pgpriv2_write_to_the_cache(),
which is used to write folios marked with PG_private_2 to the cache as it
expects to be able to trawl the folio_queue list thereafter to find the
relevant folios, leading to a hang.

Fix this by not clearing the folio_queue entry if we're going to do the
deprecated copy-to-cache.  The clearance will be done instead as the folios
are written to the cache.

This can be reproduced by starting cachefiles, mounting a ceph filesystem
with "-o fsc" and writing to it.

Fixes: 796a4049640b ("netfs: In readahead, put the folio refs as soon extracted")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_collect.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 47ed3a5044e2..e8624f5c7fcc 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -62,10 +62,14 @@ static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 		} else {
 			trace_netfs_folio(folio, netfs_folio_trace_read_done);
 		}
+
+		folioq_clear(folioq, slot);
 	} else {
 		// TODO: Use of PG_private_2 is deprecated.
 		if (test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags))
 			netfs_pgpriv2_mark_copy_to_cache(subreq, rreq, folioq, slot);
+		else
+			folioq_clear(folioq, slot);
 	}
 
 	if (!test_bit(NETFS_RREQ_DONT_UNLOCK_FOLIOS, &rreq->flags)) {
@@ -77,8 +81,6 @@ static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 			folio_unlock(folio);
 		}
 	}
-
-	folioq_clear(folioq, slot);
 }
 
 /*



* [PATCH 10/10] netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (8 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 09/10] netfs: Fix ceph copy to cache on write-begin David Howells
@ 2024-12-13 13:50 ` David Howells
  2024-12-13 14:04 ` ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes] David Howells
  2024-12-16 20:34 ` [PATCH 11/10] netfs: Fix is-caching check in read-retry David Howells
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-13 13:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Howells, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel

When the caching for a cookie is temporarily disabled (e.g. due to a DIO
write on that file), future copying to the cache for that file is disabled
until all fds open on that file are closed.  However, if netfslib is using
the deprecated PG_private_2 method (such as is currently used by ceph), and
decides it wants to copy to the cache, netfs_advance_write() will just bail
at the first check seeing that the cache stream is unavailable, and
indicate that it dealt with all the content.

This means that we have no subrequests to provide notifications to drive
the state machine or even to pin the request and the request just gets
discarded, leaving the folios with PG_private_2 set.

Fix this by jumping directly to cancel the request if the cache is not
available.  That way, we don't remove mark3 from the folio_queue list and
netfs_pgpriv2_cancel() will clean up the folios.

This was found by running the generic/013 xfstest against ceph with an
active cache and the "-o fsc" option passed to ceph.  That would usually
hang.

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_pgpriv2.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index ba5af89d37fa..54d5004fec18 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -170,6 +170,10 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq)

 	trace_netfs_write(wreq, netfs_write_trace_copy_to_cache);
 	netfs_stat(&netfs_n_wh_copy_to_cache);
+	if (!wreq->io_streams[1].avail) {
+		netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
+		goto couldnt_start;
+	}

 	for (;;) {
 		error = netfs_pgpriv2_copy_folio(wreq, folio);


* ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes]
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (9 preceding siblings ...)
  2024-12-13 13:50 ` [PATCH 10/10] netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled David Howells
@ 2024-12-13 14:04 ` David Howells
  2024-12-18 15:10   ` Alex Markuze
  2024-12-16 20:34 ` [PATCH 11/10] netfs: Fix is-caching check in read-retry David Howells
  11 siblings, 1 reply; 19+ messages in thread
From: David Howells @ 2024-12-13 14:04 UTC (permalink / raw)
  To: Christian Brauner, Ilya Dryomov
  Cc: dhowells, Max Kellermann, Xiubo Li, Trond Myklebust, Jeff Layton,
	Matthew Wilcox, netfs, linux-afs, linux-cifs, linux-nfs,
	ceph-devel, v9fs, linux-erofs, linux-fsdevel, linux-mm,
	linux-kernel

David Howells <dhowells@redhat.com> wrote:

> With these patches, I can run xfstest -g quick to completion on ceph with a
> local cache.

I should qualify that.  The thing completes and doesn't hang, but I get 6
failures:

    Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732

Though these don't appear to have anything to do with netfslib (see attached).
There are two cases where the mount is busy and the rest seem to be due to
id-mapped mounts and/or user namespaces.

The xfstest local.config file looks something like:

    export FSTYP=ceph
    export TEST_DEV=<ipaddr>:/test
    export TEST_DIR=/xfstest.test
    TEST_FS_MOUNT_OPTS='-o name=admin,mds_namespace=test,fs=test,fsc'
    export SCRATCH_DEV=<ipaddr>:/scratch
    export SCRATCH_MNT=/xfstest.scratch
    export MOUNT_OPTIONS='-o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch'

David
---
# ./check -E .exclude generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
FSTYP         -- ceph
PLATFORM      -- Linux/x86_64 andromeda 6.13.0-rc2-build3+ #5311 SMP Fri Dec 13 09:03:34 GMT 2024
MKFS_OPTIONS  -- <ipaddr>:/scratch
MOUNT_OPTIONS -- -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch

generic/604 2s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/604.out.bad)
    --- tests/generic/604.out   2024-09-12 12:36:14.187441830 +0100
    +++ /root/xfstests-dev/results//generic/604.out.bad 2024-12-13 13:18:51.910900871 +0000
    @@ -1,2 +1,4 @@
     QA output created by 604
    -Silence is golden
    +mount error 16 = Device or resource busy
    +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch failed
    +(see /root/xfstests-dev/results//generic/604.full for details)
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/604.out /root/xfstests-dev/results//generic/604.out.bad'  to see the entire diff)
generic/633       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/633.out.bad)
    --- tests/generic/633.out   2024-09-12 12:36:14.187441830 +0100
    +++ /root/xfstests-dev/results//generic/633.out.bad 2024-12-13 13:18:55.958979531 +0000
    @@ -1,2 +1,4 @@
     QA output created by 633
     Silence is golden
    +idmapped-mounts.c: 307: tcore_create_in_userns - Input/output error - failure: open file
    +vfstest.c: 2418: run_test - Success - failure: create operations in user namespace
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/633.out /root/xfstests-dev/results//generic/633.out.bad'  to see the entire diff)
generic/645       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/645.out.bad)
    --- tests/generic/645.out   2024-09-12 12:36:14.191441810 +0100
    +++ /root/xfstests-dev/results//generic/645.out.bad 2024-12-13 13:19:25.526908024 +0000
    @@ -1,2 +1,4 @@
     QA output created by 645
     Silence is golden
    +idmapped-mounts.c: 6671: nested_userns - Invalid argument - failure: sys_mount_setattr
    +vfstest.c: 2418: run_test - Invalid argument - failure: test that nested user namespaces behave correctly when attached to idmapped mounts
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/645.out /root/xfstests-dev/results//generic/645.out.bad'  to see the entire diff)
generic/696       - output mismatch (see /root/xfstests-dev/results//generic/696.out.bad)
    --- tests/generic/696.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/696.out.bad 2024-12-13 13:19:30.254804087 +0000
    @@ -1,2 +1,6 @@
     QA output created by 696
    +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
    +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
     Silence is golden
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/696.out /root/xfstests-dev/results//generic/696.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      ac6800e279a2 fs: Add missing umask strip in vfs_tmpfile 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers

generic/697       - output mismatch (see /root/xfstests-dev/results//generic/697.out.bad)
    --- tests/generic/697.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/697.out.bad 2024-12-13 13:19:31.749225548 +0000
    @@ -1,2 +1,4 @@
     QA output created by 697
    +idmapped-mounts.c: 8218: setgid_create_acl_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using acl in directories with setgid bit set on idmapped mount
     Silence is golden
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/697.out /root/xfstests-dev/results//generic/697.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers

generic/732 1s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/732.out.bad)
    --- tests/generic/732.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/732.out.bad 2024-12-13 13:19:34.482858235 +0000
    @@ -1,2 +1,5 @@
     QA output created by 732
     Silence is golden
    +mount error 16 = Device or resource busy
    +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.test/mountpoint2-732 failed
    +(see /root/xfstests-dev/results//generic/732.full for details)
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/732.out /root/xfstests-dev/results//generic/732.out.bad'  to see the entire diff)
Ran: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
Failed 6 of 6 tests



* Re: [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit()
  2024-12-13 13:50 ` [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit() David Howells
@ 2024-12-14 10:16   ` Akira Yokosawa
  2024-12-14 13:44   ` David Howells
  1 sibling, 0 replies; 19+ messages in thread
From: Akira Yokosawa @ 2024-12-14 10:16 UTC (permalink / raw)
  To: David Howells, Christian Brauner
  Cc: Max Kellermann, Ilya Dryomov, Xiubo Li, Trond Myklebust,
	Jeff Layton, Matthew Wilcox, netfs, linux-afs, linux-cifs,
	linux-nfs, ceph-devel, v9fs, linux-erofs, linux-fsdevel, linux-mm,
	linux-kernel, Zilin Guan, Akira Yokosawa

Hi David,

David Howells wrote:
> Use clear_and_wake_up_bit() rather than something like:
> 
> 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
> 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
> 
> as there needs to be a barrier inserted between which is present in
> clear_and_wake_up_bit().

If I am reading the kernel-doc comment of clear_bit_unlock() [1, 2]:

    This operation is atomic and provides release barrier semantics.

correctly, there already seems to be a barrier which should be
good enough.

[1]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.clear_bit_unlock
[2]: include/asm-generic/bitops/instrumented-lock.h

> 
> Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")

So I'm not sure this fixes anything.

What am I missing?

        Thanks, Akira

> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Zilin Guan <zilin@seu.edu.cn>
> cc: Akira Yokosawa <akiyks@gmail.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/netfs/read_collect.c  | 3 +--
>  fs/netfs/write_collect.c | 9 +++------
>  2 files changed, 4 insertions(+), 8 deletions(-)
> 
[...]



* Re: [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit()
  2024-12-13 13:50 ` [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit() David Howells
  2024-12-14 10:16   ` Akira Yokosawa
@ 2024-12-14 13:44   ` David Howells
  2024-12-16 10:11     ` Akira Yokosawa
  1 sibling, 1 reply; 19+ messages in thread
From: David Howells @ 2024-12-14 13:44 UTC (permalink / raw)
  To: Akira Yokosawa, Paul E. McKenney
  Cc: dhowells, Christian Brauner, Max Kellermann, Ilya Dryomov,
	Xiubo Li, Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs,
	linux-afs, linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Zilin Guan

[Adding Paul McKenney as he's the expert.]

Akira Yokosawa <akiyks@gmail.com> wrote:

> David Howells wrote:
> > Use clear_and_wake_up_bit() rather than something like:
> > 
> > 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
> > 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
> > 
> > as there needs to be a barrier inserted between which is present in
> > clear_and_wake_up_bit().
> 
> If I am reading the kernel-doc comment of clear_bit_unlock() [1, 2]:
> 
>     This operation is atomic and provides release barrier semantics.
> 
> correctly, there already seems to be a barrier which should be
> good enough.
> 
> [1]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.clear_bit_unlock
> [2]: include/asm-generic/bitops/instrumented-lock.h
> 
> > 
> > Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
> > Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> 
> So I'm not sure this fixes anything.
> 
> What am I missing?

We may need two barriers.  You have three things to synchronise:

 (1) The stuff you did before unlocking.

 (2) The lock bit.

 (3) The task state.

clear_bit_unlock() interposes a release barrier between (1) and (2).

Neither clear_bit_unlock() nor wake_up_bit(), however, necessarily interposes a
barrier between (2) and (3).  I'm not sure it entirely matters, but it seems
that since we have a function that combines the two, we should probably use
it - though, granted, it might not actually be a fix.
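
To make the (1)/(2)/(3) ordering concrete, here is a minimal userspace sketch
using C11 atomics.  It is only a model: struct model_req, the model_*()
functions and the waiters counter are hypothetical stand-ins, not kernel
interfaces, and the waiter side is an assumption about how a sleeper would
pair with the waker.  The release operation plays the role of
clear_bit_unlock() and the sequentially consistent fence plays the role that
smp_mb__after_atomic() plays inside clear_and_wake_up_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct model_req {
	atomic_ulong flags;	/* models rreq->flags */
	atomic_int   waiters;	/* models "a task is queued on the bit waitqueue" */
};

#define MODEL_IN_PROGRESS 0UL	/* bit number, chosen arbitrarily for the model */

/* Waker side: (1) earlier stores, (2) clear the bit, (3) look for sleepers. */
static bool model_clear_and_wake(struct model_req *req, unsigned long bit)
{
	/* Release ordering: (1) is ordered before (2). */
	atomic_fetch_and_explicit(&req->flags, ~(1UL << bit),
				  memory_order_release);

	/* Full fence: (2) is ordered before the read in (3). */
	atomic_thread_fence(memory_order_seq_cst);

	/* True if a wakeup would have to be issued. */
	return atomic_load_explicit(&req->waiters, memory_order_relaxed) > 0;
}

/* Waiter side (assumed pairing): announce intent to sleep, then re-check. */
static bool model_should_sleep(struct model_req *req, unsigned long bit)
{
	atomic_fetch_add_explicit(&req->waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&req->flags, memory_order_acquire) &
	       (1UL << bit);
}
```

Without the fence in model_clear_and_wake(), the read of waiters could be
satisfied before the cleared bit is visible, so the waker could miss a sleeper
that in turn still sees the bit set.  The release alone does not rule that out.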

David



* Re: [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit()
  2024-12-14 13:44   ` David Howells
@ 2024-12-16 10:11     ` Akira Yokosawa
  0 siblings, 0 replies; 19+ messages in thread
From: Akira Yokosawa @ 2024-12-16 10:11 UTC (permalink / raw)
  To: David Howells, Paul E. McKenney
  Cc: Christian Brauner, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Zilin Guan, Akira Yokosawa

David Howells wrote:
> [Adding Paul McKenney as he's the expert.]
> 
> Akira Yokosawa <akiyks@gmail.com> wrote:
> 
>> David Howells wrote:
>>> Use clear_and_wake_up_bit() rather than something like:
>>>
>>> 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
>>> 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
>>>
>>> as there needs to be a barrier inserted between which is present in
>>> clear_and_wake_up_bit().
>>
>> If I am reading the kernel-doc comment of clear_bit_unlock() [1, 2]:
>>
>>     This operation is atomic and provides release barrier semantics.
>>
>> correctly, there already seems to be a barrier which should be
>> good enough.
>>
>> [1]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.clear_bit_unlock
>> [2]: include/asm-generic/bitops/instrumented-lock.h
>>
>>>
>>> Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
>>> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
>>
>> So I'm not sure this fixes anything.
>>
>> What am I missing?
> 
> We may need two barriers.  You have three things to synchronise:
> 
>  (1) The stuff you did before unlocking.
> 
>  (2) The lock bit.
> 
>  (3) The task state.
> 
> clear_bit_unlock() interposes a release barrier between (1) and (2).
> 
> Neither clear_bit_unlock() nor wake_up_bit(), however, necessarily interposes a
> barrier between (2) and (3).

Got it!

I was confused because I compared only the kernel-doc comments of
clear_bit_unlock() and clear_and_wake_up_bit(), without looking at the
latter's code.

clear_and_wake_up_bit() has this description in its kernel-doc:

 * The designated bit is cleared and any tasks waiting in wait_on_bit()
 * or similar will be woken.  This call has RELEASE semantics so that
 * any changes to memory made before this call are guaranteed to be visible
 * after the corresponding wait_on_bit() completes.

, without any mention of the additional full barrier between your (2) and (3)
above.

It might be worth mentioning it there.

Thoughts?

FWIW,

Reviewed-by: Akira Yokosawa <akiyks@gmail.com>

>                               I'm not sure it entirely matters, but it seems
> that since we have a function that combines the two, we should probably use
> it - though, granted, it might not actually be a fix.

It looks like it should matter on architectures where smp_mb__after_atomic()
is stronger than a plain barrier().

Akira



* Re: [PATCH 06/10] netfs: Remove redundant use of smp_rmb()
  2024-12-13 13:50 ` [PATCH 06/10] netfs: Remove redundant use of smp_rmb() David Howells
@ 2024-12-16 10:13   ` Akira Yokosawa
  0 siblings, 0 replies; 19+ messages in thread
From: Akira Yokosawa @ 2024-12-16 10:13 UTC (permalink / raw)
  To: David Howells, Christian Brauner
  Cc: Max Kellermann, Ilya Dryomov, Xiubo Li, Trond Myklebust,
	Jeff Layton, Matthew Wilcox, netfs, linux-afs, linux-cifs,
	linux-nfs, ceph-devel, v9fs, linux-erofs, linux-fsdevel, linux-mm,
	linux-kernel, Zilin Guan, Akira Yokosawa

David Howells wrote:
> From: Zilin Guan <zilin@seu.edu.cn>
> 
> The function netfs_unbuffered_write_iter_locked() in
> fs/netfs/direct_write.c contains an unnecessary smp_rmb() call after
> wait_on_bit(). Since wait_on_bit() already incorporates a memory barrier
> that ensures the flag update is visible before the function returns, the
> smp_rmb() provides no additional benefit and incurs unnecessary overhead.
> 
> This patch removes the redundant barrier to simplify and optimize the code.
> 
> Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Akira Yokosawa <akiyks@gmail.com>

Reviewed-by: Akira Yokosawa <akiyks@gmail.com>

> cc: Jeff Layton <jlayton@kernel.org>
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> Link: https://lore.kernel.org/r/20241207021952.2978530-1-zilin@seu.edu.cn/
> ---
>  fs/netfs/direct_write.c | 1 -
>  1 file changed, 1 deletion(-)
> 



* [PATCH 11/10] netfs: Fix is-caching check in read-retry
  2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
                   ` (10 preceding siblings ...)
  2024-12-13 14:04 ` ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes] David Howells
@ 2024-12-16 20:34 ` David Howells
  11 siblings, 0 replies; 19+ messages in thread
From: David Howells @ 2024-12-16 20:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: dhowells, Max Kellermann, Ilya Dryomov, Xiubo Li, Trond Myklebust,
	Jeff Layton, Matthew Wilcox, netfs, linux-afs, linux-cifs,
	linux-nfs, ceph-devel, v9fs, linux-erofs, linux-fsdevel, linux-mm,
	linux-kernel

netfs: Fix is-caching check in read-retry

The read-retry code checks the NETFS_RREQ_COPY_TO_CACHE flag to determine
if there might be failed reads from the cache that need turning into reads
from the server, with the intention of skipping the complicated part if it
can.  The code that set the flag, however, got lost during the read-side
rewrite.

Fix the check to see if the cache_resources are valid instead.  The flag
can then be removed.
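
As an illustration only, the difference between the two checks can be modelled
in plain C.  The struct and function names below are hypothetical, trimmed-down
stand-ins for struct netfs_io_request and the condition in
netfs_retry_read_subrequests(); the sketch assumes, as described above, that
nothing sets the flag any more.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the netfs structures. */
struct model_cache_ops { int unused; };

struct model_request {
	bool has_prepare_read;			 /* models netfs_ops->prepare_read != NULL */
	unsigned long flags;			 /* models rreq->flags */
	const struct model_cache_ops *cache_ops; /* models rreq->cache_resources.ops */
};

#define MODEL_COPY_TO_CACHE 1	/* bit number, as in the removed flag */

/* Old test: depends on a flag that is never set after the rewrite. */
static bool old_can_take_simple_path(const struct model_request *rreq)
{
	return !rreq->has_prepare_read &&
	       !(rreq->flags & (1UL << MODEL_COPY_TO_CACHE));
}

/* New test: treat the request as cached when cache ops are attached. */
static bool new_can_take_simple_path(const struct model_request *rreq)
{
	return !rreq->has_prepare_read && rreq->cache_ops == NULL;
}
```

With the flag never set, the old test takes the simple path even for a request
that has a cache attached, whereas the new test does not.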

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_retry.c |    2 +-
 include/linux/netfs.h |    1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/netfs/read_retry.c b/fs/netfs/read_retry.c
index 0e72e9226fc8..21b4a54e545e 100644
--- a/fs/netfs/read_retry.c
+++ b/fs/netfs/read_retry.c
@@ -49,7 +49,7 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq)
 	 * up to the first permanently failed one.
 	 */
 	if (!rreq->netfs_ops->prepare_read &&
-	    !test_bit(NETFS_RREQ_COPY_TO_CACHE, &rreq->flags)) {
+	    !rreq->cache_resources.ops) {
 		struct netfs_io_subrequest *subreq;
 
 		list_for_each_entry(subreq, &rreq->subrequests, rreq_link) {
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 4083d77e3f39..ecdd5ced16a8 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -269,7 +269,6 @@ struct netfs_io_request {
 	size_t			prev_donated;	/* Fallback for subreq->prev_donated */
 	refcount_t		ref;
 	unsigned long		flags;
-#define NETFS_RREQ_COPY_TO_CACHE	1	/* Need to write to the cache */
 #define NETFS_RREQ_NO_UNLOCK_FOLIO	2	/* Don't unlock no_unlock_folio on completion */
 #define NETFS_RREQ_DONT_UNLOCK_FOLIOS	3	/* Don't unlock the folios on completion */
 #define NETFS_RREQ_FAILED		4	/* The request failed */



* Re: ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes]
  2024-12-13 14:04 ` ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes] David Howells
@ 2024-12-18 15:10   ` Alex Markuze
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Markuze @ 2024-12-18 15:10 UTC (permalink / raw)
  To: David Howells
  Cc: Christian Brauner, Ilya Dryomov, Max Kellermann, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel

Hey David.
Thanks for the find. I've seen your mail, but it was a busy week.
If you can, please open a https://tracker.ceph.com/ bug and assign it to me.

On Fri, Dec 13, 2024 at 4:05 PM David Howells <dhowells@redhat.com> wrote:
>
> David Howells <dhowells@redhat.com> wrote:
>
> > With these patches, I can run xfstest -g quick to completion on ceph with a
> > local cache.
>
> I should qualify that.  The thing completes and doesn't hang, but I get 6
> failures:
>
>     Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
>
> Though these don't appear to be anything to do with netfslib (see attached).
> There are two cases where the mount is busy and the rest seems to be due to
> id-mapped mounts and/or user namespaces.
>
> The xfstest local.config file looks something like:
>
>     export FSTYP=ceph
>     export TEST_DEV=<ipaddr>:/test
>     export TEST_DIR=/xfstest.test
>     TEST_FS_MOUNT_OPTS='-o name=admin,mds_namespace=test,fs=test,fsc'
>     export SCRATCH_DEV=<ipaddr>:/scratch
>     export SCRATCH_MNT=/xfstest.scratch
>     export MOUNT_OPTIONS='-o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch'
>
> David
> ---
> # ./check -E .exclude generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
> FSTYP         -- ceph
> PLATFORM      -- Linux/x86_64 andromeda 6.13.0-rc2-build3+ #5311 SMP Fri Dec 13 09:03:34 GMT 2024
> MKFS_OPTIONS  -- <ipaddr>:/scratch
> MOUNT_OPTIONS -- -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch
>
> generic/604 2s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/604.out.bad)
>     --- tests/generic/604.out   2024-09-12 12:36:14.187441830 +0100
>     +++ /root/xfstests-dev/results//generic/604.out.bad 2024-12-13 13:18:51.910900871 +0000
>     @@ -1,2 +1,4 @@
>      QA output created by 604
>     -Silence is golden
>     +mount error 16 = Device or resource busy
>     +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch failed
>     +(see /root/xfstests-dev/results//generic/604.full for details)
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/604.out /root/xfstests-dev/results//generic/604.out.bad'  to see the entire diff)
> generic/633       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/633.out.bad)
>     --- tests/generic/633.out   2024-09-12 12:36:14.187441830 +0100
>     +++ /root/xfstests-dev/results//generic/633.out.bad 2024-12-13 13:18:55.958979531 +0000
>     @@ -1,2 +1,4 @@
>      QA output created by 633
>      Silence is golden
>     +idmapped-mounts.c: 307: tcore_create_in_userns - Input/output error - failure: open file
>     +vfstest.c: 2418: run_test - Success - failure: create operations in user namespace
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/633.out /root/xfstests-dev/results//generic/633.out.bad'  to see the entire diff)
> generic/645       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/645.out.bad)
>     --- tests/generic/645.out   2024-09-12 12:36:14.191441810 +0100
>     +++ /root/xfstests-dev/results//generic/645.out.bad 2024-12-13 13:19:25.526908024 +0000
>     @@ -1,2 +1,4 @@
>      QA output created by 645
>      Silence is golden
>     +idmapped-mounts.c: 6671: nested_userns - Invalid argument - failure: sys_mount_setattr
>     +vfstest.c: 2418: run_test - Invalid argument - failure: test that nested user namespaces behave correctly when attached to idmapped mounts
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/645.out /root/xfstests-dev/results//generic/645.out.bad'  to see the entire diff)
> generic/696       - output mismatch (see /root/xfstests-dev/results//generic/696.out.bad)
>     --- tests/generic/696.out   2024-09-12 12:36:14.195441791 +0100
>     +++ /root/xfstests-dev/results//generic/696.out.bad 2024-12-13 13:19:30.254804087 +0000
>     @@ -1,2 +1,6 @@
>      QA output created by 696
>     +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
>     +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
>     +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
>     +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
>      Silence is golden
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/696.out /root/xfstests-dev/results//generic/696.out.bad'  to see the entire diff)
>
> HINT: You _MAY_ be missing kernel fix:
>       ac6800e279a2 fs: Add missing umask strip in vfs_tmpfile 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers
>
> generic/697       - output mismatch (see /root/xfstests-dev/results//generic/697.out.bad)
>     --- tests/generic/697.out   2024-09-12 12:36:14.195441791 +0100
>     +++ /root/xfstests-dev/results//generic/697.out.bad 2024-12-13 13:19:31.749225548 +0000
>     @@ -1,2 +1,4 @@
>      QA output created by 697
>     +idmapped-mounts.c: 8218: setgid_create_acl_idmapped - Input/output error - failure: create
>     +vfstest.c: 2418: run_test - Success - failure: create operations by using acl in directories with setgid bit set on idmapped mount
>      Silence is golden
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/697.out /root/xfstests-dev/results//generic/697.out.bad'  to see the entire diff)
>
> HINT: You _MAY_ be missing kernel fix:
>       1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers
>
> generic/732 1s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/732.out.bad)
>     --- tests/generic/732.out   2024-09-12 12:36:14.195441791 +0100
>     +++ /root/xfstests-dev/results//generic/732.out.bad 2024-12-13 13:19:34.482858235 +0000
>     @@ -1,2 +1,5 @@
>      QA output created by 732
>      Silence is golden
>     +mount error 16 = Device or resource busy
>     +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.test/mountpoint2-732 failed
>     +(see /root/xfstests-dev/results//generic/732.full for details)
>     ...
>     (Run 'diff -u /root/xfstests-dev/tests/generic/732.out /root/xfstests-dev/results//generic/732.out.bad'  to see the entire diff)
> Ran: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
> Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
> Failed 6 of 6 tests
>
>



* Re: [PATCH 01/10] kheaders: Ignore silly-rename files
  2024-12-13 13:50 ` [PATCH 01/10] kheaders: Ignore silly-rename files David Howells
@ 2024-12-21  5:15   ` Masahiro Yamada
  0 siblings, 0 replies; 19+ messages in thread
From: Masahiro Yamada @ 2024-12-21  5:15 UTC (permalink / raw)
  To: David Howells
  Cc: Christian Brauner, Max Kellermann, Ilya Dryomov, Xiubo Li,
	Trond Myklebust, Jeff Layton, Matthew Wilcox, netfs, linux-afs,
	linux-cifs, linux-nfs, ceph-devel, v9fs, linux-erofs,
	linux-fsdevel, linux-mm, linux-kernel, Marc Dionne

On Fri, Dec 13, 2024 at 10:50 PM David Howells <dhowells@redhat.com> wrote:
>
> Tell tar to ignore silly-rename files (".__afs*" and ".nfs*") when building
> the header archive.  These occur when a file that is open is unlinked
> locally, but hasn't yet been closed.  Such files are visible to the user
> via the getdents() syscall and so programs may want to do things with them.
>
> During the kernel build, such files may be made during the processing of
> header files and the cleanup may get deferred by fput() which may result in
> tar seeing these files when it reads the directory, but they may have
> disappeared by the time it tries to open them, causing tar to fail with an
> error.  Further, we don't want to include them in the tarball if they still
> exist.
>
> With CONFIG_HEADERS_INSTALL=y, something like the following may be seen:

I am confused.

kernel/gen_kheaders.sh is executed when CONFIG_IKHEADERS is enabled.

How is CONFIG_HEADERS_INSTALL related?



>    find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory
>    tar: ./include/linux/greybus/.__afs3C95: File removed before we read it
>
> The find warning doesn't seem to cause a problem.

I picked the following commit.

https://lore.kernel.org/all/20241218202021.17276-1-elsk@google.com/

This addresses the root cause of the 'find' errors.
Does it fix your problems too?


Your patch does not address the 'find' errors.






>
> Fix this by telling tar, when called from gen_kheaders.sh, to exclude such
> files.  This only affects afs and nfs; cifs uses the Windows Hidden
> attribute to prevent the file from being seen.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Masahiro Yamada <masahiroy@kernel.org>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: linux-afs@lists.infradead.org
> cc: linux-nfs@vger.kernel.org
> cc: linux-kernel@vger.kernel.org
> ---
>  kernel/gen_kheaders.sh | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/gen_kheaders.sh b/kernel/gen_kheaders.sh
> index 383fd43ac612..7e1340da5aca 100755
> --- a/kernel/gen_kheaders.sh
> +++ b/kernel/gen_kheaders.sh
> @@ -89,6 +89,7 @@ find $cpio_dir -type f -print0 |
>
>  # Create archive and try to normalize metadata for reproducibility.
>  tar "${KBUILD_BUILD_TIMESTAMP:+--mtime=$KBUILD_BUILD_TIMESTAMP}" \
> +    --exclude=".__afs*" --exclude=".nfs*" \
>      --owner=0 --group=0 --sort=name --numeric-owner --mode=u=rw,go=r,a+X \
>      -I $XZ -cf $tarfile -C $cpio_dir/ . > /dev/null
>
>


-- 
Best Regards
Masahiro Yamada


end of thread, other threads:[~2024-12-21  5:15 UTC | newest]

Thread overview: 19+ messages
2024-12-13 13:50 [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes David Howells
2024-12-13 13:50 ` [PATCH 01/10] kheaders: Ignore silly-rename files David Howells
2024-12-21  5:15   ` Masahiro Yamada
2024-12-13 13:50 ` [PATCH 02/10] netfs: Fix non-contiguous donation between completed reads David Howells
2024-12-13 13:50 ` [PATCH 03/10] netfs: Fix enomem handling in buffered reads David Howells
2024-12-13 13:50 ` [PATCH 04/10] nfs: Fix oops in nfs_netfs_init_request() when copying to cache David Howells
2024-12-13 13:50 ` [PATCH 05/10] cachefiles: Parse the "secctx" immediately David Howells
2024-12-13 13:50 ` [PATCH 06/10] netfs: Remove redundant use of smp_rmb() David Howells
2024-12-16 10:13   ` Akira Yokosawa
2024-12-13 13:50 ` [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit() David Howells
2024-12-14 10:16   ` Akira Yokosawa
2024-12-14 13:44   ` David Howells
2024-12-16 10:11     ` Akira Yokosawa
2024-12-13 13:50 ` [PATCH 08/10] netfs: Work around recursion by abandoning retry if nothing read David Howells
2024-12-13 13:50 ` [PATCH 09/10] netfs: Fix ceph copy to cache on write-begin David Howells
2024-12-13 13:50 ` [PATCH 10/10] netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled David Howells
2024-12-13 14:04 ` ceph xfstests failures [was Re: [PATCH 00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes] David Howells
2024-12-18 15:10   ` Alex Markuze
2024-12-16 20:34 ` [PATCH 11/10] netfs: Fix is-caching check in read-retry David Howells
