All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: asmadeus@codewreck.org
Cc: David Howells <dhowells@redhat.com>,
	David Kahurani <k.kahurani@gmail.com>,
	davem@davemloft.net, ericvh@gmail.com, kuba@kernel.org,
	linux-kernel@vger.kernel.org, lucho@ionkov.net,
	netdev@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
	Greg Kurz <groug@kaod.org>
Subject: Re: 9p EBADF with cache enabled (Was: 9p fs-cache tests/benchmark (was: 9p fscache Duplicate cookie detected))
Date: Fri, 03 Jun 2022 18:46:04 +0200	[thread overview]
Message-ID: <3645230.Tf70N6zClz@silver> (raw)
In-Reply-To: <7091002.4ErQJAuLzZ@silver>

On Freitag, 6. Mai 2022 21:14:52 CEST Christian Schoenebeck wrote:
> On Mittwoch, 4. Mai 2022 23:48:47 CEST asmadeus@codewreck.org wrote:
> > Christian Schoenebeck wrote on Wed, May 04, 2022 at 08:33:36PM +0200:
> > > On Dienstag, 3. Mai 2022 12:21:23 CEST asmadeus@codewreck.org wrote:
> > > >  - add some complex code to track the exact byte range that got
> > > >  updated
> > > > 
> > > > in some conditions e.g. WRONLY or read fails?
> > > > That'd still be useful depending on how the backend tracks file mode,
> > > > qemu as user with security_model=mapped-file keeps files 600 but with
> > > > passthrough or none qemu wouldn't be able to read the file regardless
> > > > of
> > > > what we do on client...
> > > > Christian, if you still have an old kernel around did that use to
> > > > work?
> > > 
> > > Sorry, what was the question, i.e. what should I test / look for
> > > precisely? :)
> > 
> > I was curious if older kernel does not issue read at all, or issues read
> > on writeback fid correctly opened as root/RDRW
> > 
> > You can try either the append.c I pasted a few mails back or the dd
> > commands, as regular user.
> > 
> > $ dd if=/dev/zero of=test bs=1M count=1
> > $ chmod 400 test
> > # drop cache or remount
> > $ dd if=/dev/urandom of=test bs=102 seek=2 count=1 conv=notrunc
> > dd: error writing 'test': Bad file descriptor
> 
> Seems you were right, the old kernel opens the file with O_RDWR.
> 
> The following was taken with cache=loose, pre-netfs kernel version, using
> your append code and file to be appended already containing 34 bytes,
> relevant file is fid 7:
> 
>   v9fs_open tag 0 id 12 fid 7 mode 2
>   v9fs_open_return tag 0 id 12 qid={type 0 version 1651854932 path 3108899}
> iounit 4096 v9fs_xattrwalk tag 0 id 30 fid 5 newfid 8 name
> security.capability v9fs_rerror tag 0 id 30 err 95
>   v9fs_read tag 0 id 116 fid 7 off 0 max_count 4096
>   v9fs_read_return tag 0 id 116 count 34 err 45
>   v9fs_read tag 0 id 116 fid 7 off 34 max_count 4062
>   v9fs_read_return tag 0 id 116 count 0 err 11
>   v9fs_clunk tag 0 id 120 fid 6
>   v9fs_clunk tag 0 id 120 fid 4
>   [delay]
>   v9fs_write tag 0 id 118 fid 7 off 0 count 39 cnt 1
>   v9fs_write_return tag 0 id 118 total 39 err 11
>   v9fs_fsync tag 0 id 50 fid 7 datasync 0
> 
> BTW to see this protocol debug output with QEMU:
> 
>   cd qemu/build
>   ../configure --enable-trace-backends=log ...
>   make -jN
>   ./qemu-system-x86_64 -trace 'v9fs*' ...

I had another time slice on this issue today. As Dominique pointed out before,
the writeback_fid was and still is opened with O_RDWR [fs/9p/fid.c]:

struct p9_fid *v9fs_writeback_fid(struct dentry *dentry)
{
	int err;
	struct p9_fid *fid, *ofid;

	ofid = v9fs_fid_lookup_with_uid(dentry, GLOBAL_ROOT_UID, 0);
	fid = clone_fid(ofid);
	if (IS_ERR(fid))
		goto error_out;
	p9_client_clunk(ofid);
	/*
	 * writeback fid will only be used to write back the
	 * dirty pages. We always request for the open fid in read-write
	 * mode so that a partial page write which result in page
	 * read can work.
	 */
	err = p9_client_open(fid, O_RDWR);
	if (err < 0) {
		p9_client_clunk(fid);
		fid = ERR_PTR(err);
		goto error_out;
	}
error_out:
	return fid;
}

The problem rather seems to be that the new netfs code does not use the
writeback_fid when doing an implied read before the actual partial writeback.

As I showed in my previous email, the old pre-netfs kernel versions also did a
read before partial writebacks, but apparently used the special writeback_fid
for that.

I added some trap code to recent netfs kernel version:

diff --git a/net/9p/client.c b/net/9p/client.c
index 8bba0d9cf975..11ff1ee2130e 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1549,12 +1549,21 @@ int p9_client_unlinkat(struct p9_fid *dfid, const char *name, int flags)
 }
 EXPORT_SYMBOL(p9_client_unlinkat);
 
+void p9_bug(void) {
+    BUG_ON(true);
+}
+EXPORT_SYMBOL(p9_bug);
+
 int
 p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 {
        int total = 0;
        *err = 0;
 
+    if ((fid->mode & O_ACCMODE) == O_WRONLY) {
+        p9_bug();
+    }
+
        while (iov_iter_count(to)) {
                int count;
 
@@ -1648,6 +1657,10 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
        p9_debug(P9_DEBUG_9P, ">>> TWRITE fid %d offset %llu count %zd\n",
                 fid->fid, offset, iov_iter_count(from));
 
+    if ((fid->mode & O_ACCMODE) == O_RDONLY) {
+        p9_bug();
+    }
+
        while (iov_iter_count(from)) {
                int count = iov_iter_count(from);
                int rsize = fid->iounit;

Which triggers the trap in p9_client_read() with cache=loose. Here is the
backtrace [based on d615b5416f8a1afeb82d13b238f8152c572d59c0]:

[  139.365314] p9_client_read (net/9p/client.c:1553 net/9p/client.c:1564) 9pnet
[  139.148806] v9fs_issue_read (fs/9p/vfs_addr.c:45) 9p
[  139.149268] netfs_begin_read (fs/netfs/io.c:91 fs/netfs/io.c:579 fs/netfs/io.c:625) netfs
[  139.149725] ? xas_load (lib/xarray.c:211 lib/xarray.c:242) 
[  139.150057] ? xa_load (lib/xarray.c:1469) 
[  139.150398] netfs_write_begin (fs/netfs/buffered_read.c:407) netfs
[  139.150883] v9fs_write_begin (fs/9p/vfs_addr.c:279 (discriminator 2)) 9p
[  139.151293] generic_perform_write (mm/filemap.c:3789) 
[  139.151721] ? generic_update_time (fs/inode.c:1858) 
[  139.152112] ? file_update_time (fs/inode.c:2089) 
[  139.152504] __generic_file_write_iter (mm/filemap.c:3916) 
[  139.152943] generic_file_write_iter (./include/linux/fs.h:753 mm/filemap.c:3948) 
[  139.153348] new_sync_write (fs/read_write.c:505 (discriminator 1)) 
[  139.153754] vfs_write (fs/read_write.c:591) 
[  139.154090] ksys_write (fs/read_write.c:644) 
[  139.154417] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[  139.154776] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)

I still had not time to read the netfs code part yet, but I assume netfs falls
back to a generic 9p read on the O_WRONLY opened fid here, instead of using
the special O_RDWR opened 'writeback_fid'.

Is there already some info available in the netfs API that the read is
actually part of a writeback task, so that we could force on 9p driver level
to use the special writeback_fid for the read in this case instead?

Best regards,
Christian Schoenebeck





  reply	other threads:[~2022-06-03 16:46 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAAZOf26g-L2nSV-Siw6mwWQv1nv6on8c0fWqB4bKmX73QAFzow@mail.gmail.com>
2022-03-26 11:46 ` [syzbot] WARNING in p9_client_destroy David Kahurani
2022-03-26 11:48 ` Christian Schoenebeck
2022-03-26 12:24   ` asmadeus
2022-03-26 12:36     ` Christian Schoenebeck
2022-03-26 13:35       ` 9p fscache Duplicate cookie detected (Was: [syzbot] WARNING in p9_client_destroy) asmadeus
2022-03-30 12:21         ` 9p fs-cache tests/benchmark (was: 9p fscache Duplicate cookie detected) Christian Schoenebeck
2022-03-30 21:47           ` asmadeus
2022-04-01 14:19             ` Christian Schoenebeck
2022-04-01 23:11               ` asmadeus
2022-04-02 12:43                 ` Christian Schoenebeck
2022-04-11  8:10               ` David Howells
2022-04-11  7:59             ` David Howells
2022-04-09 11:16           ` Christian Schoenebeck
2022-04-10 16:18             ` Christian Schoenebeck
2022-04-10 22:54               ` asmadeus
2022-04-11 13:41                 ` Christian Schoenebeck
2022-04-12 22:38                   ` asmadeus
2022-04-14 12:44                     ` Christian Schoenebeck
2022-04-17 12:56                       ` asmadeus
2022-04-17 13:52                         ` Christian Schoenebeck
2022-04-17 21:22                           ` asmadeus
2022-04-17 22:17                             ` 9p EBADF with cache enabled (Was: 9p fs-cache tests/benchmark (was: 9p fscache Duplicate cookie detected)) asmadeus
2022-04-21 10:36                               ` David Howells
2022-04-21 11:36                                 ` Christian Schoenebeck
2022-04-22 13:13                                   ` asmadeus
2022-04-25 14:10                                     ` David Howells
2022-04-26 15:38                                       ` Christian Schoenebeck
2022-05-03 10:21                                         ` asmadeus
2022-05-04 18:33                                           ` Christian Schoenebeck
2022-05-04 21:48                                             ` asmadeus
2022-05-06 19:14                                               ` Christian Schoenebeck
2022-06-03 16:46                                                 ` Christian Schoenebeck [this message]
2022-06-12 10:02                                                   ` asmadeus
2022-06-14  3:38                                                     ` [PATCH] 9p: fix EBADF errors in cached mode Dominique Martinet
2022-06-14  3:41                                                       ` Dominique Martinet
2022-06-14 12:10                                                         ` Christian Schoenebeck
2022-06-14 12:45                                                           ` Dominique Martinet
2022-06-14 14:11                                                             ` Christian Schoenebeck
2022-06-16 13:35                                                               ` Christian Schoenebeck
2022-06-16 13:51                                                                 ` Dominique Martinet
2022-06-16 14:11                                                                   ` Dominique Martinet
2022-06-16 20:14                                                                     ` Christian Schoenebeck
2022-06-16 20:53                                                                       ` Dominique Martinet
2022-06-16 21:10                                                                       ` [PATCH v3] " Dominique Martinet
2022-06-20 12:47                                                                         ` Christian Schoenebeck
2022-06-20 20:34                                                                           ` Dominique Martinet
2022-06-21 12:13                                                                             ` Christian Schoenebeck
2022-06-16 13:52                                                                 ` [PATCH v2] " Dominique Martinet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3645230.Tf70N6zClz@silver \
    --to=linux_oss@crudebyte.com \
    --cc=asmadeus@codewreck.org \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=ericvh@gmail.com \
    --cc=groug@kaod.org \
    --cc=k.kahurani@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lucho@ionkov.net \
    --cc=netdev@vger.kernel.org \
    --cc=v9fs-developer@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.