Re: fs/9p: regression in 6.8-rc1

From: Dominique Martinet <asmadeus@codewreck.org>
To: dhowells@redhat.com, Eric Van Hensbergen <ericvh@kernel.org>
Cc: v9fs@lists.linux.dev, linux_oss@crudebyte.com
Subject: Re: fs/9p: regression in 6.8-rc1
Date: Sun, 28 Jan 2024 22:06:12 +0900
Message-ID: <ZbZRRBP453U8PZ8a@codewreck.org>
In-Reply-To: <ZbRiD2a6F9cDm_9n@codewreck.org>

Dominique Martinet wrote on Sat, Jan 27, 2024 at 10:53:19AM +0900:
> Eric Van Hensbergen wrote on Fri, Jan 26, 2024 at 02:21:39PM -0600:
> > I caught a problem in the new netfs code when running in 9p when running
> > with nocache mode.  A regression sweep is turning up a:
> >  [ 1084.438387] netfs: Zero-sized write [R=1b6da]
> > when running my ldconfig test (included at the end of this)
> > it reports:
> > /sbin/ldconfig.real: Writing of cache extension data failed: Input/output error
> > 
> > I will try to dig into this later today if I have time, but not sure I'll get
> > to it so I wanted to make other folks aware.  I'm not sure how much other
> > elements of my test harness are contributing to reproducing the problem.

I didn't get much time for this, but I can confirm that I can reproduce
it; it boils down to a 0-size write:

$ xfs_io -f -c 'pwrite 0 0' foo
(dmesg) netfs: Zero-sized write [R=fb5]
pwrite: Input/output error
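
For anyone without xfs_io at hand, the same zero-length pwrite(2) can
probably be issued directly. This is only a minimal sketch in Python, not
part of the original report: the helper name zero_size_pwrite is made up
for illustration, and the open flags are my assumption of roughly what
'xfs_io -f' does. On an unaffected filesystem it should simply return 0;
on the affected 9p mount I would expect it to raise EIO like xfs_io does.

```python
import os
import sys


def zero_size_pwrite(path):
    """Issue a zero-length pwrite(2) at offset 0 and return the byte count.

    Hypothetical helper for illustration only. Raises OSError if the
    write fails (EIO would be the failure reported in this thread).
    """
    # Assumption: create the file if missing and open it read/write,
    # roughly what `xfs_io -f` does before running `pwrite 0 0`.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Empty buffer at offset 0, i.e. a 0-size write.
        return os.pwrite(fd, b"", 0)
    finally:
        os.close(fd)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "foo"
    try:
        print("pwrite returned", zero_size_pwrite(target))
    except OSError as err:
        print("pwrite failed:", err.strerror)
        sys.exit(1)
```

Run it with the path of a file on the 9p mount as the only argument and
check dmesg afterwards for the "Zero-sized write" line.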



I was going to say we probably need to filter it out - but it looks like
that might be netfs' job given the call trace I get:

# retsnoop -T -e vfs_write -a :fs/9p/*.c -a :fs/netfs/*.c
FUNCTION CALL TRACE                 RESULT                  DURATION
---------------------------------   --------------------  ----------
→ vfs_write                                                         
    → netfs_unbuffered_write_iter                                   
        ↔ netfs_start_io_direct     [0]                      0.391us
        → netfs_alloc_request                                       
            ↔ v9fs_init_request     [0]                      0.431us
        ← netfs_alloc_request       [0xffff8b8a5584d600]     1.653us
        ↔ netfs_extract_user_iter   [0]                      0.671us
        → netfs_begin_write                                         
            ↔ v9fs_free_inode       [void]                  33.653us
            ↔ v9fs_free_inode       [void]                   0.511us
            ↔ v9fs_free_inode       [void]                   0.371us
            ↔ v9fs_free_inode       [void]                   0.350us
            ↔ v9fs_free_inode       [void]                   0.391us
            ↔ v9fs_free_inode       [void]                   0.361us
            ↔ v9fs_free_inode       [void]                   0.451us
            ↔ v9fs_free_inode       [void]                   0.391us
            ↔ v9fs_free_inode       [void]                   0.391us
            ↔ v9fs_free_inode       [void]                   0.451us
        ← netfs_begin_write         [-EIO]                1120.811us
        → netfs_free_request                                        
            ↔ v9fs_free_request     [void]                  28.062us
        ← netfs_free_request        [void]                  44.423us
        ↔ netfs_end_io_direct       [void]                   0.421us
    ← netfs_unbuffered_write_iter   [-EIO]                1207.784us
← vfs_write                         [-EIO]                1210.228us


David, where do you think we should catch that?
Can we leave that fix to you?


> The syzbot report (refcount underflow[1]) is also probably related; I'll
> try to find some time to check a bit more this weekend
> 
> [1] https://lkml.kernel.org/r/000000000000ee5c6c060fd59890@google.com

So that one's not directly related to this, but given the timing I'd
still bet something changed around cache... I didn't manage to
reproduce it with a very quick workload, but I haven't run all that much
yet; I will need to spend a bit more time on that another day...

-- 
Dominique Martinet | Asmadeus

