From: Tom Talpey <tom@talpey.com>
To: David Howells <dhowells@redhat.com>, Steve French <sfrench@samba.org>
Cc: Paulo Alcantara <pc@manguebit.com>,
Shyam Prasad N <nspmangalore@gmail.com>,
Rohith Surabattula <rohiths.msft@gmail.com>,
Jeff Layton <jlayton@kernel.org>,
linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Aurelien Aptel <aaptel@suse.com>,
netfs@lists.linux.dev
Subject: Re: [PATCH 1/4] cifs: Fix server re-repick on subrequest retry
Date: Fri, 19 Jul 2024 10:27:39 -0400 [thread overview]
Message-ID: <129ce5e2-2dc3-4185-9057-4c88bc02c103@talpey.com> (raw)
In-Reply-To: <20240719140907.1598372-2-dhowells@redhat.com>
On 7/19/2024 10:09 AM, David Howells wrote:
> When a subrequest is marked for needing retry, netfs will call
> cifs_prepare_write() which will make cifs repick the server for the op
> before renegotiating credits; it then calls cifs_issue_write() which
> invokes smb2_async_writev() - which re-repicks the server.
>
> If a different server is then selected, this causes the increment of
> server->in_flight to happen against one record and the decrement to happen
> against another, leading to misaccounting.
>
> Fix this by just removing the repick code in smb2_async_writev(). As this
> is only called from netfslib-driven code, cifs_prepare_write() should
> always have been called first, and so server should never be NULL and the
> preparatory step is repeated in the event that we do a retry.
>
> The problem manifests as a warning looking something like:
>
> WARNING: CPU: 4 PID: 72896 at fs/smb/client/smb2ops.c:97 smb2_add_credits+0x3f0/0x9e0 [cifs]
> ...
> RIP: 0010:smb2_add_credits+0x3f0/0x9e0 [cifs]
> ...
> smb2_writev_callback+0x334/0x560 [cifs]
> cifs_demultiplex_thread+0x77a/0x11b0 [cifs]
> kthread+0x187/0x1d0
> ret_from_fork+0x34/0x60
> ret_from_fork_asm+0x1a/0x30
>
> Which may be triggered by a number of different xfstests running against an
> Azure server in multichannel mode. generic/249 seems the most repeatable,
> but generic/215, generic/249 and generic/308 may also show it.
Nice fix, and good explanation. So, is this the negative-credits issue
we've been looking to fix? Or just one instance?
I'm very curious why it manifested when testing with Azure Files. Do
connection errors disappear with the fix, or do they still occur but
are now recoverable?
Feel free to add...
Acked-by: Tom Talpey <tom@talpey.com>
Tom.
> Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib")
> Reported-by: Steve French <sfrench@samba.org>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: Aurelien Aptel <aaptel@suse.com>
> cc: linux-cifs@vger.kernel.org
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
> fs/smb/client/smb2pdu.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
> index 2ae2dbb6202b..bb84a89e5905 100644
> --- a/fs/smb/client/smb2pdu.c
> +++ b/fs/smb/client/smb2pdu.c
> @@ -4859,9 +4859,6 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)
> struct cifs_io_parms *io_parms = NULL;
> int credit_request;
>
> - if (!wdata->server || test_bit(NETFS_SREQ_RETRYING, &wdata->subreq.flags))
> - server = wdata->server = cifs_pick_channel(tcon->ses);
> -
> /*
> * in future we may get cifs_io_parms passed in from the caller,
> * but for now we construct it here...
>
>
>
next prev parent reply other threads:[~2024-07-19 14:27 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-19 14:09 [PATCH 0/4] cifs: Miscellaneous fixes and a trace point David Howells
2024-07-19 14:09 ` [PATCH 1/4] cifs: Fix server re-repick on subrequest retry David Howells
2024-07-19 14:27 ` Tom Talpey [this message]
2024-07-19 15:45 ` Steve French
2024-07-19 14:09 ` [PATCH 2/4] cifs: Fix missing error code set David Howells
2024-07-19 14:09 ` [PATCH 3/4] cifs: Fix setting of zero_point after DIO write David Howells
2024-07-19 14:09 ` [PATCH 4/4] cifs: Add a tracepoint to track credits involved in R/W requests David Howells
2024-07-19 14:26 ` [PATCH 0/4] cifs: Miscellaneous fixes and a trace point Steve French
2024-07-19 15:25 ` [PATCH 5/4] cifs: Fix missing fscache invalidation David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=129ce5e2-2dc3-4185-9057-4c88bc02c103@talpey.com \
--to=tom@talpey.com \
--cc=aaptel@suse.com \
--cc=dhowells@redhat.com \
--cc=jlayton@kernel.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=netfs@lists.linux.dev \
--cc=nspmangalore@gmail.com \
--cc=pc@manguebit.com \
--cc=rohiths.msft@gmail.com \
--cc=sfrench@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox