All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ihor Solodrai" <ihor.solodrai@linux.dev>
To: "David Howells" <dhowells@redhat.com>
Cc: dhowells@redhat.com, "Marc Dionne" <marc.dionne@auristor.com>,
	"Steve  French" <stfrench@microsoft.com>,
	"Eric Van Hensbergen" <ericvh@kernel.org>,
	"Latchesar  Ionkov" <lucho@ionkov.net>,
	"Dominique Martinet" <asmadeus@codewreck.org>,
	"Christian Schoenebeck" <linux_oss@crudebyte.com>,
	"Paulo Alcantara" <pc@manguebit.com>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	v9fs@lists.linux.dev, linux-cifs@vger.kernel.org,
	netfs@lists.linux.dev, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, ast@kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH] netfs: Add retry stat counters
Date: Tue, 11 Feb 2025 00:54:33 +0000	[thread overview]
Message-ID: <84a8e6737fca05dd3ec234760f1c77901d915ef9@linux.dev> (raw)
In-Reply-To: <3210864.1739229537@warthog.procyon.org.uk>

On 2/10/25 3:18 PM, David Howells wrote:
> Ihor Solodrai <ihor.solodrai@linux.dev> wrote:
>
>> Done. I pushed the logs to the previously mentioned github branch:
>> https://github.com/kernel-patches/bpf/commit/699a3bb95e2291d877737438fb641628702fd18f
>
> Perfect, thanks.
>
> Looking at the last record of /proc/fs/netfs/requests, I see:
>
> 	REQUEST  OR REF FL ERR  OPS COVERAGE
> 	======== == === == ==== === =========
> 	00000a98 RA   1 2001    0   0 @0000 2000/2000
>
> So the request of interest is R=00000a98 in the trace.  Grepping for that, I
> see (with a few columns cut out):
>
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 NEW         r=1
>  test_progs-no_a-97: netfs_read: R=00000a98 READAHEAD c=00000000 ni=2ec02f16 s=0 l=2000 sz=17a8
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET SUBREQ  r=2
>  test_progs-no_a-97: netfs_sreq: R=00000a98[1] DOWN TERM  f=192 s=0 17a8/17a8 s=1 e=0
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET WORK    r=3
>  test_progs-no_a-97: netfs_sreq_ref: R=00000a98[1] PUT TERM    r=1
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET SUBREQ  r=4
>  test_progs-no_a-97: netfs_sreq: R=00000a98[2] ZERO SUBMT f=00 s=17a8 0/858 s=0 e=0
>     kworker/u8:2-36: netfs_rreq_ref: R=00000a98 SEE WORK    r=4
>     kworker/u8:2-36: netfs_rreq: R=00000a98 RA COLLECT f=2021
>     kworker/u8:2-36: netfs_sreq: R=00000a98[1] DOWN DSCRD f=92 s=0 17a8/17a8 s=1 e=0
>     kworker/u8:2-36: netfs_sreq_ref: R=00000a98[1] PUT DONE    r=0
>     kworker/u8:2-36: netfs_sreq: R=00000a98[1] DOWN FREE  f=92 s=0 17a8/17a8 s=1 e=0
>     kworker/u8:2-36: netfs_rreq_ref: R=00000a98 PUT SUBREQ  r=3
>     kworker/u8:2-36: netfs_rreq: R=00000a98 RA COMPLET f=2021
>     kworker/u8:2-36: netfs_rreq: R=00000a98 RA WAKE-IP f=2021
>     kworker/u8:2-36: netfs_rreq: R=00000a98 RA DONE    f=2001
>     kworker/u8:2-36: netfs_rreq_ref: R=00000a98 PUT WORK    r=2
>  test_progs-no_a-97: netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET WORK    r=3
>  test_progs-no_a-97: netfs_sreq_ref: R=00000a98[2] PUT TERM    r=1
>  test_progs-no_a-97: netfs_rreq_ref: R=00000a98 PUT RETURN  r=2
>     kworker/u8:2-36: netfs_rreq_ref: R=00000a98 SEE WORK    r=2
>     kworker/u8:2-36: netfs_rreq_ref: R=00000a98 PUT WORK    r=1
>
> You can see subrequest 1 completes fine, the subrequest is freed and the ref
> it had on the request is put:
>
> 	netfs_sreq: R=00000a98[1] DOWN FREE  f=92 s=0 17a8/17a8 s=1 e=0
> 	netfs_rreq_ref: R=00000a98 PUT SUBREQ  r=3
>
> Subrequest 2, however isn't collected:
>
> 	netfs_sreq: R=00000a98[2] ZERO SUBMT f=00 s=17a8 0/858 s=0 e=0
> 	netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0
> 	netfs_sreq_ref: R=00000a98[2] PUT TERM    r=1
>
> and the work happens again in kworker/u8:2-36 right at the end:
>
> 	netfs_rreq_ref: R=00000a98 SEE WORK    r=2
> 	netfs_rreq_ref: R=00000a98 PUT WORK    r=1
>
> but this doesn't do anything.
>
> The excess buffer clearance happened in the app thread (test_progs-no_a-97):
>
> 	netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0
>
>> Let me know if I can help with anything else.
>
> Can you add some more tracepoints?
>
> echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect/enable
> echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_sreq/enable
> echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_state/enable

See here: https://github.com/kernel-patches/bpf/commit/517f51d1f6c09ebab9df3e3d17bb669601ab14ef

Beware, uncompressed trace-cmd.log is 37Mb

>
> However, I think I may have spotted the issue: I'm mixing
> clear_and_wake_up_bit() for NETFS_RREQ_IN_PROGRESS (which will use a common
> system waitqueue) with waiting on an rreq-specific waitqueue in such as
> netfs_wait_for_read().
>
> I'll work up a fix patch for that tomorrow.

Great! Thank you.

>
> Thanks,
> David
>

  reply	other threads:[~2025-02-11  0:54 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-28  0:33 [PATCH] netfs: Fix a number of read-retry hangs David Howells
2025-01-28  9:33 ` [PATCH] netfs: Add retry stat counters David Howells
2025-01-28 19:11   ` Ihor Solodrai
2025-02-03 18:01     ` Ihor Solodrai
2025-02-10 10:57     ` David Howells
2025-02-10 11:12     ` David Howells
2025-02-10 21:54       ` Ihor Solodrai
2025-02-10 23:18         ` David Howells
2025-02-11  0:54           ` Ihor Solodrai [this message]
2025-02-12  9:47             ` [PATCH] netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued David Howells
2025-02-12 18:01               ` Ihor Solodrai
2025-01-28 16:51 ` [PATCH] netfs: Fix a number of read-retry hangs Marc Dionne
2025-02-11  1:07 ` Steve French

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=84a8e6737fca05dd3ec234760f1c77901d915ef9@linux.dev \
    --to=ihor.solodrai@linux.dev \
    --cc=asmadeus@codewreck.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=dhowells@redhat.com \
    --cc=ericvh@kernel.org \
    --cc=jlayton@kernel.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=lucho@ionkov.net \
    --cc=marc.dionne@auristor.com \
    --cc=netfs@lists.linux.dev \
    --cc=pc@manguebit.com \
    --cc=stfrench@microsoft.com \
    --cc=v9fs@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.