* Re: [PATCH] netfs: Add retry stat counters
From: Ihor Solodrai @ 2025-01-28 19:11 UTC
To: David Howells, Marc Dionne, Steve French
Cc: dhowells, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
    Christian Schoenebeck, Paulo Alcantara, Jeff Layton, Christian Brauner,
    v9fs, linux-cifs, netfs, linux-fsdevel, linux-kernel, ast, bpf

January 28, 2025 at 1:33 AM, "David Howells" <dhowells@redhat.com> wrote:

> Here's an additional patch to allow stats on the number of retries to be
> obtained. This isn't a fix per se.
>
> David
>
> [...]

Hi David, Marc.

I regret to report that this patch didn't fix the hanging in
v9fs_evict_inode when running selftests/bpf.

Here is what I tried:

* Checked out the latest bpf-next source tree (0fc5dddb9409)
* Applied the patch:
  https://lore.kernel.org/netfs/3173328.1738024385@warthog.procyon.org.uk/
* Applied the retry stats patch:
  https://lore.kernel.org/netfs/3187377.1738056789@warthog.procyon.org.uk/
* Modified tools/testing/selftests/bpf/config to enable /proc/fs/netfs/stats
* Modified the CI scripts to collect the stats
* Ran the shell script reproducing the CI testing pipeline

The bash snippet starting a process that collects /proc/fs/netfs/stats:

    function tail_netfs {
        echo -n > /mnt/vmtest/netfs-stats.log
        while true; do
            echo >> /mnt/vmtest/netfs-stats.log
            cat /proc/fs/netfs/stats >> /mnt/vmtest/netfs-stats.log
            sleep 1
        done
    }
    export -f tail_netfs
    nohup bash -c 'tail_netfs' & disown

Last recorded /proc/fs/netfs/stats (note 0 retries):

    Reads  : DR=0 RA=15184 RF=5 RS=0 WB=0 WBZ=0
    Writes : BW=488 WT=0 DW=0 WP=488 2C=0
    ZeroOps: ZR=7964 sh=0 sk=0
    DownOps: DL=15189 ds=15189 df=0 di=0
    CaRdOps: RD=0 rs=0 rf=0
    UpldOps: UL=488 us=488 uf=0
    CaWrOps: WR=0 ws=0 wf=0
    Retries: rq=0 rs=0 wq=0 ws=0
    Objs   : rr=2 sr=1 foq=1 wsc=0
    WbLock : skip=0 wait=0
    -- FS-Cache statistics --
    Cookies: n=0 v=0 vcol=0 voom=0
    Acquire: n=0 ok=0 oom=0
    LRU    : n=0 exp=0 rmv=0 drp=0 at=0
    Invals : n=0
    Updates: n=0 rsz=0 rsn=0
    Relinqs: n=0 rtr=0 drop=0
    NoSpace: nwr=0 ncr=0 cull=0
    IO     : rd=0 wr=0 mis=0

The stack of the hung task hasn't changed:

    [  184.375149] INFO: task modprobe:2759 blocked for more than 20 seconds.
    [  184.376149]       Tainted: G           OE      6.13.0-gbb67a65a921c-dirty #3
    [  184.376593] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  184.377119] task:modprobe        state:D stack:0     pid:2759  tgid:2759  ppid:455    flags:0x00004002
    [  184.377701] Call Trace:
    [  184.377886]  <TASK>
    [  184.378039]  __schedule+0xa91/0xe80
    [  184.378282]  schedule+0x41/0xb0
    [  184.378490]  v9fs_evict_inode+0xfe/0x170
    [  184.378754]  ? __pfx_var_wake_function+0x10/0x10
    [  184.379070]  evict+0x1ef/0x360
    [  184.379288]  __dentry_kill+0xb0/0x220
    [  184.379528]  ? dput+0x3a/0x1d0
    [  184.379736]  dput+0x114/0x1d0
    [  184.379946]  __fput+0x136/0x2b0
    [  184.380158]  task_work_run+0x89/0xc0
    [  184.380396]  do_exit+0x2c6/0x9c0
    [  184.380617]  do_group_exit+0xa4/0xb0
    [  184.380870]  __x64_sys_exit_group+0x17/0x20
    [  184.381137]  x64_sys_call+0x21a0/0x21a0
    [  184.381386]  do_syscall_64+0x79/0x120
    [  184.381630]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [  184.381969] RIP: 0033:0x7f817bf7c21d
    [  184.382202] RSP: 002b:00007fff92c8d148 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
    [  184.382676] RAX: ffffffffffffffda RBX: 00007f817c092fa8 RCX: 00007f817bf7c21d
    [  184.383138] RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000001
    [  184.383582] RBP: 00007fff92c8d1a0 R08: 00007fff92c8d0e8 R09: 0000000000000000
    [  184.384042] R10: 00007fff92c8d05f R11: 0000000000000206 R12: 0000000000000001
    [  184.384486] R13: 0000000000000000 R14: 0000000000000001 R15: 00007f817c092fc0
    [  184.384963]  </TASK>
    [  184.385112]
    [  184.385112] Showing all locks held in the system:
    [  184.385499] 1 lock held by khungtaskd/32:
    [  184.385793]  #0: ffffffff9d195d90 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180
    [  184.386366] 2 locks held by kworker/u8:10/455:
    [  184.386649]  #0: ffffa1a240104d48 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x23a/0x600
    [  184.387357]  #1: ffffb06380a23e20 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_scheduled_works+0x25a/0x600
    [  184.388076]
    [  184.388183] =============================================

I pushed full logs to github:
https://github.com/kernel-patches/bpf/commit/88c0d0e1692b04c0d54b7c1941003758d23e0d6a

I recommend trying to reproduce with the steps I shared in my initial report:
https://lore.kernel.org/bpf/a7x33d4dnMdGTtRivptq6S1i8btK70SNBP2XyX_xwDAhLvgQoPox6FVBOkifq4eBinfFfbZlIkMZBe3QarlWTxoEtHZwJCZbNKtaqrR7PvI=@pm.me/

I know it may not be very convenient due to all the CI stuff, but you
should be able to use it to iterate on the kernel source locally and
narrow down the problem. I have everything set up, so you might also
share some debugging code for me to run if you prefer.

Thanks.

---

Not directly related, but it took me a while to figure out how to collect
the netfs stats. I first added:

    CONFIG_NETFS_DEBUG=y
    CONFIG_NETFS_STATS=y

But that didn't work, because /proc/fs/netfs/stats is created only with
CONFIG_FSCACHE_STATS (fs/netfs/main.c):

    #ifdef CONFIG_FSCACHE_STATS
        if (!proc_create_single("fs/netfs/stats", S_IFREG | 0444, NULL,
                                netfs_stats_show))
            goto error_procfile;
    #endif

And that depends on CONFIG_FSCACHE=y, so I ended up with:

    CONFIG_FSCACHE=y
    CONFIG_FSCACHE_STATS=y
    CONFIG_NETFS_SUPPORT=y
    CONFIG_NETFS_DEBUG=y
    CONFIG_NETFS_STATS=y
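
If it helps anyone else, here is a rough sanity check along the same
lines - only a sketch, where $KBUILD_OUTPUT is whatever build output
directory your setup uses:

    # Verify the config actually enables the stats proc file, and that
    # the file shows up at runtime (run the last check inside the VM).
    for opt in FSCACHE FSCACHE_STATS NETFS_SUPPORT NETFS_STATS; do
        grep -q "^CONFIG_${opt}=y" "$KBUILD_OUTPUT/.config" ||
            echo "CONFIG_${opt} is not set"
    done
    [ -r /proc/fs/netfs/stats ] || echo "/proc/fs/netfs/stats is missing"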
* Re: [PATCH] netfs: Add retry stat counters
From: Ihor Solodrai @ 2025-02-03 18:01 UTC
To: David Howells, Marc Dionne, Steve French
Cc: dhowells, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
    Christian Schoenebeck, Paulo Alcantara, Jeff Layton, Christian Brauner,
    v9fs, linux-cifs, netfs, linux-fsdevel, linux-kernel, ast, bpf

On 1/28/25 11:11 AM, Ihor Solodrai wrote:
> January 28, 2025 at 1:33 AM, "David Howells" <dhowells@redhat.com> wrote:
>
>> Here's an additional patch to allow stats on the number of retries to be
>> obtained. This isn't a fix per se.
>>
>> David
>>
>> [...]
>
> Hi David, Marc.
>
> I regret to report that this patch didn't fix the hanging in
> v9fs_evict_inode when running selftests/bpf.

David, Marc,

Were you able to reproduce the hanging? We're still using the revert of
the original patch [1] in BPF CI as a mitigation.

If there is other pending work that may be relevant, please share.

Thanks.

[1] https://lore.kernel.org/all/20241216204124.3752367-28-dhowells@redhat.com/

> [...]
* Re: [PATCH] netfs: Add retry stat counters
From: David Howells @ 2025-02-10 10:57 UTC
To: Ihor Solodrai
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

Ihor Solodrai <ihor.solodrai@linux.dev> wrote:

> I recommend trying to reproduce with the steps I shared in my initial report:
> https://lore.kernel.org/bpf/a7x33d4dnMdGTtRivptq6S1i8btK70SNBP2XyX_xwDAhLvgQoPox6FVBOkifq4eBinfFfbZlIkMZBe3QarlWTxoEtHZwJCZbNKtaqrR7PvI=@pm.me/
>
> I know it may not be very convenient due to all the CI stuff,

That's an understatement. :-)

> but you should be able to use it to iterate on the kernel source locally and
> narrow down the problem.

Can you share just the reproducer without all the docker stuff? Is this one
of those tests that requires 9p over virtio? I have a different environment
for that.

David
* Re: [PATCH] netfs: Add retry stat counters
From: David Howells @ 2025-02-10 11:12 UTC
To: Ihor Solodrai
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

Ihor Solodrai <ihor.solodrai@linux.dev> wrote:

> The bash snippet starting a process that collects /proc/fs/netfs/stats:
>
>     function tail_netfs {
>         echo -n > /mnt/vmtest/netfs-stats.log
>         while true; do
>             echo >> /mnt/vmtest/netfs-stats.log
>             cat /proc/fs/netfs/stats >> /mnt/vmtest/netfs-stats.log
>             sleep 1
>         done
>     }
>     export -f tail_netfs
>     nohup bash -c 'tail_netfs' & disown

I'm afraid intermediate snapshots of this file aren't particularly useful -
only the last snapshot is:

> Last recorded /proc/fs/netfs/stats (note 0 retries):
>
>     Reads  : DR=0 RA=15184 RF=5 RS=0 WB=0 WBZ=0
>     Writes : BW=488 WT=0 DW=0 WP=488 2C=0
>     ZeroOps: ZR=7964 sh=0 sk=0
>     DownOps: DL=15189 ds=15189 df=0 di=0
>     CaRdOps: RD=0 rs=0 rf=0
>     UpldOps: UL=488 us=488 uf=0
>     CaWrOps: WR=0 ws=0 wf=0
>     Retries: rq=0 rs=0 wq=0 ws=0
>     Objs   : rr=2 sr=1 foq=1 wsc=0
>     WbLock : skip=0 wait=0
>     -- FS-Cache statistics --
>     Cookies: n=0 v=0 vcol=0 voom=0
>     Acquire: n=0 ok=0 oom=0
>     LRU    : n=0 exp=0 rmv=0 drp=0 at=0
>     Invals : n=0
>     Updates: n=0 rsz=0 rsn=0
>     Relinqs: n=0 rtr=0 drop=0
>     NoSpace: nwr=0 ncr=0 cull=0
>     IO     : rd=0 wr=0 mis=0

Could you collect some tracing:

    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_read/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write_iter/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq_ref/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq_ref/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_failure/enable

and then collect the tracelog:

    trace-cmd show | bzip2 >some_file_somewhere.bz2

And if you could collect /proc/fs/netfs/requests as well, that will show
the debug IDs of the hanging requests. These can be used to grep the trace
by prepending "R=". For example, if you see:

    REQUEST  OR REF FL ERR  OPS COVERAGE
    ======== == === == ==== === =========
    00000043 WB   1 2120    0   0 @34000000 0/0

then:

    trace-cmd show | grep R=00000043
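If several requests are stuck, something along these lines should pull all
of them out in one go - just a rough sketch, assuming the requests file
keeps the two header lines shown above:

    # Grep the trace once per request ID listed in /proc/fs/netfs/requests.
    # Note this re-reads the trace for each ID; cache it first if it's big.
    awk 'NR > 2 { print $1 }' /proc/fs/netfs/requests | while read -r id; do
        echo "=== R=$id ==="
        trace-cmd show | grep "R=$id"
    done | bzip2 > netfs-hung-requests.bz2

Thanks,
David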
* Re: [PATCH] netfs: Add retry stat counters
From: Ihor Solodrai @ 2025-02-10 21:54 UTC
To: David Howells
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

On 2/10/25 2:57 AM, David Howells wrote:
> Ihor Solodrai <ihor.solodrai@linux.dev> wrote:
>
>> I recommend trying to reproduce with the steps I shared in my initial report:
>> https://lore.kernel.org/bpf/a7x33d4dnMdGTtRivptq6S1i8btK70SNBP2XyX_xwDAhLvgQoPox6FVBOkifq4eBinfFfbZlIkMZBe3QarlWTxoEtHZwJCZbNKtaqrR7PvI=@pm.me/
>>
>> I know it may not be very convenient due to all the CI stuff,
>
> That's an understatement. :-)
>
>> but you should be able to use it to iterate on the kernel source locally and
>> narrow down the problem.
>
> Can you share just the reproducer without all the docker stuff?

I wrote a couple of shell scripts that capture the gist of what happens on
CI: build the kernel, build the selftests, run them. You may try those.

Pull this branch from my github:
https://github.com/theihor/bpf/tree/netfs-debug

It's the kernel source in a broken state, with the scripts.

Inlining the scripts here:

## ./reproducer.sh
#!/bin/bash
set -euo pipefail

export KBUILD_OUTPUT=$(realpath kbuild-output)
mkdir -p $KBUILD_OUTPUT

cp -f repro.config $KBUILD_OUTPUT/.config
make olddefconfig
make -j$(nproc) all
make -j$(nproc) headers

# apt install lsb-release wget software-properties-common gnupg
# bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
export LLVM_VERSION=18
make -C tools/testing/selftests/bpf \
     CLANG=clang-${LLVM_VERSION} \
     LLC=llc-${LLVM_VERSION} \
     LLVM_STRIP=llvm-strip-${LLVM_VERSION} \
     -j$(nproc) test_progs-no_alu32

# wget https://github.com/danobi/vmtest/releases/download/v0.15.0/vmtest-x86_64
# chmod +x vmtest-x86_64
./vmtest-x86_64 -k $KBUILD_OUTPUT/$(make -s image_name) ./run-bpf-selftests.sh | tee test.log
## end of ./reproducer.sh

## ./run-bpf-selftests.sh
#!/bin/bash

/bin/mount bpffs /sys/fs/bpf -t bpf
ip link set lo up
echo 10 > /proc/sys/kernel/hung_task_timeout_secs

echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_read/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write_iter/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq_ref/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq_ref/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_failure/enable

function tail_proc {
    src=$1
    dst=$2
    echo -n > $dst
    while true; do
        echo >> $dst
        cat $src >> $dst
        sleep 1
    done
}
export -f tail_proc

nohup bash -c 'tail_proc /proc/fs/netfs/stats netfs-stats.log' & disown
nohup bash -c 'tail_proc /proc/fs/netfs/requests netfs-requests.log' & disown
nohup bash -c 'trace-cmd show -p > trace-cmd.log' & disown

cd tools/testing/selftests/bpf
./test_progs-no_alu32
## end of ./run-bpf-selftests.sh

One of the reasons for suggesting docker is that all the dependencies are
pre-packaged in the image, so the environment is pretty close to the actual
CI environment. With only the shell scripts you will have to detect and
install any missing dependencies on your system, and hope the package
versions are close enough not to affect the issue. Notable dependencies:
LLVM 18, pahole, qemu, qemu-guest-agent, and the vmtest tool.

> Is this one of those tests that requires 9p over virtio? I have a
> different environment for that.

We run the tests via the vmtest tool: https://github.com/danobi/vmtest
This is essentially a qemu wrapper. I am not familiar with its internals,
but it certainly uses 9p.

On 2/10/25 3:12 AM, David Howells wrote:
> Ihor Solodrai <ihor.solodrai@linux.dev> wrote:
>
>> The bash snippet starting a process that collects /proc/fs/netfs/stats:
>>
>> [...]
>
> I'm afraid intermediate snapshots of this file aren't particularly
> useful - only the last snapshot is:

The reason I wrote it like this is that the test runner hangs, so I have
to kill qemu to stop it (with no way to run post-processing within the
qemu instance - or at least I don't know how to do it).

> [...]
>
> Could you collect some tracing:
>
> [...]
>
> And if you could collect /proc/fs/netfs/requests as well, that will show
> the debug IDs of the hanging requests. These can be used to grep the
> trace by prepending "R=".
>
> [...]

Done. I pushed the logs to the previously mentioned github branch:
https://github.com/kernel-patches/bpf/commit/699a3bb95e2291d877737438fb641628702fd18f

Let me know if I can help with anything else.
* Re: [PATCH] netfs: Add retry stat counters
From: David Howells @ 2025-02-10 23:18 UTC
To: Ihor Solodrai
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

Ihor Solodrai <ihor.solodrai@linux.dev> wrote:

> Done. I pushed the logs to the previously mentioned github branch:
> https://github.com/kernel-patches/bpf/commit/699a3bb95e2291d877737438fb641628702fd18f

Perfect, thanks.

Looking at the last record of /proc/fs/netfs/requests, I see:

    REQUEST  OR REF FL ERR  OPS COVERAGE
    ======== == === == ==== === =========
    00000a98 RA   1 2001    0   0 @0000 2000/2000

So the request of interest is R=00000a98 in the trace. Grepping for that,
I see (with a few columns cut out):

    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 NEW         r=1
    test_progs-no_a-97: netfs_read: R=00000a98 READAHEAD c=00000000 ni=2ec02f16 s=0 l=2000 sz=17a8
    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET SUBREQ  r=2
    test_progs-no_a-97: netfs_sreq: R=00000a98[1] DOWN TERM  f=192 s=0 17a8/17a8 s=1 e=0
    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET WORK    r=3
    test_progs-no_a-97: netfs_sreq_ref: R=00000a98[1] PUT TERM   r=1
    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET SUBREQ  r=4
    test_progs-no_a-97: netfs_sreq: R=00000a98[2] ZERO SUBMT f=00 s=17a8 0/858 s=0 e=0
    kworker/u8:2-36:    netfs_rreq_ref: R=00000a98 SEE WORK    r=4
    kworker/u8:2-36:    netfs_rreq: R=00000a98 RA COLLECT f=2021
    kworker/u8:2-36:    netfs_sreq: R=00000a98[1] DOWN DSCRD f=92 s=0 17a8/17a8 s=1 e=0
    kworker/u8:2-36:    netfs_sreq_ref: R=00000a98[1] PUT DONE   r=0
    kworker/u8:2-36:    netfs_sreq: R=00000a98[1] DOWN FREE  f=92 s=0 17a8/17a8 s=1 e=0
    kworker/u8:2-36:    netfs_rreq_ref: R=00000a98 PUT SUBREQ  r=3
    kworker/u8:2-36:    netfs_rreq: R=00000a98 RA COMPLET f=2021
    kworker/u8:2-36:    netfs_rreq: R=00000a98 RA WAKE-IP f=2021
    kworker/u8:2-36:    netfs_rreq: R=00000a98 RA DONE    f=2001
    kworker/u8:2-36:    netfs_rreq_ref: R=00000a98 PUT WORK    r=2
    test_progs-no_a-97: netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0
    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 GET WORK    r=3
    test_progs-no_a-97: netfs_sreq_ref: R=00000a98[2] PUT TERM   r=1
    test_progs-no_a-97: netfs_rreq_ref: R=00000a98 PUT RETURN  r=2
    kworker/u8:2-36:    netfs_rreq_ref: R=00000a98 SEE WORK    r=2
    kworker/u8:2-36:    netfs_rreq_ref: R=00000a98 PUT WORK    r=1

You can see subrequest 1 complete fine: the subrequest is freed and the
ref it had on the request is put:

    netfs_sreq: R=00000a98[1] DOWN FREE  f=92 s=0 17a8/17a8 s=1 e=0
    netfs_rreq_ref: R=00000a98 PUT SUBREQ  r=3

Subrequest 2, however, isn't collected:

    netfs_sreq: R=00000a98[2] ZERO SUBMT f=00 s=17a8 0/858 s=0 e=0
    netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0
    netfs_sreq_ref: R=00000a98[2] PUT TERM   r=1

and the work happens again in kworker/u8:2-36 right at the end:

    netfs_rreq_ref: R=00000a98 SEE WORK    r=2
    netfs_rreq_ref: R=00000a98 PUT WORK    r=1

but this doesn't do anything. The excess buffer clearance happened in the
app thread (test_progs-no_a-97):

    netfs_sreq: R=00000a98[2] ZERO TERM  f=102 s=17a8 858/858 s=1 e=0

> Let me know if I can help with anything else.

Can you add some more tracepoints?

    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_sreq/enable
    echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_state/enable

However, I think I may have spotted the issue: I'm mixing
clear_and_wake_up_bit() for NETFS_RREQ_IN_PROGRESS (which will use a
common system waitqueue) with waiting on an rreq-specific waitqueue in
functions such as netfs_wait_for_read().

I'll work up a fix patch for that tomorrow.
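As a shorthand for all the per-event echoes (here and in my earlier mail),
a loop along these lines should also work - assuming the usual debugfs
mount point:

    # Enable every netfs tracepoint requested so far in one go.
    for ev in read write write_iter rreq rreq_ref sreq sreq_ref failure \
              collect collect_sreq collect_state; do
        echo 1 > /sys/kernel/debug/tracing/events/netfs/netfs_$ev/enable
    done

Thanks,
David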
* Re: [PATCH] netfs: Add retry stat counters
From: Ihor Solodrai @ 2025-02-11 0:54 UTC
To: David Howells
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

On 2/10/25 3:18 PM, David Howells wrote:
> Ihor Solodrai <ihor.solodrai@linux.dev> wrote:
>
>> Done. I pushed the logs to the previously mentioned github branch:
>> https://github.com/kernel-patches/bpf/commit/699a3bb95e2291d877737438fb641628702fd18f
>
> Perfect, thanks.
>
> [...]
>
>> Let me know if I can help with anything else.
>
> Can you add some more tracepoints?
>
>     echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect/enable
>     echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_sreq/enable
>     echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_collect_state/enable

See here:
https://github.com/kernel-patches/bpf/commit/517f51d1f6c09ebab9df3e3d17bb669601ab14ef

Beware, the uncompressed trace-cmd.log is 37 MB.

> However, I think I may have spotted the issue: I'm mixing
> clear_and_wake_up_bit() for NETFS_RREQ_IN_PROGRESS (which will use a
> common system waitqueue) with waiting on an rreq-specific waitqueue in
> functions such as netfs_wait_for_read().
>
> I'll work up a fix patch for that tomorrow.

Great! Thank you.
* [PATCH] netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued
From: David Howells @ 2025-02-12 9:47 UTC
To: Ihor Solodrai
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

Hi Ihor,

Okay, the bug you're hitting appears to be different from the one I first
suspected. Can you try the attached patch? I managed to reproduce it with
AFS by injecting a delay.

Grepping your logs for the stuck request, you can see the issue:

    ip: netfs_rreq_ref: R=00002152 NEW         r=1
    ip: netfs_read: R=00002152 READAHEAD c=00000000 ni=1034fe3 s=4000 l=3000 sz=6898
    ip: netfs_rreq_ref: R=00002152 GET SUBREQ  r=2

Subrequest 1 completes synchronously and queues the collector work item:

    ip: netfs_sreq: R=00002152[1] DOWN TERM  f=192 s=4000 2898/2898 s=2 e=0
    ip: netfs_rreq_ref: R=00002152 GET WORK    r=3
    kworker/u8:3: netfs_rreq_ref: R=00002152 SEE WORK    r=3
    ip: netfs_sreq_ref: R=00002152[1] PUT TERM   r=1
    ip: netfs_rreq_ref: R=00002152 GET SUBREQ  r=4

It then proposes a new subreq to clear the end of the page, but this isn't
queued at this point:

    ip: netfs_sreq: R=00002152[2] ZERO SUBMT f=00 s=6898 0/768 s=0 e=0

(I should probably move the tracepoint to the queue point to make it more
obvious.)

The collector processes the subrequests it can see, and
NETFS_RREQ_ALL_QUEUED (0x2000) is set in the flags (f=2021):

    kworker/u8:3: netfs_rreq: R=00002152 RA COLLECT f=2021
    kworker/u8:3: netfs_collect: R=00002152 s=4000-7000
    kworker/u8:3: netfs_collect_sreq: R=00002152[0:01] s=4000 t=2898/2898
    kworker/u8:3: netfs_sreq: R=00002152[1] DOWN DSCRD f=92 s=4000 2898/2898 s=2 e=0
    kworker/u8:3: netfs_sreq_ref: R=00002152[1] PUT DONE   r=0
    kworker/u8:3: netfs_sreq: R=00002152[1] DOWN FREE  f=92 s=4000 2898/2898 s=2 e=0
    kworker/u8:3: netfs_rreq_ref: R=00002152 PUT SUBREQ  r=3

The notes (n=x) indicate that the collector didn't see subreq 2 (bit 0,
HIT_PENDING, wasn't set)...:

    kworker/u8:3: netfs_collect_state: R=00002152 col=6898 cln=7000 n=c
    kworker/u8:3: netfs_collect_state: R=00002152 col=6898 cln=7000 n=8

... and so it completed the request:

    kworker/u8:3: netfs_rreq: R=00002152 RA COMPLET f=2021
    kworker/u8:3: netfs_rreq: R=00002152 RA WAKE-IP f=2021

And now NETFS_RREQ_IN_PROGRESS has been cleared, which means we can't get
back into the read collector:

    kworker/u8:3: netfs_rreq: R=00002152 RA DONE    f=2001
    kworker/u8:3: netfs_rreq_ref: R=00002152 PUT WORK    r=2

Then subreq 2 finishes, and you can see the worker run but do nothing:

    ip: netfs_sreq: R=00002152[2] ZERO TERM  f=102 s=6898 768/768 s=2 e=0
    ip: netfs_rreq_ref: R=00002152 GET WORK    r=3
    kworker/u8:3: netfs_rreq_ref: R=00002152 SEE WORK    r=3
    kworker/u8:3: netfs_rreq_ref: R=00002152 PUT WORK    r=2

David
---
netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued

Due to the code that queues a subreq on the active subrequest list getting
moved to netfs_issue_read(), the NETFS_RREQ_ALL_QUEUED flag may now get
set before the list-add actually happens. This is not a problem if the
collection worker happens after the list-add, but it's a race - and, for
9P, where the read from the server is synchronous and done in the
submitting thread, this is a lot more likely.

The result is that, if the timing is wrong, a ref gets leaked because the
collector thinks that all the subreqs have completed (because it can't see
the last one yet) and clears NETFS_RREQ_IN_PROGRESS - at which point, the
collection worker no longer goes into the collector.

This can be provoked with AFS by injecting an msleep() right before the
final subreq is queued.

Fix this by splitting the queuing part out of netfs_issue_read() into a
new function, netfs_queue_read(), and calling it separately. The setting
of NETFS_RREQ_ALL_QUEUED is then done by netfs_queue_read() whilst it is
holding the spinlock (that's probably unnecessary, but shouldn't hurt).

It might be better to set a flag on the final subreq, but this could be a
problem if an error occurs and we can't queue it.

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Closes: https://lore.kernel.org/r/a7x33d4dnMdGTtRivptq6S1i8btK70SNBP2XyX_xwDAhLvgQoPox6FVBOkifq4eBinfFfbZlIkMZBe3QarlWTxoEtHZwJCZbNKtaqrR7PvI=@pm.me/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Steve French <stfrench@microsoft.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: v9fs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/buffered_read.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index f761d44b3436..0d1b6d35ff3b 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -155,8 +155,9 @@ static void netfs_read_cache_to_pagecache(struct netfs_io_request *rreq,
 			   netfs_cache_read_terminated, subreq);
 }
 
-static void netfs_issue_read(struct netfs_io_request *rreq,
-			     struct netfs_io_subrequest *subreq)
+static void netfs_queue_read(struct netfs_io_request *rreq,
+			     struct netfs_io_subrequest *subreq,
+			     bool last_subreq)
 {
 	struct netfs_io_stream *stream = &rreq->io_streams[0];
 
@@ -177,8 +178,17 @@ static void netfs_issue_read(struct netfs_io_request *rreq,
 		}
 	}
 
+	if (last_subreq) {
+		smp_wmb(); /* Write lists before ALL_QUEUED. */
+		set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags);
+	}
+
 	spin_unlock(&rreq->lock);
+}
 
+static void netfs_issue_read(struct netfs_io_request *rreq,
+			     struct netfs_io_subrequest *subreq)
+{
 	switch (subreq->source) {
 	case NETFS_DOWNLOAD_FROM_SERVER:
 		rreq->netfs_ops->issue_read(subreq);
@@ -293,11 +303,8 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq)
 		}
 		size -= slice;
 		start += slice;
-		if (size <= 0) {
-			smp_wmb(); /* Write lists before ALL_QUEUED. */
-			set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags);
-		}
 
+		netfs_queue_read(rreq, subreq, size <= 0);
 		netfs_issue_read(rreq, subreq);
 		cond_resched();
 	} while (size > 0);
* Re: [PATCH] netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued
From: Ihor Solodrai @ 2025-02-12 18:01 UTC
To: David Howells
Cc: dhowells, Marc Dionne, Steve French, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    Paulo Alcantara, Jeff Layton, Christian Brauner, v9fs, linux-cifs,
    netfs, linux-fsdevel, linux-kernel, ast, bpf

On 2/12/25 1:47 AM, David Howells wrote:
> Hi Ihor,
>
> Okay, the bug you're hitting appears to be different from the one I first
> suspected. Can you try the attached patch? I managed to reproduce it
> with AFS by injecting a delay.
>
> [...]
>
> netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued
>
> [...]

Hi David.

I tried this patch locally, and then on BPF CI. It fixes the 9p hanging
issue. A couple of platforms and toolchains were tested there:
https://github.com/kernel-patches/vmtest/actions/runs/13291034531

Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>

Note that on CI the "netfs: Fix a number of read-retry hangs" patch [1]
is *not* applied. Only this one.

Thank you!

[1] https://lore.kernel.org/v9fs/3173328.1738024385@warthog.procyon.org.uk/

> [...]