From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neil@brown.name>
Cc: Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna.schumaker@oracle.com>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 0/3] Fix localio hangs
Date: Wed, 16 Jul 2025 19:27:00 -0400
Message-ID: <aHg1RLw-5Csbiber@kernel.org>
In-Reply-To: <175270375199.2234665.7748991440226043304@noble.neil.brown.name>
On Thu, Jul 17, 2025 at 08:09:11AM +1000, NeilBrown wrote:
> On Thu, 17 Jul 2025, Trond Myklebust wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> >
> > The following patch series fixes a series of issues with the current
> > localio code, as reported in the link
> > https://lore.kernel.org/linux-nfs/aG0pJXVtApZ9C5vy@kernel.org/
> >
> >
> > Trond Myklebust (3):
> > NFS/localio: nfs_close_local_fh() fix check for file closed
> > NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()
> > NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file
>
> That all looks good to me - thanks a lot for finding and fixing my bugs.
>
> Reviewed-by: NeilBrown <neil@brown.name>
>
> I'd still like to fix the nfsd_file_cache_purge() issue but that is
> quite separate especially now that you've prevented it causing problems
> for nfs_uuid_put().
>
> thanks,
> NeilBrown
Unfortunately, even with these 3 v2 fixes applied, I was just able to hit
the same hang on NFSD shutdown. It took 5 iterations of the fio test
reported here:
https://lore.kernel.org/linux-nfs/aG0pJXVtApZ9C5vy@kernel.org/
So the hang is harder to hit with these v2 fixes, but it still happens:
[ 369.528839] task:rpc.nfsd state:D stack:0 pid:10569 tgid:10569 ppid:1 flags:0x00004006
[ 369.528985] Call Trace:
[ 369.529127] <TASK>
[ 369.529295] __schedule+0x26d/0x530
[ 369.529435] schedule+0x27/0xa0
[ 369.529566] schedule_timeout+0x14e/0x160
[ 369.529700] ? svc_destroy+0xce/0x160 [sunrpc]
[ 369.529882] ? lockd_put+0x5f/0x90 [lockd]
[ 369.530022] __wait_for_common+0x8f/0x1d0
[ 369.530154] ? __pfx_schedule_timeout+0x10/0x10
[ 369.530329] nfsd_destroy_serv+0x13f/0x1a0 [nfsd]
[ 369.530516] nfsd_svc+0xe0/0x170 [nfsd]
[ 369.530684] write_threads+0xc3/0x190 [nfsd]
[ 369.530845] ? simple_transaction_get+0xc2/0xe0
[ 369.530973] ? __pfx_write_threads+0x10/0x10 [nfsd]
[ 369.531133] nfsctl_transaction_write+0x47/0x80 [nfsd]
[ 369.531324] vfs_write+0xfa/0x420
[ 369.531448] ? do_filp_open+0xae/0x150
[ 369.531574] ksys_write+0x63/0xe0
[ 369.531693] do_syscall_64+0x7d/0x160
[ 369.531816] ? do_sys_openat2+0x81/0xd0
[ 369.531937] ? syscall_exit_work+0xf3/0x120
[ 369.532058] ? syscall_exit_to_user_mode+0x32/0x1b0
[ 369.532178] ? do_syscall_64+0x89/0x160
[ 369.532344] ? __mod_memcg_lruvec_state+0x95/0x150
[ 369.532465] ? __lruvec_stat_mod_folio+0x84/0xd0
[ 369.532584] ? syscall_exit_work+0xf3/0x120
[ 369.532705] ? syscall_exit_to_user_mode+0x32/0x1b0
[ 369.532827] ? do_syscall_64+0x89/0x160
[ 369.532947] ? __handle_mm_fault+0x326/0x730
[ 369.533066] ? __mod_memcg_lruvec_state+0x95/0x150
[ 369.533187] ? __count_memcg_events+0x53/0xf0
[ 369.533306] ? handle_mm_fault+0x245/0x340
[ 369.533427] ? do_user_addr_fault+0x341/0x6b0
[ 369.533547] ? exc_page_fault+0x70/0x160
[ 369.533666] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 369.533787] RIP: 0033:0x7f1db10fd617
crash> dis -l nfsd_destroy_serv+0x13f
/root/snitm/git/linux-HS/fs/nfsd/nfssvc.c: 468
0xffffffffc172e36f <nfsd_destroy_serv+319>: mov %r12,%rdi
which is the return address at the percpu_ref_exit() in nfsd_shutdown_net(),
so the task is still blocked in the wait_for_completion() just before it:
static void nfsd_shutdown_net(struct net *net)
{
	struct nfsd_net *nn = net_generic(net, nfsd_net_id);

	if (!nn->nfsd_net_up)
		return;
	percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done);
	wait_for_completion(&nn->nfsd_net_confirm_done);
	nfsd_export_flush(net);
	nfs4_state_shutdown_net(net);
	nfsd_reply_cache_shutdown(nn);
	nfsd_file_cache_shutdown_net(net);
	if (nn->lockd_up) {
		lockd_down(net);
		nn->lockd_up = false;
	}
	wait_for_completion(&nn->nfsd_net_free_done);
--->	percpu_ref_exit(&nn->nfsd_net_ref);
	nn->nfsd_net_up = false;
	nfsd_shutdown_generic();
}
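
To make the failure mode concrete, here is a minimal userspace sketch of
the reference/completion handshake in the function above. It is not the
kernel API: struct model_ref and every model_* function are hypothetical
names made up for illustration, and it assumes (the callback is not shown
in this mail) that nfsd_net_free_done is completed by the release callback
of nn->nfsd_net_ref once the last reference is dropped. Under that
assumption, a single reference that LOCALIO never puts is enough to keep
the final wait_for_completion() asleep forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a percpu_ref plus a completion. */
struct model_ref {
	long count;     /* outstanding references, including the initial one */
	bool dead;      /* set once the ref has been killed */
	bool free_done; /* models complete(&nn->nfsd_net_free_done) */
};

void model_ref_init(struct model_ref *ref)
{
	ref->count = 1; /* initial reference owned by the server itself */
	ref->dead = false;
	ref->free_done = false;
}

/* New users may only take a reference while the ref is still live. */
bool model_ref_tryget_live(struct model_ref *ref)
{
	if (ref->dead)
		return false;
	ref->count++;
	return true;
}

/* Dropping the last reference runs the (assumed) release step. */
void model_ref_put(struct model_ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->free_done = true;
}

/* Kill: refuse new users, then drop the initial reference. */
void model_ref_kill(struct model_ref *ref)
{
	ref->dead = true;
	model_ref_put(ref);
}

/* True while a shutdown waiting on free_done would still be blocked. */
bool model_shutdown_would_block(const struct model_ref *ref)
{
	return !ref->free_done;
}
```

In this model, a client that takes a reference with model_ref_tryget_live()
before model_ref_kill() leaves model_shutdown_would_block() returning true
until that client calls model_ref_put(). That is the state the rpc.nfsd
task in the trace above appears to be stuck in.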