From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neil@brown.name>
Cc: Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna.schumaker@oracle.com>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 0/3] Fix localio hangs
Date: Wed, 16 Jul 2025 19:27:00 -0400
Message-ID: <aHg1RLw-5Csbiber@kernel.org>
In-Reply-To: <175270375199.2234665.7748991440226043304@noble.neil.brown.name>
On Thu, Jul 17, 2025 at 08:09:11AM +1000, NeilBrown wrote:
> On Thu, 17 Jul 2025, Trond Myklebust wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> >
> > The following patch series fixes a series of issues with the current
> > localio code, as reported in the link
> > https://lore.kernel.org/linux-nfs/aG0pJXVtApZ9C5vy@kernel.org/
> >
> >
> > Trond Myklebust (3):
> > NFS/localio: nfs_close_local_fh() fix check for file closed
> > NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()
> > NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file
>
> That all looks good to me - thanks a lot for finding and fixing my bugs.
>
> Reviewed-by: NeilBrown <neil@brown.name>
>
> I'd still like to fix the nfsd_file_cache_purge() issue but that is
> quite separate especially now that you've prevented it causing problems
> for nfs_uuid_put().
>
> thanks,
> NeilBrown

Unfortunately, even with these 3 v2 fixes I was just able to hit the
same hang on NFSD shutdown. It took 5 iterations of the fio test
reported here:
https://lore.kernel.org/linux-nfs/aG0pJXVtApZ9C5vy@kernel.org/

So it is harder to hit with these v2 fixes, but it still happens:

[ 369.528839] task:rpc.nfsd state:D stack:0 pid:10569 tgid:10569 ppid:1 flags:0x00004006
[ 369.528985] Call Trace:
[ 369.529127] <TASK>
[ 369.529295] __schedule+0x26d/0x530
[ 369.529435] schedule+0x27/0xa0
[ 369.529566] schedule_timeout+0x14e/0x160
[ 369.529700] ? svc_destroy+0xce/0x160 [sunrpc]
[ 369.529882] ? lockd_put+0x5f/0x90 [lockd]
[ 369.530022] __wait_for_common+0x8f/0x1d0
[ 369.530154] ? __pfx_schedule_timeout+0x10/0x10
[ 369.530329] nfsd_destroy_serv+0x13f/0x1a0 [nfsd]
[ 369.530516] nfsd_svc+0xe0/0x170 [nfsd]
[ 369.530684] write_threads+0xc3/0x190 [nfsd]
[ 369.530845] ? simple_transaction_get+0xc2/0xe0
[ 369.530973] ? __pfx_write_threads+0x10/0x10 [nfsd]
[ 369.531133] nfsctl_transaction_write+0x47/0x80 [nfsd]
[ 369.531324] vfs_write+0xfa/0x420
[ 369.531448] ? do_filp_open+0xae/0x150
[ 369.531574] ksys_write+0x63/0xe0
[ 369.531693] do_syscall_64+0x7d/0x160
[ 369.531816] ? do_sys_openat2+0x81/0xd0
[ 369.531937] ? syscall_exit_work+0xf3/0x120
[ 369.532058] ? syscall_exit_to_user_mode+0x32/0x1b0
[ 369.532178] ? do_syscall_64+0x89/0x160
[ 369.532344] ? __mod_memcg_lruvec_state+0x95/0x150
[ 369.532465] ? __lruvec_stat_mod_folio+0x84/0xd0
[ 369.532584] ? syscall_exit_work+0xf3/0x120
[ 369.532705] ? syscall_exit_to_user_mode+0x32/0x1b0
[ 369.532827] ? do_syscall_64+0x89/0x160
[ 369.532947] ? __handle_mm_fault+0x326/0x730
[ 369.533066] ? __mod_memcg_lruvec_state+0x95/0x150
[ 369.533187] ? __count_memcg_events+0x53/0xf0
[ 369.533306] ? handle_mm_fault+0x245/0x340
[ 369.533427] ? do_user_addr_fault+0x341/0x6b0
[ 369.533547] ? exc_page_fault+0x70/0x160
[ 369.533666] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 369.533787] RIP: 0033:0x7f1db10fd617
crash> dis -l nfsd_destroy_serv+0x13f
/root/snitm/git/linux-HS/fs/nfsd/nfssvc.c: 468
0xffffffffc172e36f <nfsd_destroy_serv+319>: mov %r12,%rdi
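which is the return address at the percpu_ref_exit() in nfsd_shutdown_net(),
so the task appears to still be blocked in the wait_for_completion() on
nfsd_net_free_done just before it: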

static void nfsd_shutdown_net(struct net *net)
{
	struct nfsd_net *nn = net_generic(net, nfsd_net_id);

	if (!nn->nfsd_net_up)
		return;

	percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done);
	wait_for_completion(&nn->nfsd_net_confirm_done);

	nfsd_export_flush(net);
	nfs4_state_shutdown_net(net);
	nfsd_reply_cache_shutdown(nn);
	nfsd_file_cache_shutdown_net(net);
	if (nn->lockd_up) {
		lockd_down(net);
		nn->lockd_up = false;
	}

	wait_for_completion(&nn->nfsd_net_free_done);
--->	percpu_ref_exit(&nn->nfsd_net_ref);

	nn->nfsd_net_up = false;
	nfsd_shutdown_generic();
}
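
To illustrate why a reference that is never dropped would leave shutdown
stuck at that point, here is a minimal userspace sketch. It is not the
kernel code: struct toy_ref and the toy_*() helpers are hypothetical
stand-ins for percpu_ref and the completion, and it assumes that
nfsd_net_free_done is completed by the ref's release callback once the
last reference to nfsd_net_ref is put.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the percpu_ref + completion pair in nfsd_net. */
struct toy_ref {
	long count;	/* outstanding references, including the initial one */
	bool killed;	/* set once shutdown has killed the ref */
	bool free_done;	/* models complete(&nn->nfsd_net_free_done) (assumed) */
};

static void toy_ref_init(struct toy_ref *r)
{
	r->count = 1;	/* the initial reference, dropped by toy_ref_kill() */
	r->killed = false;
	r->free_done = false;
}

/* Models percpu_ref_tryget_live(): refuses new references after a kill. */
static bool toy_ref_tryget_live(struct toy_ref *r)
{
	if (r->killed)
		return false;
	r->count++;
	return true;
}

/* Models percpu_ref_put(): the last put runs the release callback. */
static void toy_ref_put(struct toy_ref *r)
{
	assert(r->count > 0);
	if (--r->count == 0)
		r->free_done = true;
}

/* Models percpu_ref_kill(): mark the ref dead and drop the initial reference. */
static void toy_ref_kill(struct toy_ref *r)
{
	r->killed = true;
	toy_ref_put(r);
}

/* Would the wait on free_done return, or would shutdown stay in state D? */
static bool toy_shutdown_would_finish(const struct toy_ref *r)
{
	return r->free_done;
}
```

In this model, a user that took a reference with toy_ref_tryget_live()
before toy_ref_kill() keeps toy_shutdown_would_finish() false until it
calls toy_ref_put(). If that put never happens, the waiter never wakes,
which would match the trace above.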