From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v14-plus 00/25] Address netns refcount issues for localio
Date: Fri, 30 Aug 2024 03:47:31 -0400
Message-ID: <ZtF5E5H53tkNurR3@kernel.org>
In-Reply-To: <20240830023531.29421-1-neilb@suse.de>

On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote:
> Following are revised versions of 6 patches from the v14 localio series.
>
> The issue addressed is net namespace refcounting.
>
> We don't want to keep a long-term counted reference in the client
> because that prevents a server container from completely shutting down.
>
> So we avoid taking a reference at all and rely on the per-cpu reference
> to the server being sufficient to keep the net-ns active. This involves
> allowing the net-ns exit code to iterate all active clients and clear
> their ->net pointers (which they need to find the per-cpu-refcount for
> the nfs_serv).
>
> So:
> - embed nfs_uuid_t in nfs_client. This provides a list_head that can
> be used to find the client. It does add the actual uuid to nfs_client
> so it is bigger than needed. If that is really a problem we can find
> a fix.
>
> - When the nfs server confirms that the uuid is local, it moves the
> nfs_uuid_t onto a per-net-ns list.
>
> - When the net-ns is shutting down - in a "pre_exit" handler, all these
> nfs_uuid_t have their ->net cleared. There is a synchronize_rcu()
> call between pre_exit() handlers and exit() handlers so any caller
> that sees ->net as not NULL can safely check the ->counter.
>
> - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely
> look at ->net in a private rcu_read_lock() section.
>
> I have compile tested this code but nothing more.
>
> Thanks,
> NeilBrown
>
> [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol enablement
> [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and interfaces
> [PATCH 16/25] nfsd: add localio support
> [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM
> [PATCH 19/25] nfs: add localio support
> [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM

Hey Neil,

I attempted to test the kernel with your changes but it crashed with:

[ 55.422564] list_add corruption. next is NULL.
[ 55.423523] ------------[ cut here ]------------
[ 55.424423] kernel BUG at lib/list_debug.c:27!
[ 55.425291] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 55.426367] CPU: 29 UID: 0 PID: 5251 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc4.snitm+ #147
[ 55.427991] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module+el8.9.0+19570+14a90618 04/01/2014
[ 55.429697] RIP: 0010:__list_add_valid_or_report+0x55/0xa0
[ 55.430741] Code: 4c 39 cf 74 4f b8 01 00 00 00 5d c3 cc cc cc cc 48 c7 c7 98 1d a5 82 e8 d9 6d 93 ff 0f 0b 48 c7 c7 c0 1d a5 82 e8 cb 6d 93 ff <0f> 0b 4c 89 c1 48 c7 c7 e8 1d a5 82 e8 ba 6d 93 ff 0f 0b 48 89 d1
[ 55.434167] RSP: 0018:ffff8881441a7d50 EFLAGS: 00010296
[ 55.435141] RAX: 0000000000000022 RBX: ffff888107b50370 RCX: 0000000000000000
[ 55.436455] RDX: ffff888473caf800 RSI: ffff888473ca18c0 RDI: ffff888473ca18c0
[ 55.437770] RBP: ffff8881441a7d50 R08: 0000000000000022 R09: ffff8881441a7be8
[ 55.439098] R10: ffff8881441a7be0 R11: ffffffff8333f328 R12: ffff888107b50380
[ 55.440419] R13: ffff888103b15080 R14: ffff888107bb5d00 R15: 0000000000000000
[ 55.441737] FS: 0000000000000000(0000) GS:ffff888473c80000(0000) knlGS:0000000000000000
[ 55.443228] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 55.444297] CR2: 000055958fab3488 CR3: 000000010615e000 CR4: 0000000000350ef0
[ 55.445615] Call Trace:
[ 55.446090] <TASK>
[ 55.446498] ? show_regs+0x6d/0x80
[ 55.447149] ? die+0x3c/0xa0
[ 55.447698] ? do_trap+0xcf/0xf0
[ 55.448316] ? do_error_trap+0x75/0xa0
[ 55.449026] ? __list_add_valid_or_report+0x55/0xa0
[ 55.449938] ? exc_invalid_op+0x57/0x80
[ 55.450660] ? __list_add_valid_or_report+0x55/0xa0
[ 55.451646] ? asm_exc_invalid_op+0x1f/0x30
[ 55.452438] ? __list_add_valid_or_report+0x55/0xa0
[ 55.453355] nfs_uuid_is_local+0xba/0x110
[ 55.454115] localio_proc_uuid_is_local+0x64/0x80 [nfsd]
[ 55.455145] nfsd_dispatch+0xc2/0x210 [nfsd]
[ 55.455977] svc_process_common+0x2e6/0x6e0
[ 55.456761] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[ 55.457697] svc_process+0x13e/0x1e0
[ 55.458377] svc_recv+0x89e/0xa70
[ 55.459012] ? __pfx_nfsd+0x10/0x10 [nfsd]
[ 55.459806] nfsd+0xa5/0x100 [nfsd]
[ 55.460486] kthread+0xe5/0x120
[ 55.461090] ? __pfx_kthread+0x10/0x10
[ 55.461801] ret_from_fork+0x3d/0x60
[ 55.462476] ? __pfx_kthread+0x10/0x10
[ 55.463184] ret_from_fork_asm+0x1a/0x30
[ 55.463923] </TASK>
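
For context on the message itself: "next is NULL" is what the list debug
check reports when the head passed to list_add() has a NULL ->next, which
usually means the list_head was never set up with INIT_LIST_HEAD() (or was
zeroed afterwards). Whether that is what happens in nfs_uuid_is_local()
here is an assumption; the trace alone does not prove it. A minimal
user-space sketch of that failure mode follows. The helpers
init_list_head() and list_add_checked() are simplified, hypothetical
stand-ins written for this example, not the kernel's actual code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Stand-in for INIT_LIST_HEAD(): an empty list points at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Simplified sanity check modeled on what the list debug code reports.
 * Returns false instead of triggering a BUG so the sketch can be run
 * in user space.
 */
static bool list_add_valid(const struct list_head *entry,
			   const struct list_head *prev,
			   const struct list_head *next)
{
	if (prev == NULL) {
		fprintf(stderr, "list_add corruption. prev is NULL.\n");
		return false;
	}
	if (next == NULL) {
		fprintf(stderr, "list_add corruption. next is NULL.\n");
		return false;
	}
	if (next->prev != prev || prev->next != next) {
		fprintf(stderr, "list_add corruption. links inconsistent.\n");
		return false;
	}
	if (entry == prev || entry == next) {
		fprintf(stderr, "list_add double add.\n");
		return false;
	}
	return true;
}

/* Insert entry right after head, but only if the list looks sane. */
static bool list_add_checked(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head;
	struct list_head *next = head->next; /* NULL if head was only zeroed */

	if (!list_add_valid(entry, prev, next))
		return false;

	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return true;
}
```

In this sketch, calling list_add_checked() on a head that was only
memset() to zero fails with the "next is NULL" message, while the same
call on a head that went through init_list_head() succeeds.
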
I'll triple check my melding of your changes and mine in ~7 hours. I
may have missed something.

Note this is _not_ with your other incremental patch (that uses
__module_get) -- only because I didn't get to that yet.

Mike