From: Jeff Layton <jlayton@kernel.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Kinglong Mee <kinglongmee@gmail.com>,
Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups
Date: Wed, 29 Jan 2025 10:01:49 -0500 [thread overview]
Message-ID: <0722b556a8b01ca4f99e5d3ac5285efc666d69ec.camel@kernel.org> (raw)
In-Reply-To: <Z5o93QZJ35z8Zkyh@poldevia.fieldses.org>
On Wed, 2025-01-29 at 09:40 -0500, J. Bruce Fields wrote:
> On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
> > On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > > > While looking over the CB_SEQUENCE error handling, I discovered that
> > > > callbacks don't hold a reference to a session, and the
> > > > clp->cl_cb_session could easily change between request and response.
> > > > If that happens at an inopportune time, there could be UAFs or weird
> > > > slot/sequence handling problems.
> > >
> > > Nobody should place too much faith in my understanding of how any of
> > > this works at this point, but.... My vague memory is that a lot of
> > > things are serialized simply by being run only on the cl_callback_wq.
> > > Modifying clp->cl_cb_session is such a thing.
> >
> > It is, but that doesn't save us here. The workqueue is just there to
> > submit jobs to the RPC client. Once that happens they are run via
> > rpciod's workqueue (and in parallel with one another since they're
> > async RPC calls).
> >
> > So, it's possible that while we're waiting for a response from one
> > callback, another is submitted, and that workqueue job changes the
> > clp->cl_cb_session.
>
> I think it calls rpc_shutdown_client() before changing
> clp->cl_cb_session.
>
It does, but the cl_cb_session doesn't carry a reference. My worry was
that the client could call a DESTROY_SESSION at any time.
Now that I look though, you may be right that that's enough to ensure
it because nfsd4_destroy_session() calls nfsd4_probe_callback_sync()
before putting the session reference.
Still, that is a lot of reliance on these things happening in a
particular order.
> (Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
> processing for the client is completed before returning?)
>
FWIW, it does wait for them to be killed:
while (!list_empty(&clnt->cl_tasks)) {
rpc_killall_tasks(clnt);
wait_event_timeout(destroy_wait,
list_empty(&clnt->cl_tasks), 1*HZ);
}
I'm not crazy about the fact that it does that synchronously in the
workqueue job, but I guess not much else can be happening with
callbacks until this completes.
> --b.
>
> >
> > Thanks for taking a look, Bruce!
> >
> > >
> > > > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > > > new method of session refcounting that is compatible with the old.
> > > > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > > > in addition to the slot. Then, all of the callback handling is switched
> > > > to use that session instead of dereferencing clp->cb_cb_session.
> > > > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > > > based on review comments, and lifted the v4.0 handing out of that
> > > > function.
> > > >
> > > > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > > > much any of that stresses the backchannel's error handling.
> > > >
> > > > These should probably go in via Chuck's tree, but the last patch touches
> > > > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > > > from Trond and/or Anna on that one.
> > > >
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > > Changes in v2:
> > > > - make nfsd4_session be RCU-freed
> > > > - change code to keep reference to session over callback RPCs
> > > > - rework error handling in nfsd4_cb_sequence_done()
> > > > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
> > > >
> > > > ---
> > > > Jeff Layton (7):
> > > > nfsd: add routines to get/put session references for callbacks
> > > > nfsd: make clp->cl_cb_session be an RCU managed pointer
> > > > nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> > > > nfsd: overhaul CB_SEQUENCE error handling
> > > > nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> > > > nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > > >
> > > > fs/nfs/nfs4proc.c | 12 ++-
> > > > fs/nfsd/nfs4callback.c | 212 ++++++++++++++++++++++++++++++++------------
> > > > fs/nfsd/nfs4state.c | 43 ++++++++-
> > > > fs/nfsd/state.h | 6 +-
> > > > fs/nfsd/trace.h | 6 +-
> > > > include/linux/sunrpc/clnt.h | 4 +-
> > > > net/sunrpc/clnt.c | 7 +-
> > > > 7 files changed, 210 insertions(+), 80 deletions(-)
> > > > ---
> > > > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > > > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > > >
> > > > Best regards,
> > > > --
> > > > Jeff Layton <jlayton@kernel.org>
> >
> > --
> > Jeff Layton <jlayton@kernel.org>
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2025-01-29 15:01 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-29 13:39 [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups Jeff Layton
2025-01-29 13:39 ` [PATCH v2 1/7] nfsd: add routines to get/put session references for callbacks Jeff Layton
2025-01-29 13:39 ` [PATCH v2 2/7] nfsd: make clp->cl_cb_session be an RCU managed pointer Jeff Layton
2025-01-29 13:39 ` [PATCH v2 3/7] nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session Jeff Layton
2025-01-29 13:39 ` [PATCH v2 4/7] nfsd: overhaul CB_SEQUENCE error handling Jeff Layton
2025-01-29 13:39 ` [PATCH v2 5/7] nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() Jeff Layton
2025-01-29 13:39 ` [PATCH v2 6/7] nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done() Jeff Layton
2025-01-29 13:40 ` [PATCH v2 7/7] sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return Jeff Layton
2025-01-29 14:21 ` [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups J. Bruce Fields
2025-01-29 14:27 ` Jeff Layton
2025-01-29 14:40 ` J. Bruce Fields
2025-01-29 15:01 ` Jeff Layton [this message]
2025-01-29 15:09 ` Chuck Lever
2025-01-29 15:25 ` J. Bruce Fields
2025-01-29 14:22 ` Chuck Lever
2025-01-29 14:39 ` Jeff Layton
2025-01-29 14:42 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0722b556a8b01ca4f99e5d3ac5285efc666d69ec.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=anna@kernel.org \
--cc=bfields@fieldses.org \
--cc=chuck.lever@oracle.com \
--cc=kinglongmee@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox