Linux NFS development
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>,
	Olga Kornievskaia <aglo@umich.edu>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Dai Ngo <dai.ngo@oracle.com>
Subject: Re: [PATCH 2/2] nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
Date: Wed, 18 Jan 2023 12:06:24 -0500
Message-ID: <0fbcbdc37e7e3f070b491848a74be348843074c2.camel@kernel.org>
In-Reply-To: <12C5F3B3-6DB1-4483-8160-A691EB464464@oracle.com>

On Wed, 2023-01-18 at 16:39 +0000, Chuck Lever III wrote:
> 
> > On Jan 18, 2023, at 11:29 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> > 
> > On Wed, Jan 18, 2023 at 10:27 AM Jeff Layton <jlayton@kernel.org> wrote:
> > > 
> > > On Wed, 2023-01-18 at 09:42 -0500, Olga Kornievskaia wrote:
> > > > On Tue, Jan 17, 2023 at 2:38 PM Jeff Layton <jlayton@kernel.org> wrote:
> > > > > 
> > > > > There are two different flavors of the nfsd4_copy struct. One is
> > > > > embedded in the compound and is used directly in synchronous copies. The
> > > > > other is dynamically allocated, refcounted and tracked in the client
> > > > > structure. For the embedded one, the cleanup just involves releasing any
> > > > > nfsd_files held on its behalf. For the async one, the cleanup is a bit
> > > > > more involved, and we need to dequeue it from lists, unhash it, etc.
> > > > > 
> > > > > There is at least one potential refcount leak in this code now. If the
> > > > > kthread_create call fails, then both the src and dst nfsd_files in the
> > > > > original nfsd4_copy object are leaked.
> > > > 
> > > > I don't believe that's true. If kthread_create thread fails we call
> > > > cleanup_async_copy() that does a put on the file descriptors.
> > > > 
> > > 
> > > You mean this?
> > > 
> > > out_err:
> > >        if (async_copy)
> > >                cleanup_async_copy(async_copy);
> > > 
> > > That puts the references that were taken in dup_copy_fields, but the
> > > original (embedded) nfsd4_copy also holds references and those are not
> > > being put in this codepath.
> > 
> > Can you please point out where we take a reference on the original copy?
> > 
> > > > > The cleanup in this codepath is also sort of weird. In the async copy
> > > > > case, we'll have up to four nfsd_file references (src and dst for both
> > > > > flavors of copy structure).
> > > > 
> > > > That's not true. There is a careful distinction between intra -- which
> > > > had 2 valid file pointers and does a get on both as they both point to
> > > > something that's opened on this server--- but inter -- only does a get
> > > > on the dst file descriptor, the src doesn't exist. And yes I realize
> > > > the code checks for nfs_src being null which it should be but it makes
> > > > the code less clear and at some point somebody might want to decide to
> > > > really do a put on it.
> > > > 
> > > 
> > > This is part of the problem here. We have a nfsd4_copy structure, and
> > > depending on what has been done to it, you need to call different
> > > methods to clean it up. That seems like a real antipattern to me.
> > 
> > But they call different methods because different things need to be
> > done there and it makes it clear what needs to be for what type of
> > copy.
> 
> In cases like this, it makes sense to consider using types to
> ensure the code can't do the wrong thing. So you might want to
> have a struct nfs4_copy_A for the inter code to use, and a struct
> nfs4_copy_B for the intra code to use. Sharing the same struct
> for both use cases is probably what's confusing to human readers.
> 
> I've never been a stickler for removing every last ounce of code
> duplication. Here, it might help to have a little duplication
> just to make it easier to reason about the reference counting in
> the two use cases.
> 
> That's my view from the mountain top, worth every penny you paid
> for it.
> 

+1

The nfsd4_copy structure has a lot of fields in it that only matter for
the async copy case. ISTM that nfsd4_copy (the function) should
dynamically allocate a struct nfsd4_async_copy that contains a
nfsd4_copy and whatever other fields are needed.

Then, we could trim down struct nfsd4_copy to just the info needed.

For instance, the nf_src and nf_dst fields really don't need to be in
nfsd4_copy. For the synchronous copy case, we can just keep those
pointers on the stack, and for the async case they would be inside the
larger structure.

That would allow us to trim down the footprint of the compound union
too.
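
A rough user-space sketch of that layout, as an illustration only and not
kernel code. Every name here is hypothetical: mock_file stands in for
nfsd_file, mock_copy for a trimmed-down nfsd4_copy, mock_async_copy for
the proposed nfsd4_async_copy, and the start_fails flag stands in for
kthread_create failing. The point it tries to show is that the caller's
references sit on the stack and get one unconditional cleanup, while the
async wrapper owns and drops its own references.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for nfsd_file: only a reference count. */
struct mock_file {
	int refcount;
};

static struct mock_file *mock_file_get(struct mock_file *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void mock_file_put(struct mock_file *f)
{
	if (f)
		f->refcount--;
}

/* Trimmed-down copy arguments: no file pointers inside. */
struct mock_copy {
	unsigned long long src_pos;
	unsigned long long dst_pos;
	unsigned long long count;
};

/* Async wrapper: embeds the copy and holds its own file references. */
struct mock_async_copy {
	struct mock_copy copy;
	struct mock_file *nf_src;	/* may be NULL, e.g. no local source */
	struct mock_file *nf_dst;
};

/* Single cleanup path for the async object. */
static void mock_async_copy_free(struct mock_async_copy *ac)
{
	if (!ac)
		return;
	mock_file_put(ac->nf_src);
	mock_file_put(ac->nf_dst);
	free(ac);
}

static struct mock_async_copy *
mock_async_copy_alloc(const struct mock_copy *copy,
		      struct mock_file *src, struct mock_file *dst)
{
	struct mock_async_copy *ac = calloc(1, sizeof(*ac));

	if (!ac)
		return NULL;
	ac->copy = *copy;
	ac->nf_src = mock_file_get(src);
	ac->nf_dst = mock_file_get(dst);
	return ac;
}

/* Models the COPY entry point for the async case. */
static int mock_do_copy(const struct mock_copy *copy,
			struct mock_file *src_file, struct mock_file *dst_file,
			int start_fails, struct mock_async_copy **out_ac)
{
	/* References held on behalf of this call live only on the stack. */
	struct mock_file *nf_src = mock_file_get(src_file);
	struct mock_file *nf_dst = mock_file_get(dst_file);
	struct mock_async_copy *ac;
	int status = 0;

	*out_ac = NULL;
	ac = mock_async_copy_alloc(copy, nf_src, nf_dst);
	if (!ac) {
		status = -1;
		goto out;
	}
	if (start_fails) {	/* stands in for thread creation failing */
		mock_async_copy_free(ac);
		status = -1;
		goto out;
	}
	*out_ac = ac;		/* the worker would own the wrapper from here */
out:
	/* One unconditional cleanup for the caller's own references. */
	mock_file_put(nf_src);
	mock_file_put(nf_dst);
	return status;
}
```

With two files that start at a refcount of 1, a call with start_fails set
should return an error and leave both counts at 1. A successful call
leaves them at 2 until mock_async_copy_free() is called on the returned
wrapper, at which point they drop back to 1.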

-- 
Jeff Layton <jlayton@kernel.org>

Thread overview: 31+ messages
2023-01-17 19:38 [PATCH 0/2] nfsd: COPY refcounting fix and cleanup Jeff Layton
2023-01-17 19:38 ` [PATCH 1/2] nfsd: zero out pointers after putting nfsd_files on COPY setup error Jeff Layton
2023-01-17 19:38 ` [PATCH 2/2] nfsd: clean up potential nfsd_file refcount leaks in COPY codepath Jeff Layton
2023-01-18 14:42   ` Olga Kornievskaia
2023-01-18 15:27     ` Jeff Layton
2023-01-18 16:29       ` Olga Kornievskaia
2023-01-18 16:39         ` Chuck Lever III
2023-01-18 17:06           ` Jeff Layton [this message]
2023-01-18 17:11             ` Chuck Lever III
2023-01-18 17:26               ` Jeff Layton
2023-01-18 17:48                 ` Olga Kornievskaia
2023-01-18 16:57         ` Jeff Layton
2023-01-18 17:07           ` Olga Kornievskaia
2023-01-18 18:16             ` Olga Kornievskaia
2023-01-18 18:34               ` Jeff Layton
2023-01-19  1:45                 ` Olga Kornievskaia
2023-01-19  5:05   ` dai.ngo
2023-01-19 10:56     ` Jeff Layton
2023-01-19 18:38       ` dai.ngo
2023-01-20 11:43         ` Jeff Layton
2023-01-21 18:56           ` dai.ngo
2023-01-21 19:50             ` dai.ngo
2023-01-21 20:05               ` Jeff Layton
2023-01-21 20:12                 ` Chuck Lever III
2023-01-21 21:28                   ` dai.ngo
2023-01-22 16:45                     ` Chuck Lever III
2023-01-22 17:10                       ` Chuck Lever III
2023-01-23 12:17                         ` Jeff Layton
2023-01-23 15:22                       ` Olga Kornievskaia
2023-01-23 15:32                         ` Jeff Layton
2023-01-23 20:32                       ` dai.ngo
