Re: list_del corruption / unhash_ol_stateid()

Linux NFS development
 help / color / mirror / Atom feed

From: Jeff Layton <jlayton@poochiereds.net>
To: Andrew W Elble <aweits@rit.edu>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	<linux-nfs@vger.kernel.org>,
	Anna Schumaker <schumakeranna@gmail.com>
Subject: Re: list_del corruption / unhash_ol_stateid()
Date: Wed, 5 Aug 2015 13:12:11 -0400	[thread overview]
Message-ID: <20150805131211.64d1834a@tlielax.poochiereds.net> (raw)
In-Reply-To: <m2pp31bsqb.fsf@discipline.rit.edu>

On Wed, 05 Aug 2015 12:33:16 -0400
Andrew W Elble <aweits@rit.edu> wrote:

> 
> > I'm not sure I'm following your shorthand correctly, but, I'm guessing
> > the above is that you're creating a stateid and then calling
> > init_open_stateid on it to hash it?
> >
> > If you can flesh out the description of the race, it might be more
> > clear.
> 
> Sorry, knew that was a bit terse.
> 
> nfsd4_process_open2                    nfsd4_close
> 
> stp = open->op_stp;
>  nfs4_alloc_stid, refcount is 1
> 
> init_open_stateid(stp, fp, open);
>  is hashed, refcount is 2
> 
>                                  nfs4_preprocess_seqid_op()
> 				 finds the same stateid,
> 				 refcount is 3
> 
>                                  nfsd4_close_open_stateid()
>                                  unhashes stateid,
>                                  put_ol_stateid_locked() drops refcount
>                                  refcount is 2
> nfs4_get_vfs_file() returns
> with status due to:
>  nfsd_open()
>   nfsd_open_break_lease()
>    break_lease() returning
>     -EWOULDBLOCK
> 
> release_open_stateid()
>  calls unhash_open_stateid()
>  attempts to unhash,
>  WARN generated for list_del
> 
>  proceeds to call put_ol_stateid_locked()
>  decrements refcount inappropriately
>  refcount is now 1.
> 
>                                  nfs4_put_stid()
> 				 refcount is 0, memory is freed
> 
>  nfs4_put_stid
>  calls atomic_dec_and_lock()
>  use after free
>  first corrupt byte is 6a
>  because of atomic decrement
>  with slub_debug on
> 
> > Unless...the initial task that created the stateid got _really_
> > stalled, and another OPEN raced in, found that stateid and was able to
> > reply quickly. Then the client could send a CLOSE for that second OPEN
> > before the first opener ever finished.
> 
> Sounds plausible given the environment. Would take more effort to
> directly prove. I have directly observed the same stateid being used in
> the above race.
> 
> > If you had multiple racing OPENs then one could find the stateid that
> > was previously hashed. A simple fix might be to just use list_del_init
> > calls in unhash_ol_stateid. That should make any second list_del's a
> > no-op.
> >
> > Whether that's sufficient to fix the mem corruption, I'm not sure...
> 
> It's sc_count decrementing when the unhashing "fails" or is going to
> fail. For instance, this kind of construct will prevent one of the
> possible use-after-free scenarios.
> 
> static void release_open_stateid(struct nfs4_ol_stateid *stp)
> {
>         LIST_HEAD(reaplist);
> 
>         spin_lock(&stp->st_stid.sc_client->cl_lock);
>         if (stp->st_perfile->next != LIST_POISON1) {
>          unhash_open_stateid(stp, &reaplist);
>          put_ol_stateid_locked(stp, &reaplist);
>         }
>         spin_unlock(&stp->st_stid.sc_client->cl_lock);
>         free_ol_stateid_reaplist(&reaplist);
> }
> 
> Thanks,
> 
> Andy
> 

Ok, I get it now. The real problem then is a dispute over who should
put the "hash reference" for this stateid. We need to ensure that only
one caller dequeues it from the lists, and only that caller puts that
reference.

There are a couple of different ways to do that. The best one in this
case is probably to use list_del_init in unhash_ol_stateid and check
whether one of the list_heads is already empty before attempting to
unhash it and put the final reference.

We still do have the problem (I suspect) with the nfs4_file not being
fully instantiated before allowing other callers to use it, but that's
really a separate problem.

-- 
Jeff Layton <jlayton@poochiereds.net>

     prev parent reply	other threads:[~2015-08-05 17:12 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-27 15:13 list_del corruption / unhash_ol_stateid() Andrew W Elble
2015-07-27 18:06 ` Andrew W Elble
2015-07-27 20:40   ` J. Bruce Fields
2015-07-27 21:03     ` Andrew W Elble
2015-07-28 13:02   ` Jeff Layton
2015-07-28 15:01     ` Andrew W Elble
2015-07-28 15:49       ` Jeff Layton
2015-07-28 21:04         ` J. Bruce Fields
2015-07-29 15:17           ` Andrew W Elble
2015-07-29 19:52             ` Andrew W Elble
2015-07-30 11:11               ` Andrew W Elble
2015-07-30 12:57                 ` Jeff Layton
2015-08-04 20:18                   ` Andrew W Elble
2015-08-05 15:11                     ` Jeff Layton
2015-08-05 16:33                       ` Andrew W Elble
2015-08-05 17:12                         ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150805131211.64d1834a@tlielax.poochiereds.net \
    --to=jlayton@poochiereds.net \
    --cc=aweits@rit.edu \
    --cc=bfields@fieldses.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=schumakeranna@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox