Re: [PATCH 1/1] userfaultfd: non-cooperative: fix fork use after free

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Mark Rutland <mark.rutland@arm.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, syzkaller@googlegroups.com,
	stable@vger.kernel.org
Subject: Re: [PATCH 1/1] userfaultfd: non-cooperative: fix fork use after free
Date: Thu, 21 Sep 2017 11:46:41 +0100	[thread overview]
Message-ID: <20170921104641.GA12505@leverpostej> (raw)
In-Reply-To: <20170920180413.26713-1-aarcange@redhat.com>

On Wed, Sep 20, 2017 at 08:04:13PM +0200, Andrea Arcangeli wrote:
> When reading the event from the uffd, we put it on a temporary
> fork_event list to detect if we can still access it after releasing
> and retaking the event_wqh.lock.
> 
> If fork aborts and removes the event from the fork_event all is fine
> as long as we're still in the userfault read context and fork_event
> head is still alive.
> 
> We've to put the event allocated in the fork kernel stack, back from
> fork_event list-head to the event_wqh head, before returning from
> userfaultfd_ctx_read, because the fork_event head lifetime is limited
> to the userfaultfd_ctx_read stack lifetime.
> 
> Forgetting to move the event back to its event_wqh place then results
> in __remove_wait_queue(&ctx->event_wqh, &ewq->wq); in
> userfaultfd_event_wait_completion to remove it from a head that has
> been already freed from the reader stack.
> 
> This could only happen if resolve_userfault_fork failed (for example
> if there are no file descriptors available to allocate the fork
> uffd). If it succeeded it was put back correctly.
> 
> Furthermore, after find_userfault_evt receives a fork event, the
> forked userfault context in fork_nctx and
> uwq->msg.arg.reserved.reserved1 can be released by the fork thread as
> soon as the event_wqh.lock is released. Taking a reference on the
> fork_nctx before dropping the lock prevents an use after free in
> resolve_userfault_fork().
> 
> If the fork side aborted and it already released everything, we still
> try to succeed resolve_userfault_fork(), if possible.
> 
> Reported-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

This has survived my test-case overnight, so FWIW:

Tested-by: Mark Rutland <mark.rutland@arm.com>

So that this can be backported to stable trees, I think we also need:

Fixes: 893e26e61d04eac9 ("userfaultfd: non-cooperative: Add fork() event")
Cc: <stable@vger.kernel.org>

Thanks,
Mark.

> ---
>  fs/userfaultfd.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 06d6cfda1e8e..16366587e579 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -599,6 +599,12 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
>  			break;
>  		if (ACCESS_ONCE(ctx->released) ||
>  		    fatal_signal_pending(current)) {
> +			/*
> +			 * &ewq->wq may be queued in fork_event, but
> +			 * __remove_wait_queue ignores the head
> +			 * parameter. It would be a problem if it
> +			 * didn't.
> +			 */
>  			__remove_wait_queue(&ctx->event_wqh, &ewq->wq);
>  			if (ewq->msg.event == UFFD_EVENT_FORK) {
>  				struct userfaultfd_ctx *new;
> @@ -1072,6 +1078,12 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  					(unsigned long)
>  					uwq->msg.arg.reserved.reserved1;
>  				list_move(&uwq->wq.entry, &fork_event);
> +				/*
> +				 * fork_nctx can be freed as soon as
> +				 * we drop the lock, unless we take a
> +				 * reference on it.
> +				 */
> +				userfaultfd_ctx_get(fork_nctx);
>  				spin_unlock(&ctx->event_wqh.lock);
>  				ret = 0;
>  				break;
> @@ -1102,19 +1114,53 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  
>  	if (!ret && msg->event == UFFD_EVENT_FORK) {
>  		ret = resolve_userfault_fork(ctx, fork_nctx, msg);
> +		spin_lock(&ctx->event_wqh.lock);
> +		if (!list_empty(&fork_event)) {
> +			/*
> +			 * The fork thread didn't abort, so we can
> +			 * drop the temporary refcount.
> +			 */
> +			userfaultfd_ctx_put(fork_nctx);
> +
> +			uwq = list_first_entry(&fork_event,
> +					       typeof(*uwq),
> +					       wq.entry);
> +			/*
> +			 * If fork_event list wasn't empty and in turn
> +			 * the event wasn't already released by fork
> +			 * (the event is allocated on fork kernel
> +			 * stack), put the event back to its place in
> +			 * the event_wq. fork_event head will be freed
> +			 * as soon as we return so the event cannot
> +			 * stay queued there no matter the current
> +			 * "ret" value.
> +			 */
> +			list_del(&uwq->wq.entry);
> +			__add_wait_queue(&ctx->event_wqh, &uwq->wq);
>  
> -		if (!ret) {
> -			spin_lock(&ctx->event_wqh.lock);
> -			if (!list_empty(&fork_event)) {
> -				uwq = list_first_entry(&fork_event,
> -						       typeof(*uwq),
> -						       wq.entry);
> -				list_del(&uwq->wq.entry);
> -				__add_wait_queue(&ctx->event_wqh, &uwq->wq);
> +			/*
> +			 * Leave the event in the waitqueue and report
> +			 * error to userland if we failed to resolve
> +			 * the userfault fork.
> +			 */
> +			if (likely(!ret))
>  				userfaultfd_event_complete(ctx, uwq);
> -			}
> -			spin_unlock(&ctx->event_wqh.lock);
> +		} else {
> +			/*
> +			 * Here the fork thread aborted and the
> +			 * refcount from the fork thread on fork_nctx
> +			 * has already been released. We still hold
> +			 * the reference we took before releasing the
> +			 * lock above. If resolve_userfault_fork
> +			 * failed we've to drop it because the
> +			 * fork_nctx has to be freed in such case. If
> +			 * it succeeded we'll hold it because the new
> +			 * uffd references it.
> +			 */
> +			if (ret)
> +				userfaultfd_ctx_put(fork_nctx);
>  		}
> +		spin_unlock(&ctx->event_wqh.lock);
>  	}
>  
>  	return ret;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

     prev parent reply	other threads:[~2017-09-21 10:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-19 13:18 userfaultfd use-after-free Mark Rutland
2017-09-20 18:04 ` [PATCH 1/1] userfaultfd: non-cooperative: fix fork use after free Andrea Arcangeli
2017-09-20 18:30   ` Greg KH
2017-09-21 10:46   ` Mark Rutland [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170921104641.GA12505@leverpostej \
    --to=mark.rutland@arm.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dgilbert@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    --cc=syzkaller@googlegroups.com \
    --cc=xemul@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).