Re: kvm: use-after-free in process_srcu

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Dmitry Vyukov" <dvyukov@google.com>,
	"Steve Rutherford" <srutherford@google.com>,
	syzkaller <syzkaller@googlegroups.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"KVM list" <kvm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: kvm: use-after-free in process_srcu
Date: Tue, 17 Jan 2017 12:34:36 -0800	[thread overview]
Message-ID: <20170117203436.GC5238@linux.vnet.ibm.com> (raw)
In-Reply-To: <f99af820-fc36-4786-e950-acef43ff3090@redhat.com>

On Tue, Jan 17, 2017 at 01:03:28PM +0100, Paolo Bonzini wrote:
> 
> 
> On 17/01/2017 12:13, Dmitry Vyukov wrote:
> > On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >>
> >> On 17/01/2017 10:56, Dmitry Vyukov wrote:
> >>>> I am seeing use-after-frees in process_srcu as struct srcu_struct is
> >>>> already freed. Before freeing struct srcu_struct, code does
> >>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
> >>>>
> >>>> +      srcu_barrier(&kvm->irq_srcu);
> >>>>          cleanup_srcu_struct(&kvm->irq_srcu);
> >>>>
> >>>> It reduced rate of use-after-frees, but did not eliminate them
> >>>> completely. The full threaded is here:
> >>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
> >>>>
> >>>> Does Paolo's fix above make sense to you? Namely adding
> >>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?

Yes, we do need a flush_delayed_work(), good catch!

But doing multiple of them should not be necessary because there shouldn't
be any callbacks at all once the srcu_barrier() returns, and the only
time SRCU queues more work is if there is at least one callback pending.
The code is making sure that no new call_srcu() invocations happen before
it does the srcu_barrier(), right?

So if you are seing failures even with the single flush_delayed_work(),
it would be interesting to set a flag in the srcu_struct at
cleanup_srcu_struct time, and then splat if srcu_reschedule() does its
queue_delayed_work() when that flag is set.

> >>> I am not sure about interaction of flush_delayed_work and
> >>> srcu_reschedule... flush_delayed_work probably assumes that no work is
> >>> queued concurrently, but what if srcu_reschedule queues another work
> >>> concurrently... can't it happen that flush_delayed_work will miss that
> >>> newly scheduled work?
> >>
> >> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is
> > 
> > I mean not srcu callbacks, but the sp->work being rescheduled.
> > Consider that callbacks are already scheduled. We call
> > flush_delayed_work, it waits for completion of process_srcu. But that
> > process_srcu schedules sp->work again in srcu_reschedule.

It only does this if there are callbacks still on the srcu_struct, so
if you are seeing this, we either have a bug in SRCU that finds callbacks
when none are present or we have a usage bug that is creating new callbacks
after src_barrier() starts.

Do any of your callback functions invoke call_srcu()?  (Hey, I have to ask!)

> >> indeed insufficient.  Because of SRCU's two-phase algorithm, it's possible
> >> that the first flush_delayed_work doesn't invoke all callbacks.  Instead I
> >> would propose this (still untested, but this time with a commit message):
> >>
> >> ---------------- 8< --------------
> >> From: Paolo Bonzini <pbonzini@redhat.com>
> >> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"
> >>
> >> Even though there are no concurrent readers, it is possible that the
> >> work item is queued for delayed processing when cleanup_srcu_struct is
> >> called.  The work item needs to be flushed before returning, or a
> >> use-after-free can ensue.
> >>
> >> Furthermore, because of SRCU's two-phase algorithm it may take up to
> >> two executions of srcu_advance_batches before all callbacks are invoked.
> >> This can happen if the first flush_delayed_work happens as follows
> >>
> >>                                                           srcu_read_lock
> >>     process_srcu
> >>         srcu_advance_batches
> >>             ...
> >>             if (!try_check_zero(sp, idx^1, trycount))
> >>                 // there is a reader
> >>                 return;
> >>         srcu_invoke_callbacks
> >>             ...
> >>                                                           srcu_read_unlock
> >>                                                           cleanup_srcu_struct
> >>                                                               flush_delayed_work
> >>         srcu_reschedule
> >>             queue_delayed_work
> >>
> >> Now flush_delayed_work returns but srcu_reschedule will *not* have cleared
> >> sp->running to false.

But srcu_reschedule() sets sp->running to false if there are no callbacks.
And at that point, there had better be no callbacks.

> >> Not-tested-by: Paolo Bonzini <pbonzini@redhat.com>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>
> >> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> >> index 9b9cdd549caa..9470f1ba2ef2 100644
> >> --- a/kernel/rcu/srcu.c
> >> +++ b/kernel/rcu/srcu.c
> >> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
> >>  {
> >>         if (WARN_ON(srcu_readers_active(sp)))
> >>                 return; /* Leakage unless caller handles error. */
> >> +
> >> +       /*
> >> +        * No readers active, so any pending callbacks will rush through the two
> >> +        * batches before sp->running becomes false.  No risk of busy-waiting.
> >> +        */
> >> +       while (sp->running)
> >> +               flush_delayed_work(&sp->work);
> > 
> > Unsynchronized accesses to shared state make me nervous. running is
> > meant to be protected with sp->queue_lock.
> 
> I think it could just be
> 
> 	while (flush_delayed_work(&sp->work));
> 
> but let's wait for Paul.

If it needs to be more than just a single flush_delayed_work(), we have
some other bug somewhere.  ;-)

							Thanx, Paul

> Paolo
> 
> > At least we will get back to you with a KTSAN report.
> > 
> >>         free_percpu(sp->per_cpu_ref);
> >>         sp->per_cpu_ref = NULL;
> >>  }
> >>
> >>
> >> Thanks,
> >>
> >> Paolo
> >>
> >> --
> >> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> >> For more options, visit https://groups.google.com/d/optout.
> > 
>

next prev parent reply	other threads:[~2017-01-17 20:35 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-11  6:46 kvm: use-after-free in process_srcu Dmitry Vyukov
2016-12-11  8:40 ` Vegard Nossum
2016-12-11  8:49   ` Dmitry Vyukov
2017-01-13  3:30     ` Steve Rutherford
2017-01-13  9:19       ` Dmitry Vyukov
2017-01-15 17:11         ` Dmitry Vyukov
2017-01-16 21:34           ` Dmitry Vyukov
2017-01-16 21:48             ` Paolo Bonzini
2017-01-17  9:47               ` Dmitry Vyukov
2017-01-17  9:56                 ` Dmitry Vyukov
2017-01-17 11:08                   ` Paolo Bonzini
2017-01-17 11:13                     ` Dmitry Vyukov
2017-01-17 12:03                       ` Paolo Bonzini
2017-01-17 20:34                         ` Paul E. McKenney [this message]
2017-01-18  8:53                           ` Paolo Bonzini
2017-01-18 22:15                             ` Paul E. McKenney
2017-01-19  9:27                               ` Paolo Bonzini
2017-01-19 21:52                                 ` Paul McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170117203436.GC5238@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=dvyukov@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    --cc=srutherford@google.com \
    --cc=syzkaller@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.