From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Paul E. McKenney" Subject: Re: kvm: use-after-free in process_srcu Date: Tue, 17 Jan 2017 12:34:36 -0800 Message-ID: <20170117203436.GC5238@linux.vnet.ibm.com> References: <754246063.9562871.1484603305281.JavaMail.zimbra@redhat.com> Reply-To: paulmck@linux.vnet.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Dmitry Vyukov , Steve Rutherford , syzkaller , Radim =?utf-8?B?S3LEjW3DocWZ?= , KVM list , LKML To: Paolo Bonzini Return-path: Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:44487 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751381AbdAQUfI (ORCPT ); Tue, 17 Jan 2017 15:35:08 -0500 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v0HKXsxY022474 for ; Tue, 17 Jan 2017 15:34:41 -0500 Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.150]) by mx0b-001b2d01.pphosted.com with ESMTP id 281qd8832r-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Tue, 17 Jan 2017 15:34:41 -0500 Received: from localhost by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 17 Jan 2017 13:34:40 -0700 Content-Disposition: inline In-Reply-To: Sender: kvm-owner@vger.kernel.org List-ID: On Tue, Jan 17, 2017 at 01:03:28PM +0100, Paolo Bonzini wrote: > > > On 17/01/2017 12:13, Dmitry Vyukov wrote: > > On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini wrote: > >> > >> > >> On 17/01/2017 10:56, Dmitry Vyukov wrote: > >>>> I am seeing use-after-frees in process_srcu as struct srcu_struct is > >>>> already freed. Before freeing struct srcu_struct, code does > >>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do: > >>>> > >>>> + srcu_barrier(&kvm->irq_srcu); > >>>> cleanup_srcu_struct(&kvm->irq_srcu); > >>>> > >>>> It reduced rate of use-after-frees, but did not eliminate them > >>>> completely. The full threaded is here: > >>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ > >>>> > >>>> Does Paolo's fix above make sense to you? Namely adding > >>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()? Yes, we do need a flush_delayed_work(), good catch! But doing multiple of them should not be necessary because there shouldn't be any callbacks at all once the srcu_barrier() returns, and the only time SRCU queues more work is if there is at least one callback pending. The code is making sure that no new call_srcu() invocations happen before it does the srcu_barrier(), right? So if you are seing failures even with the single flush_delayed_work(), it would be interesting to set a flag in the srcu_struct at cleanup_srcu_struct time, and then splat if srcu_reschedule() does its queue_delayed_work() when that flag is set. > >>> I am not sure about interaction of flush_delayed_work and > >>> srcu_reschedule... flush_delayed_work probably assumes that no work is > >>> queued concurrently, but what if srcu_reschedule queues another work > >>> concurrently... can't it happen that flush_delayed_work will miss that > >>> newly scheduled work? > >> > >> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is > > > > I mean not srcu callbacks, but the sp->work being rescheduled. > > Consider that callbacks are already scheduled. We call > > flush_delayed_work, it waits for completion of process_srcu. But that > > process_srcu schedules sp->work again in srcu_reschedule. It only does this if there are callbacks still on the srcu_struct, so if you are seeing this, we either have a bug in SRCU that finds callbacks when none are present or we have a usage bug that is creating new callbacks after src_barrier() starts. Do any of your callback functions invoke call_srcu()? (Hey, I have to ask!) > >> indeed insufficient. Because of SRCU's two-phase algorithm, it's possible > >> that the first flush_delayed_work doesn't invoke all callbacks. Instead I > >> would propose this (still untested, but this time with a commit message): > >> > >> ---------------- 8< -------------- > >> From: Paolo Bonzini > >> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up" > >> > >> Even though there are no concurrent readers, it is possible that the > >> work item is queued for delayed processing when cleanup_srcu_struct is > >> called. The work item needs to be flushed before returning, or a > >> use-after-free can ensue. > >> > >> Furthermore, because of SRCU's two-phase algorithm it may take up to > >> two executions of srcu_advance_batches before all callbacks are invoked. > >> This can happen if the first flush_delayed_work happens as follows > >> > >> srcu_read_lock > >> process_srcu > >> srcu_advance_batches > >> ... > >> if (!try_check_zero(sp, idx^1, trycount)) > >> // there is a reader > >> return; > >> srcu_invoke_callbacks > >> ... > >> srcu_read_unlock > >> cleanup_srcu_struct > >> flush_delayed_work > >> srcu_reschedule > >> queue_delayed_work > >> > >> Now flush_delayed_work returns but srcu_reschedule will *not* have cleared > >> sp->running to false. But srcu_reschedule() sets sp->running to false if there are no callbacks. And at that point, there had better be no callbacks. > >> Not-tested-by: Paolo Bonzini > >> Signed-off-by: Paolo Bonzini > >> > >> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c > >> index 9b9cdd549caa..9470f1ba2ef2 100644 > >> --- a/kernel/rcu/srcu.c > >> +++ b/kernel/rcu/srcu.c > >> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp) > >> { > >> if (WARN_ON(srcu_readers_active(sp))) > >> return; /* Leakage unless caller handles error. */ > >> + > >> + /* > >> + * No readers active, so any pending callbacks will rush through the two > >> + * batches before sp->running becomes false. No risk of busy-waiting. > >> + */ > >> + while (sp->running) > >> + flush_delayed_work(&sp->work); > > > > Unsynchronized accesses to shared state make me nervous. running is > > meant to be protected with sp->queue_lock. > > I think it could just be > > while (flush_delayed_work(&sp->work)); > > but let's wait for Paul. If it needs to be more than just a single flush_delayed_work(), we have some other bug somewhere. ;-) Thanx, Paul > Paolo > > > At least we will get back to you with a KTSAN report. > > > >> free_percpu(sp->per_cpu_ref); > >> sp->per_cpu_ref = NULL; > >> } > >> > >> > >> Thanks, > >> > >> Paolo > >> > >> -- > >> You received this message because you are subscribed to the Google Groups "syzkaller" group. > >> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com. > >> For more options, visit https://groups.google.com/d/optout. > > >