From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50085) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aiL9g-0006Hw-Ta for qemu-devel@nongnu.org; Tue, 22 Mar 2016 08:11:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aiL9d-0003hh-MC for qemu-devel@nongnu.org; Tue, 22 Mar 2016 08:11:24 -0400 References: <56E93A22.1080102@de.ibm.com> <56E93ECE.10103@redhat.com> <56E9425C.8030201@de.ibm.com> <56E957AD.2050005@redhat.com> <56E961EA.4090908@de.ibm.com> <56E9638B.5090204@redhat.com> <20160317003906.GA23821@ad.usersys.redhat.com> <56EA8EEE.2020801@linux.vnet.ibm.com> <20160321105718.GA7710@ad.usersys.redhat.com> <56F0EFCA.4080003@linux.vnet.ibm.com> <20160322071804.GC24999@ad.usersys.redhat.com> <20160322100706.597dab1b.cornelia.huck@de.ibm.com> <56F11492.5040103@redhat.com> <20160322125958.13afc4e7.cornelia.huck@de.ibm.com> From: Paolo Bonzini Message-ID: <56F13659.1020605@redhat.com> Date: Tue, 22 Mar 2016 13:11:05 +0100 MIME-Version: 1.0 In-Reply-To: <20160322125958.13afc4e7.cornelia.huck@de.ibm.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Cornelia Huck Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, "Michael S. Tsirkin" , qemu-devel@nongnu.org, Christian Borntraeger , tu bo , Stefan Hajnoczi On 22/03/2016 12:59, Cornelia Huck wrote: >> > They can be fixed with just an extra object_ref/object_unref. >> > >> > I didn't understand that Tu Bo also needed the BH fix, and with that >> > information it makes sense. Passing the assign value ensures that >> > ioeventfd remains always assigned. With the CPU threads out of the >> > picture, the BH becomes enough to make everything thread-safe. > Yes, this makes sense. > > Might we still have a hole somewhere in dataplane teardown? Probably > not, from reading the code, even if it runs in cpu thread context. The bug arises when the main thread sets started = true, a CPU thread comes along while the ioeventfd is reset, and as soon as the BQL is released by the main thread the CPU thread thinks it is a dataplane thread. This does horrible things to non-reentrant code. For stop we should be safe because the CPU thread is the one that sets started = false. IOW, we should be safe as long as the ioeventfd is never unassigned (your fix) _and_ we ensure serialization between threads that touch dataplane_started (Fam's fix). Paolo