From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59730)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1ZQFQa-0002NX-RN
	for qemu-devel@nongnu.org; Fri, 14 Aug 2015 09:53:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1ZQFQZ-0001EA-FH
	for qemu-devel@nongnu.org; Fri, 14 Aug 2015 09:53:48 -0400
Date: Fri, 14 Aug 2015 09:53:37 -0400
From: Jeff Cody
Message-ID: <20150814135337.GK9878@localhost.localdomain>
References: <1439455310-11263-1-git-send-email-kwolf@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1439455310-11263-1-git-send-email-kwolf@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
To: Kevin Wolf
Cc: famz@redhat.com, qemu-block@nongnu.org, qemu-devel@nongnu.org,
	qemu-stable@nongnu.org, stefanha@redhat.com, pbonzini@redhat.com

On Thu, Aug 13, 2015 at 10:41:50AM +0200, Kevin Wolf wrote:
> This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
> write on target if sectors not allocated"), which was reported to cause
> aborts with the message "Co-routine re-entered recursively".
>
> The cause for this bug is the following code in mirror_iteration_done():
>
>     if (s->common.busy) {
>         qemu_coroutine_enter(s->common.co, NULL);
>     }
>
> This has always been ugly because - unlike most places that reenter - it
> doesn't have a specific yield that it pairs with, but is more
> uncontrolled. What we really mean here is "reenter the coroutine if
> it's in one of the four explicit yields in mirror.c".
>
> This used to be equivalent with s->common.busy because neither
> mirror_run() nor mirror_iteration() call any function that could yield.
> However since commit dcfb3beb this doesn't hold true any more:
> bdrv_get_block_status_above() can yield.
>
> So what happens is that bdrv_get_block_status_above() wants to take a
> lock that is already held, so it adds itself to the queue of waiting
> coroutines and yields. Instead of being woken up by the unlock function,
> however, it gets woken up by mirror_iteration_done(), which is obviously
> wrong.
>
> In most cases the code actually happens to cope fairly well with such
> cases, but in this specific case, the unlock must already have scheduled
> the coroutine for wakeup when mirror_iteration_done() reentered it. And
> then the coroutine happened to process the scheduled restarts and tried
> to reenter itself recursively.
>
> This patch fixes the problem by pairing the reenter in
> mirror_iteration_done() with specific yields instead of abusing
> s->common.busy.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Kevin Wolf

Thanks, applied to my block branch:
https://github.com/codyprime/qemu-kvm-jtc/tree/block

-Jeff
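
For readers following the thread, the idea in the quoted commit message can
be modeled in a few lines of plain C. This is only a sketch under stated
assumptions, not the code from the patch: struct job_model, enum yield_point,
the in_explicit_yield flag and all function names are hypothetical, and no
real QEMU coroutine API is used. It contrasts a completion check keyed on
"busy" with one keyed on a flag that is set only around the job's own yields.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the modeled job coroutine is currently parked (hypothetical). */
enum yield_point {
    NOT_YIELDED,     /* running, or not parked anywhere */
    EXPLICIT_YIELD,  /* one of the job's own yields, waiting for I/O */
    LOCK_WAIT        /* yielded inside a callee, queued on a held lock */
};

/* Minimal stand-in for the job state; not the real QEMU structure. */
struct job_model {
    bool busy;                /* true for the whole time the job is working */
    enum yield_point parked;  /* ground truth about where the job yielded */
    bool in_explicit_yield;   /* hypothetical flag paired with explicit yields */
};

/* The job yields at one of its own, explicit yield points. */
void job_explicit_yield(struct job_model *j)
{
    j->in_explicit_yield = true;
    j->parked = EXPLICIT_YIELD;
}

/* A callee yields while waiting for a lock; the job's flag is untouched. */
void job_lock_wait(struct job_model *j)
{
    j->parked = LOCK_WAIT;
}

/* The job is running again after being woken up. */
void job_resume(struct job_model *j)
{
    j->in_explicit_yield = false;
    j->parked = NOT_YIELDED;
}

/* Old-style check: reenter whenever the job is busy. */
bool completion_reenters_old(const struct job_model *j)
{
    return j->busy;
}

/* Paired check: reenter only if the job sits in an explicit yield. */
bool completion_reenters_new(const struct job_model *j)
{
    return j->in_explicit_yield;
}

/* In this model, reentry from a completion is valid only at an explicit yield. */
bool reentry_is_valid(const struct job_model *j)
{
    return j->parked == EXPLICIT_YIELD;
}
```

In this model, a job that is busy but parked in a lock wait makes
completion_reenters_old() return true even though reentry_is_valid() is
false, while completion_reenters_new() returns false. That is the
distinction the commit message draws between s->common.busy and "one of the
four explicit yields".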