Message-ID: <531E47BF.8080709@redhat.com>
Date: Tue, 11 Mar 2014 00:16:15 +0100
From: Laszlo Ersek
References: <1394491449-10897-1-git-send-email-mreitz@redhat.com>
 <1394491449-10897-2-git-send-email-mreitz@redhat.com>
In-Reply-To: <1394491449-10897-2-git-send-email-mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/3] qcow2: Check bs->drv in copy_sectors()
To: Max Reitz, qemu-devel@nongnu.org
Cc: Kevin Wolf, Stefan Hajnoczi

On 03/10/14 23:44, Max Reitz wrote:
> Before dereferencing bs->drv for a call to its member bdrv_co_readv(),
> copy_sectors() should check whether that pointer is indeed valid, since
> it may have been set to NULL by e.g. a concurrent write triggering the
> corruption prevention mechanism.
>
> Signed-off-by: Max Reitz
> ---
> To be precise, this still is a race condition. If bs->drv is set to NULL
> after the check and before the call to bdrv_co_readv(), QEMU will
> obviously still crash. However, in order to circumvent this behavior, we
> would probably have to re-lock s->lock, check bs->drv, take the function
> pointer to bdrv_co_readv() and then unlock s->lock before the function
> is called. I found this rather ugly and therefore this still has a very
> small chance of running into a race condition.
> Therefore, I'm asking for your opinion on this, whether we can really
> take this chance or should rather "do it right". In fact, if I were a
> reviewer, I'd probably reject this patch and request the solution with
> the function pointer (if there is no better solution), but I was afraid
> to send such an ugly patch.
> ---
>  block/qcow2-cluster.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 36c1bed..9499df9 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -380,6 +380,10 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
>
>      BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
>
> +    if (!bs->drv) {
> +        return -ENOMEDIUM;
> +    }
> +
>      /* Call .bdrv_co_readv() directly instead of using the public block-layer
>       * interface.  This avoids double I/O throttling and request tracking,
>       * which can lead to deadlock when block layer copy-on-read is enabled.
>

I can't answer your question nor review this patch -- instead, I have a
question of my own: when you say "set to NULL by [...] the corruption
prevention mechanism", do you mean

qcow2_pre_write_overlap_check():

        bs->drv = NULL; /* make BDS unusable */

If so: I thought that it was quite a bold move, but also that we'd find
the SIGSEGVs sooner or later... :)

Thanks
Laszlo
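
For illustration, the "re-lock s->lock, check bs->drv, take the function
pointer, unlock, then call" sequence described in the quoted note could be
sketched roughly as follows. This is only a sketch under stated
assumptions, not QEMU code: Device, Driver, fake_readv() and
guarded_read() are hypothetical stand-ins, and a plain pthread mutex plays
the role of s->lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM ENODEV  /* fallback for platforms without ENOMEDIUM */
#endif

/* Hypothetical stand-ins; none of these are the real QEMU types. */
typedef struct Driver {
    int (*co_readv)(void *opaque, int nb_sectors);
} Driver;

typedef struct Device {
    pthread_mutex_t lock;  /* plays the role of s->lock */
    Driver *drv;           /* plays the role of bs->drv; may become NULL */
} Device;

/* Hypothetical read callback: just reports the sector count it was given. */
static int fake_readv(void *opaque, int nb_sectors)
{
    (void)opaque;
    return nb_sectors;
}

/* Check the driver pointer and copy the callback while holding the lock,
 * then call through the local copy after unlocking. */
static int guarded_read(Device *dev, int nb_sectors)
{
    int (*readv)(void *, int) = NULL;

    assert(dev != NULL);

    pthread_mutex_lock(&dev->lock);
    if (dev->drv) {
        readv = dev->drv->co_readv;  /* snapshot taken under the lock */
    }
    pthread_mutex_unlock(&dev->lock);

    if (!readv) {
        return -ENOMEDIUM;
    }
    return readv(dev, nb_sectors);
}
```

Note that the snapshot only protects the pointer load itself. If the
callee goes on to dereference the driver pointer, or if the driver is torn
down while the call is in flight, the sketch does not help.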