From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1455470231-5223-1-git-send-email-pbonzini@redhat.com>
 <1455470231-5223-6-git-send-email-pbonzini@redhat.com>
 <56E01544.6060305@de.ibm.com>
 <56E01D3F.1060204@redhat.com>
From: Christian Borntraeger
Message-ID: <56E03333.5020601@de.ibm.com>
Date: Wed, 9 Mar 2016 15:29:07 +0100
MIME-Version: 1.0
In-Reply-To: <56E01D3F.1060204@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 5/8] virtio-blk: fix "disabled data plane" mode
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org

On 03/09/2016 01:55 PM, Paolo Bonzini wrote:
>
>
> On 09/03/2016 13:21, Christian Borntraeger wrote:
>> I have some random crashes at startup
>>
>> Stack trace of thread 48326:
>> #0  0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x)
>> #1  0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x)
>> #2  0x000003ffbc35150a __makecontext_ret (libc.so.6)
>>
>>
>> that I was able to bisect.
>> commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly,
>> 2906cddfecff21af20eedab43288b485a679f9ac^ does not.
>>
>> I will try to find somebody that looks into that - unless you have an idea.
>
> The only random idea is to move
>
>     vblk->dataplane_started = true
>
> to the beginning of virtio_blk_data_plane_start rather than the end.
>
> Paolo

FWIW, it seems that this patch triggers this error, the "tracked_request_begin"
that I reported yesterday and / or some early read issues from the bootloader
in a random fashion. Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to
work all the time, moving around vblk->dataplane_started = true also triggers
all 3 types of bugs, e.g.

Thread 1 (Thread 0x3ffaabff910 (LWP 32782)):
#0  0x0000000010329a70 in bdrv_co_do_rw (opaque=0x0) at /home/cborntra/REPOS/qemu/block/io.c:2170
#1  0x00000000103b2e7a in coroutine_trampoline (i0=1023, i1=-2147470992) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79
#2  0x000003ffac85150a in __makecontext_ret () from /lib64/libc.so.6
(gdb) list
2165
2166    /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
2167    static void coroutine_fn bdrv_co_do_rw(void *opaque)
2168    {
2169        BlockAIOCBCoroutine *acb = opaque;
2170        BlockDriverState *bs = acb->common.bs;
2171
2172        if (!acb->is_write) {
2173            acb->req.error = bdrv_co_do_readv(bs, acb->req.sector,
2174                acb->req.nb_sectors, acb->req.qiov, acb->req.flags);

I will try to find somebody to work on this.
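
For readers following the thread, here is a toy model of what "set the flag at the beginning rather than the end" can change in general. It is not QEMU code and does not claim to explain the crash above. All names (ToyDev, toy_start, toy_handle_request, toy_run) are hypothetical, and the assumption that the setup step re-enters the request handler is made up purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device whose backend is started lazily. */
typedef struct ToyDev {
    bool started;     /* plays the role of a "started" guard flag */
    bool flag_first;  /* true: set the flag before setup; false: after */
    int setup_runs;   /* how many times the setup body was executed */
} ToyDev;

static void toy_start(ToyDev *d);

/* Request handler: starts the backend on first use. */
static void toy_handle_request(ToyDev *d)
{
    assert(d != NULL);
    if (!d->started) {
        toy_start(d);
    }
}

static void toy_start(ToyDev *d)
{
    assert(d != NULL);
    if (d->started) {
        return;
    }
    if (d->flag_first) {
        d->started = true;          /* guard set before any setup work */
    }

    d->setup_runs++;

    /*
     * Assumption for illustration only: the first setup pass delivers a
     * pending request, which re-enters the handler while setup is still
     * in progress.
     */
    if (d->setup_runs == 1) {
        toy_handle_request(d);
    }

    if (!d->flag_first) {
        d->started = true;          /* guard set only after setup is done */
    }
}

/* Runs one start sequence and reports how often the setup body executed. */
static int toy_run(bool flag_first)
{
    ToyDev d = { .started = false, .flag_first = flag_first, .setup_runs = 0 };
    toy_handle_request(&d);
    return d.setup_runs;
}
```

In this model toy_run(false) executes the setup body twice, because the re-entrant request still sees started == false, while toy_run(true) executes it once. Whether anything comparable happens in virtio_blk_data_plane_start is not established by this thread.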