From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60144)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1bDAzT-0007PB-Bo
    for qemu-devel@nongnu.org; Wed, 15 Jun 2016 09:36:20 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from )
    id 1bDAzP-0007rb-85
    for qemu-devel@nongnu.org; Wed, 15 Jun 2016 09:36:19 -0400
Received: from 12.mo6.mail-out.ovh.net ([178.32.125.228]:42095)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1bDAzO-0007qn-UF
    for qemu-devel@nongnu.org; Wed, 15 Jun 2016 09:36:15 -0400
Received: from player738.ha.ovh.net (b7.ovh.net [213.186.33.57])
    by mo6.mail-out.ovh.net (Postfix) with ESMTP id 3A4ECFF96FC
    for ; Wed, 15 Jun 2016 15:36:11 +0200 (CEST)
References: <574D9F0F.7060904@redhat.com> <574D9FBB.60100@kaod.org>
    <574DA17D.5070505@redhat.com> <575EDE86.6080201@kaod.org>
    <575EE3B7.5080209@redhat.com> <575EF0AC.20305@kaod.org>
    <575F01EE.2050208@redhat.com> <575FB9F9.4000003@kaod.org>
    <20160614083832.GC4916@noname.str.redhat.com> <57602AAC.1050304@kaod.org>
    <20160615075704.GC4566@noname.redhat.com>
From: Cédric Le Goater
Message-ID: <576159C5.8060402@kaod.org>
Date: Wed, 15 Jun 2016 15:36:05 +0200
MIME-Version: 1.0
In-Reply-To: <20160615075704.GC4566@noname.redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH] m25p80: fix test on blk_pread() return value
To: Kevin Wolf
Cc: Eric Blake, Peter Crosthwaite, Max Reitz, qemu-block@nongnu.org, qemu-devel@nongnu.org

On 06/15/2016 09:57 AM, Kevin Wolf wrote:
> On 14.06.2016 at 18:02, Cédric Le Goater wrote:
>> On 06/14/2016 10:38 AM, Kevin Wolf wrote:
>>> On 14.06.2016 at 10:02, Cédric Le Goater wrote:
>>>>>> #4 0x00007fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, req=, offset=30878208,
>>>>>>     bytes=512, qiov=0x7fa7f47fee60, flags=0)
>>>>>>     at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
>>>>>> #5 0x00007fa81c669ecb in bdrv_co_pwritev (bs=0x7fa81d4dd050, offset=8, bytes=512, qiov=0x7fa80d5191c0,
>>>>>>     flags=(BDRV_REQ_COPY_ON_READ | BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_SERIALISING | BDRV_REQ_FUA | unknown: 4278124256), flags@entry=(unknown: 0))
>>>>>>     at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1492
>>>>>
>>>>> That 'flags' value looks bogus...
>>>>>
>>>>>> #6 0x00007fa81c65e367 in blk_co_pwritev (blk=0x7fa81d4c5b60, offset=30878208, bytes=256, qiov=0x7fa80d5191c0,
>>>>>>     flags=(unknown: 0)) at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:788
>>>>>> #7 0x00007fa81c65e49b in blk_aio_write_entry (opaque=0x7fa7e849aca0)
>>>>>>     at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:977
>>>>>> #8 0x00007fa81c6c823a in coroutine_trampoline (i0=, i1=)
>>>>>>     at /home/legoater/work/qemu/qemu-ast2400-mainline.git/util/coroutine-ucontext.c:78
>>>>>> #9 0x00007fa818ea8f00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>>>>
>>>>> and we don't get anything further in the backtrace beyond coroutines, to
>>>>> see who's sending the bad parameters. I recently debugged a bogus flags
>>>>> in bdrv_aio_preadv, by hoisting an assert to occur before coroutines are
>>>>> used in blk_aio_prwv():
>>>>>
>>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg02948.html
>>>>>
>>>>> I've just posted v2 of that patch (now a 2/2 series), but in v2 no
>>>>> longer kept the assert at that point. But maybe the correct fix, and/or
>>>>> the hack for catching the bug prior to coroutines, will help you debug
>>>>> where the bad arguments are coming from.
>>>>
>>>> That does not fix the assert.
>>>>
>>>>>> #10 0x00007fa80d5189d0 in ?? ()
>>>>>> #11 0x0000000000000000 in ?? ()
>>>>>> (gdb) up 4
>>>>>> #4 0x00007fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, req=, offset=30878208,
>>>>>>     bytes=512, qiov=0x7fa7f47fee60, flags=0)
>>>>>>     at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
>>>>>> 1243 assert(!qiov || bytes == qiov->size);
>>>>>> (gdb) p *qiov
>>>>>> $1 = {iov = 0x7fa81da671d0, niov = 1, nalloc = 1, size = 256}
>>>>
>>>> So, it seems that the issue is coming from the fact that bdrv_co_pwritev()
>>>> does not handle alignments less than BDRV_SECTOR_SIZE :
>>>>
>>>>     /* TODO Lift BDRV_SECTOR_SIZE restriction in BlockDriver interface */
>>>>     uint64_t align = MAX(BDRV_SECTOR_SIZE, bs->request_alignment);
>>>>
>>>> It calls bdrv_aligned_pwritev() which does the assert :
>>>>
>>>>     assert(!qiov || bytes == qiov->size);
>>>
>>> Yes, but between these two places, there is code that should actually
>>> enforce the right alignment:
>>>
>>>     if ((offset + bytes) & (align - 1)) {
>>>         ...
>>>     }
>>>
>>> You can see in your backtrace that bdrv_aligned_pwritev() gets a
>>> different qiov than bdrv_co_pwritev() (which is local_qiov in the latter
>>> function).
>>>
>>> It's just unclear to me why this code extended bytes, but didn't add the
>>> tail_buf iovec to local_qiov.
>>
>> The gdb backtrace is bogus. It does not make sense. May be a gdb issue
>> with multithread on jessie.
>>
>> In the path tracking the tail bytes, we have :
>>
>>     if ((offset + bytes) & (align - 1)) {
>>         ...
>         if (!use_local_qiov) {
>             qemu_iovec_init(&local_qiov, qiov->niov + 1);
>             qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size);
>             use_local_qiov = true;
>         }
>>         tail_bytes = (offset + bytes) & (align - 1);
>>         qemu_iovec_add(&local_qiov, tail_buf + tail_bytes, align - tail_bytes);
>>
>>         bytes = ROUND_UP(bytes, align);
>>     }
>>
>> This is where the issue is I think. The qiov holds 256 and bytes 512.
>>
>> I have no idea how to fix that though.
>
> Added some more context above. qiov->size as passed from the device is
> already 256 bytes, which are added to local_qiov with
> qemu_iovec_concat(). And then we add another 256 from tail_buf in the
> lines that you quoted, so in theory we should end up with a properly
> aligned 256 + 256 = 512 byte qiov.

Yes.

It seems that qiov is bogus after the bdrv_aligned_preadv() call. It gets
zeroed most of the time, sometimes ->size is 1, and then qemu_iovec_concat()
constructs an awful local_qiov, which triggers the assert in
bdrv_aligned_pwritev().

How is that possible? Could it be a serialization issue?

C.
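
For reference, here is a minimal standalone sketch of the tail-padding
arithmetic discussed above. It is not the QEMU code: tail_padding(),
padded_qiov_size() and request_is_consistent() are hypothetical helper names,
and the align parameter stands in for MAX(BDRV_SECTOR_SIZE,
bs->request_alignment). It assumes the offset is already aligned, as it is in
frame #6 of the backtrace, so only the tail needs padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a QEMU function: number of padding bytes needed
 * after (offset + bytes) to reach the next multiple of align.
 * align must be a non-zero power of two. */
static uint64_t tail_padding(uint64_t offset, uint64_t bytes, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uint64_t tail_bytes = (offset + bytes) & (align - 1);
    return tail_bytes ? align - tail_bytes : 0;
}

/* Size of the local qiov once the tail padding has been appended to the
 * caller's qiov (the effect of qemu_iovec_concat() then qemu_iovec_add()
 * in the quoted code, modelled here on sizes only). */
static uint64_t padded_qiov_size(uint64_t qiov_size, uint64_t offset,
                                 uint64_t bytes, uint64_t align)
{
    return qiov_size + tail_padding(offset, bytes, align);
}

/* The condition behind the failing assert: the padded request length
 * must equal the size of the padded qiov. */
static bool request_is_consistent(uint64_t qiov_size, uint64_t offset,
                                  uint64_t bytes, uint64_t align)
{
    uint64_t padded_bytes = bytes + tail_padding(offset, bytes, align);
    return padded_bytes == padded_qiov_size(qiov_size, offset, bytes, align);
}
```

With the values from the backtrace (offset 30878208, bytes 256, align 512),
tail_padding() returns 256, so a 256-byte qiov yields a 512-byte local qiov.
If qiov->size has been clobbered to 0 or 1 before the concat,
request_is_consistent() returns false, which matches the assert firing.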