From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57870)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YsWjA-00040H-3t for qemu-devel@nongnu.org;
    Wed, 13 May 2015 09:29:40 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1YsWj5-0004Ak-9k for qemu-devel@nongnu.org;
    Wed, 13 May 2015 09:29:36 -0400
Received: from mail-wi0-f176.google.com ([209.85.212.176]:38437)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YsWj5-0004A9-1L for qemu-devel@nongnu.org;
    Wed, 13 May 2015 09:29:31 -0400
Received: by wicnf17 with SMTP id nf17so55919211wic.1 for ;
    Wed, 13 May 2015 06:29:29 -0700 (PDT)
Message-ID: <555351BA.9060807@m2r.biz>
Date: Wed, 13 May 2015 15:29:30 +0200
From: Fabio Fantoni
MIME-Version: 1.0
References: <553508EB.6030400@m2r.biz> <553636BF.9060609@m2r.biz>
    <5550C517.70602@m2r.biz> <5551C6A9.6040400@m2r.biz>
    <5551D568.4010006@m2r.biz> <55520630.1080901@m2r.biz>
In-Reply-To:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Stefano Stabellini
Cc: "qemu-devel@nongnu.org" , xen-devel@lists.xen.org, Gerd Hoffmann ,
    Jan Beulich , Anthony PERARD , spice-devel@lists.freedesktop.org

On 12/05/2015 16:44, Stefano Stabellini wrote:
> On Tue, 12 May 2015, Stefano Stabellini wrote:
>> On Tue, 12 May 2015, Fabio Fantoni wrote:
>>> On 12/05/2015 12:26, Fabio Fantoni wrote:
>>>> On 12/05/2015 11:23, Fabio Fantoni wrote:
>>>>> On 11/05/2015 17:04, Fabio Fantoni wrote:
>>>>>> On 21/04/2015 14:53, Stefano Stabellini wrote:
>>>>>>> On Tue, 21 Apr 2015, Fabio Fantoni wrote:
>>>>>>>> On 21/04/2015 12:49, Stefano Stabellini wrote:
>>>>>>>>> On Mon, 20 Apr 2015, Fabio Fantoni wrote:
>>>>>>>>>> I updated xen and qemu from xen
>>>>>>>>>> 4.5.0 with its upstream qemu included to xen 4.5.1-pre with
>>>>>>>>>> qemu upstream from stable-4.5 (changed Config.mk to use
>>>>>>>>>> revision "master").
>>>>>>>>>> After few minutes I booted windows 7 64 bit domU qemu crash,
>>>>>>>>>> tried 2 times with same result.
>>>>>>>>>>
>>>>>>>>>> In the domU's qemu log:
>>>>>>>>>>> qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
>>>>>>>>>>> `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
>>>>>>>>>>> __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) ||
>>>>>>>>>>> ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof
>>>>>>>>>>> (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) &
>>>>>>>>>>> ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) &&
>>>>>>>>>>> ((unsigned long)old_end & pagemask) == 0)' failed.
>>>>>>>>>>> Killing all inferiors
>>>>>>>>>> In attachment the full backtrace of qemu crash.
>>>>>>>>>>
>>>>>>>>>> With a fast search after I saw the backtrace I found a probable
>>>>>>>>>> cause of regression (I'm not sure):
>>>>>>>>>> http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
>>>>>>>>>> spice: make sure we don't overflow ssd->buf
>>>>>>>>>>
>>>>>>>>>> Added also qemu-devel and spice-devel as cc.
>>>>>>>>>>
>>>>>>>>>> If you need more informations/tests tell me and I'll post them.
>>>>>>>>> Maybe you could try to revert the offending commit
>>>>>>>>> (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect
>>>>>>>>> the crash?
>>>>>>>> Thanks for your reply.
>>>>>>>>
>>>>>>>> I reverted to 4.5.0 on dom0 for now on that system because I'm busy
>>>>>>>> trying to found another problem that cause very bad performance
>>>>>>>> without errors or nothing in logs :( I don't know if if xen related,
>>>>>>>> kernel related or other for now.
>>>>>>>>
>>>>>>>> About this regression with spice I'll do further tests in next days
>>>>>>>> (probably starting reverting the spice patch in qemu) but any help is
>>>>>>>> appreciated.
>>>>>>>> Based on data I have for now is possible that the problem is that
>>>>>>>> qemu try to allocate other ram or videoram after domU create but with
>>>>>>>> xen is not possible?
>>>>>>>> In the spice related patch I saw something about dynamic allocation
>>>>>>>> for example.
>>>>>>> It is probably caused by a commit in the range:
>>>>>>>
>>>>>>> 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
>>>>>>>
>>>>>>> there are only 10 commits in that range. By using git bisect you
>>>>>>> should be able to narrow it down in just 3 tests.
>>>>>> Sorry for delay, I was busy with many things, today I retried with
>>>>>> updated stable-4.5 and also reverting "spice: make sure we don't
>>>>>> overflow ssd->buf" (in a second test) but in both case regression remain
>>>>>> :(
>>>>>> Tomorrow probably I'll do other tests.
>>>>> I did another test, reverting this instead:
>>>>> http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
>>>>> And now seems I'm unable to reproduce the regression, before happen after
>>>>> few seconds up to 1-2 minutes, now I use the same domU 15-20 minutes
>>>>> without problem.
>>>>> Probably is the cause of regression even if seems strange that on unstable
>>>>> with same patch on tests of some days ago didn't happen.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Thanks for any reply and sorry for my bad english.
>>>> Bad news, qemu crash still happen even if this time in qemu log there is
>>>> another output, see attachment.
>>>> After take a look on the other patches I saw:
>>>> http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
>>>> With "Conflicts: hw/display/vga.c" in description I'll try to revert it
>>>> instead.
>>>>
>>>> Or someone can tell me another probable test I can try?
>>> Tried also to revet the patch above with same result, so I retried with qemu
>>> from 4.5.0 and seems the crash happen also in this case...I'm going crazy :(
> Sorry, I missed this bit before. The only thing I could suggest at this
> point, would be to make sure that you have a clean test environment.
> Usually this happens when you have some "leftovers" from previous broken
> tests.

I use make debball so that all files are tracked and removed on package
update.

I have now retried with the latest xen-unstable and the qemu crash did not
happen. More exactly, I used this:
https://github.com/Fantu/Xen/commits/rebase/m2r-staging

The latest test that shows the regression is based on the latest
stable-4.5, more exactly:
https://github.com/Fantu/Xen/commits/rebase/m2r-testing

Some days ago, on the same dom0 and domU, I tried the latest stable version
(which I use on only 2 production servers for now, but I did not see the
regression), more exactly:
https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5

Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2 from
unstable, and this xen configure line:
./configure --prefix=/usr --disable-blktap1 --disable-qemu-traditional --disable-rombios --with-system-seabios=/usr/share/seabios/bios-256k.bin --with-extra-qemuu-configure-args="--enable-spice --enable-usb-redir" --disable-blktap2

I suppose there is an unexpected case caused by a backport, or by one or
more patches from unstable that were missed in the backports.
With a quick look I did not find a relevant patch to try reverting. Can
anyone suggest the most probable points for a bisect and/or a patch to
revert, or must I try a full bisect 4.5.0->stable-4.5?

Thanks for any reply and sorry for my bad English.
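
On the cost of a bisect: for the 10-commit range mentioned earlier in the thread, a binary search needs roughly log2(10) test builds, so three or four. The following is only a minimal sketch of that arithmetic, not git's real algorithm. The helper name first_bad, the commit labels c1..c10 and the "bad from c7 onwards" predicate are all made up for illustration, and it assumes a linear history where every commit after the first bad one is also bad.

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of tests run).

    Assumes commits are ordered oldest to newest, the history is linear,
    the newest commit is bad, and badness never disappears once introduced.
    """
    lo, hi = 0, len(commits) - 1  # the answer always lies in commits[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-boot test of commits[mid]
        if is_bad(commits[mid]):
            hi = mid       # first bad commit is mid or older
        else:
            lo = mid + 1   # first bad commit is newer than mid
    return commits[lo], tests


# Hypothetical 10-commit range; pretend the crash was introduced in c7.
commits = [f"c{i}" for i in range(1, 11)]
culprit, tests = first_bad(commits, lambda c: int(c[1:]) >= 7)
print(culprit, tests)
```

The same reasoning applies to a full bisect 4.5.0->stable-4.5: the number of test builds grows with the logarithm of the number of commits in the range, so a larger range costs only a few extra builds.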