From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Grundmann, Christian" <Christian.Grundmann@fabasoft.com>
Cc: "'qemu-devel@nongnu.org'" <qemu-devel@nongnu.org>,
	"stefanha@redhat.com" <stefanha@redhat.com>
Subject: Re: [Qemu-devel] WG: [ovirt-users] Segmentation fault in libtcmalloc
Date: Tue, 17 Nov 2015 11:36:02 +0000
Message-ID: <20151117113601.GD2498@work-vm>
In-Reply-To: <6A17C71B52524C408E7AAF69103E9E490F153F45@fabamailserver.fabagl.fabasoft.com>

* Grundmann, Christian (Christian.Grundmann@fabasoft.com) wrote:
> Hi,
> 
> @ Can you please use a 'thread apply all bt full'? The 'full' gives a little more info.
> 
> gdb --batch /usr/libexec/qemu-kvm core.52281.1447709011.dump -ex "set pagination off" -ex "thread apply all bt full"

OK, it doesn't really give any more without the debuginfo package mentioned below.

<snip>

> @ Also, if you've not already got it installed, can you please install the debuginfo package for qemu? It gives a lot more information in backtraces.
> Sorry, it's an ovirt-node system where I can't use yum.

Ah. Although if you copied the core dump onto another machine with the matching qemu and debuginfo
packages installed, you should be able to get more detail.
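
If several cores need processing on that other machine, the gdb invocation quoted above could be scripted. This is only a minimal Python sketch, assuming gdb and the matching debuginfo are already installed there; the helper names `gdb_batch_args` and `dump_backtrace` are made up for illustration.

```python
import subprocess


def gdb_batch_args(binary, core, full=True):
    """Build the argv for a non-interactive 'thread apply all bt' run."""
    backtrace = "thread apply all bt full" if full else "thread apply all bt"
    return [
        "gdb", "--batch", binary, core,
        "-ex", "set pagination off",
        "-ex", backtrace,
    ]


def dump_backtrace(binary, core, full=True):
    """Run gdb on one core file and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_batch_args(binary, core, full),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Calling `dump_backtrace("/usr/libexec/qemu-kvm", "core.52281.1447709011.dump")` would then run the same command as in the quoted mail, provided both files exist on the analysis machine.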

> @ Does this part always look the same in your backtraces?
> Most are the same; I found one that is a little bit different:
> Thread 1 (Thread 0x7f378a0d7c00 (LWP 6658)):
> #0  0x00007f3785d18353 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #1  0x00007f3785d186b0 in tcmalloc::ThreadCache::Scavenge() () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #2  0x00007f3785d27057 in tc_free () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #3  0x00007f37885e858f in g_free () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #4  0x00007f37885fec89 in g_slice_free1 () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #5  0x00007f378a1f232e in virtio_blk_rw_complete ()
> No symbol table info available.
> #6  0x00007f378a39f1ae in bdrv_co_em_bh ()
> No symbol table info available.
> #7  0x00007f378a398394 in aio_bh_poll ()
> No symbol table info available.
> #8  0x00007f378a3a7409 in aio_dispatch_clients ()
> No symbol table info available.
> #9  0x00007f378a39820e in aio_ctx_dispatch ()
> No symbol table info available.
> #10 0x00007f37885e299a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #11 0x00007f378a3a6288 in main_loop_wait ()
> No symbol table info available.
> #12 0x00007f378a1a5a4e in main ()
> No symbol table info available.
> 

OK, that's a bit different but interesting....
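
One way to check whether the backtraces "always look the same" is to reduce each one to its list of function names and compare the outermost frames. The following is a rough Python sketch, not a tool anyone in this thread used; `frame_functions` and `shared_callers` are hypothetical names, and the regular expression only assumes gdb's usual `#N  0xADDR in func (...)` frame layout. The two samples are the last four frames of the free-path and malloc-path traces quoted in this thread.

```python
import re

# Matches gdb frame lines such as
#   "#2  0x00007f3785d27057 in tc_free () from /lib64/libtcmalloc.so.4"
# Leading "> " email quote markers are tolerated; the address is optional.
FRAME_RE = re.compile(r"^[>\s]*#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(.+?)\s*\(")


def frame_functions(backtrace_text):
    """Return the function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


def shared_callers(trace_a, trace_b):
    """Return the outermost frames (callers) that both backtraces share."""
    shared = []
    for a, b in zip(reversed(trace_a), reversed(trace_b)):
        if a != b:
            break
        shared.append(a)
    return list(reversed(shared))


# Excerpts (outermost four frames) of the two traces from this thread.
FREE_PATH = """\
#9  0x00007f378a39820e in aio_ctx_dispatch ()
#10 0x00007f37885e299a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#11 0x00007f378a3a6288 in main_loop_wait ()
#12 0x00007f378a1a5a4e in main ()
"""
MALLOC_PATH = """\
#4  0x00007fa8b17cbffd in virtio_blk_handle_output ()
#5  0x00007fa8b197e6b6 in qemu_iohandler_poll ()
#6  0x00007fa8b197e296 in main_loop_wait ()
#7  0x00007fa8b177da4e in main ()
"""

print(shared_callers(frame_functions(FREE_PATH), frame_functions(MALLOC_PATH)))
# prints ['main_loop_wait', 'main']
```

For these two traces only the main loop is common, so the crashes are reached through different callers even though both end inside libtcmalloc.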

> @  1) Was there anything nasty in the /var/log/libvirt/qemu/yourvmname.log ?
> No nothing abnormal
> 
> @  2) Did you hit any IO errors and need to tell the VM to continue after a problem?
> oVirt tells me "no Storage space error", which I think means something like the disk is growing too fast. I use snapshots, so under heavy writes the disk has to grow a lot.
> Sometimes the VM is paused and resumed by oVirt. Sometimes the VM stays offline.

OK, that's interesting, because you may be hitting the following bug:
http://lists.nongnu.org/archive/html/qemu-block/2015-11/msg00585.html

whose fix coincidentally just got accepted today; it's related to error cases with the
werror=stop/rerror=stop drive settings, which you are using.

Do you think you're only hitting these crashes on VMs that have been paused because of these space errors?
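
To see which VMs run with the stop-on-error policy at all, the `-drive` options of each qemu command line can be inspected. Below is a small Python sketch under simplifying assumptions: `drive_error_policies` is a hypothetical helper, it ignores qemu's `,,` escaping of commas inside option values, and it reports only settings that are spelled out explicitly (it does not know qemu's defaults). The disk path in the example is a placeholder, not the real image path.

```python
import shlex


def drive_error_policies(command_line):
    """Map each -drive id to its explicit (werror, rerror) settings.

    A setting that is absent from the command line is reported as None.
    """
    args = shlex.split(command_line)
    policies = {}
    # Pair every argument with its successor to find "-drive <options>".
    for flag, value in zip(args, args[1:]):
        if flag != "-drive":
            continue
        opts = dict(
            part.split("=", 1) for part in value.split(",") if "=" in part
        )
        policies[opts.get("id", "?")] = (opts.get("werror"), opts.get("rerror"))
    return policies


# Shortened command line; /path/to/disk stands in for the real image path.
example = (
    "/usr/libexec/qemu-kvm -name myvmname "
    "-drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= "
    "-drive file=/path/to/disk,if=none,id=drive-virtio-disk0,format=qcow2,"
    "cache=none,werror=stop,rerror=stop,aio=native"
)

for drive_id, (werror, rerror) in drive_error_policies(example).items():
    print(drive_id, "werror =", werror, "rerror =", rerror)
```

With the command line from this report, the virtio disk shows `stop` for both settings, while the empty CD-ROM drive has neither set.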

> @  3) If this is pretty repeatable, then it would be interesting to try changing to a different
>      disk emulation and see if the problem goes away - e.g. virtio-scsi would be a good one to try.
> 
> Ok will try that and report

Thanks,

Dave

> 
> Thx Christian
> 
> 
> -----Original Message-----
> From: Dr. David Alan Gilbert [mailto:dgilbert@redhat.com]
> Sent: Tuesday, 17 November 2015 10:59
> To: Grundmann, Christian <Christian.Grundmann@fabasoft.com>
> Cc: 'qemu-devel@nongnu.org' <qemu-devel@nongnu.org>; stefanha@redhat.com
> Subject: Re: [Qemu-devel] WG: [ovirt-users] Segmentation fault in libtcmalloc
> 
> * Grundmann, Christian (Christian.Grundmann@fabasoft.com) wrote:
> > Hi,
> > Dan sent me over to you.
> > Please let me know if I can provide additional information.
> 
> Hi Christian,
>   Thanks for reporting this,
> 
> > Software versions:
> > ovirt-node-iso-3.6-0.999.201510221942.el7.centos.iso
> > 
> > qemu-img-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-tools-ev-2.3.0-29.1.el7.x86_64
> > ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
> > kernel-3.10.0-229.14.1.el7.x86_64
> > gperftools-libs-2.4-7.el7.x86_64
> > 
> > Commandline:
> > /usr/libexec/qemu-kvm -name myvmname -S
> >   -machine rhel6.5.0,accel=kvm,usb=off
> >   -cpu Westmere -m 7168 -realtime mlock=off
> >   -smp 2,maxcpus=16,sockets=16,cores=1,threads=1
> >   -uuid 5b6b8899-5a9d-4c07-a6aa-6171527ad319
> >   -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=3.6-0.999.201510221942.el7.centos,serial=30343536-3138-5A43-4A34-323630303253,uuid=5b6b8899-5a9d-4c07-a6aa-6171527ad319
> >   -nographic -no-user-config -nodefaults
> >   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/myvmname.monitor,server,nowait
> >   -mon chardev=charmonitor,id=monitor,mode=control
> >   -rtc base=2015-11-15T20:04:35,driftfix=slew
> >   -global kvm-pit.lost_tick_policy=discard
> >   -no-hpet -no-shutdown -boot strict=on
> >   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
> >   -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4
> >   -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> >   -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> >   -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> >   -drive file=/rhev/data-center/00000002-0002-0002-0002-0000000000e2/5df61b84-8746-4460-b148-65cc0eb8d29c/images/8202b81d-6191-495f-8c9d-7d90baffaecf/d7665e07-1786-4051-aa26-0a3e1c9d2574,if=none,id=drive-virtio-disk0,format=qcow2,serial=8202b81d-6191-495f-8c9d-7d90baffaecf,cache=none,werror=stop,rerror=stop,aio=native
> >   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> >   -netdev tap,fd=39,id=hostnet0,vhost=on,vhostfd=65
> >   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:83:a2:0e,bus=pci.0,addr=0x3
> >   -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/5b6b8899-5a9d-4c07-a6aa-6171527ad319.com.redhat.rhevm.vdsm,server,nowait
> >   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> >   -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/5b6b8899-5a9d-4c07-a6aa-6171527ad319.org.qemu.guest_agent.0,server,nowait
> >   -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> >   -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
> >   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> >   -msg timestamp=on
> > 
> > Stack Trace:
> > 
> > gdb --batch /usr/libexec/qemu-kvm core.14750.1447544080.dump -ex "set pagination off" -ex "thread apply all bt"
> 
> Can you please use a 'thread apply all bt full'? The 'full' gives a little more info.
> Also, if you've not already got it installed, can you please install the debuginfo package for qemu? It gives a lot more information in backtraces.
> 
> > Thread 1 (Thread 0x7fa8b16afc00 (LWP 14750)):
> > #0  0x00007fa8ad2febe1 in tc_malloc () from /lib64/libtcmalloc.so.4
> > #1  0x00007fa8b186b489 in malloc_and_trace ()
> > #2  0x00007fa8afbc047f in g_malloc () from /lib64/libglib-2.0.so.0
> > #3  0x00007fa8afbd666e in g_slice_alloc () from /lib64/libglib-2.0.so.0
> > #4  0x00007fa8b17cbffd in virtio_blk_handle_output ()
> > #5  0x00007fa8b197e6b6 in qemu_iohandler_poll ()
> > #6  0x00007fa8b197e296 in main_loop_wait ()
> > #7  0x00007fa8b177da4e in main ()
> 
> Does this part always look the same in your backtraces?
> The segfault in tc_malloc is probably due to heap corruption, a double free, or similar. It can be a bit tricky to find out what caused it, though, since the corruption might have happened some time before the place where it crashed.
> 
> Some other ideas:
>   1) Was there anything nasty in the /var/log/libvirt/qemu/yourvmname.log ?
>   2) Did you hit any IO errors and need to tell the VM to continue after a problem?
>   3) If this is pretty repeatable, then it would be interesting to try changing to a different
>      disk emulation and see if the problem goes away - e.g. virtio-scsi would be a good one to try.
> 
> Dave
> > 
> > 
> > Thx Christian
> > 
> > -----Original Message-----
> > From: Dan Kenigsberg [mailto:danken@redhat.com]
> > Sent: Friday, 13 November 2015 20:00
> > To: Grundmann, Christian <Christian.Grundmann@fabasoft.com>
> > Cc: 'users@ovirt.org' <users@ovirt.org>
> > Subject: Re: [ovirt-users] Segmentation fault in libtcmalloc
> > 
> > On Fri, Nov 13, 2015 at 07:56:14AM +0000, Grundmann, Christian wrote:
> > > Hi,
> > > I am using "ovirt-node-iso-3.6-0.999.201510221942.el7.centos.iso"
> > > (is there something better to use?) for the nodes, and have random
> > > crashes of VMs. The dumps are always the same:
> > > 
> > > gdb --batch /usr/libexec/qemu-kvm core.45902.1447199164.dump
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib64/libthread_db.so.1".
> > > Core was generated by `/usr/libexec/qemu-kvm -name vmname -S -machine rhel6.5.0,accel=kvm,usb=o'.
> > > Program terminated with signal 11, Segmentation fault.
> > > #0  0x00007f0c559c4353 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
> > > 
> > > 
> > > Didn't have the problem with 3.5 el6 nodes, so I don't know if it's
> > > CentOS 7 or 3.6
> > 
> > Since the problem is so low-level, I'd guess it's a qemu or /lib64/libtcmalloc malloc bug, and not directly related to ovirt.
> > 
> > Please report the precise versions of qemu, kernel, libvirt and gperftools-libs to the qemu-devel mailing list, along with the complete stack trace and the qemu command line, if possible.
> > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
