From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Grundmann, Christian" <Christian.Grundmann@fabasoft.com>
Cc: "'qemu-devel@nongnu.org'" <qemu-devel@nongnu.org>,
	"stefanha@redhat.com" <stefanha@redhat.com>
Subject: Re: [Qemu-devel] WG: [ovirt-users] Segmentation fault in libtcmalloc
Date: Tue, 17 Nov 2015 11:36:02 +0000
Message-ID: <20151117113601.GD2498@work-vm>
In-Reply-To: <6A17C71B52524C408E7AAF69103E9E490F153F45@fabamailserver.fabagl.fabasoft.com>

* Grundmann, Christian (Christian.Grundmann@fabasoft.com) wrote:
> Hi,
> 
> @ Can you please use a 'thread apply all bt full'   the full gives a little more info.
> 
> gdb --batch /usr/libexec/qemu-kvm core.52281.1447709011.dump -ex "set pagination off" -ex "thread apply all bt full"

OK, it doesn't really give any more without the debuginfo package mentioned below.

<snip>

> @ Also, if you've not already got it installed can you please install the debuginfo package for qemu, it gives a lot more information in backtraces.
> Sorry, it's an ovirt-node system where I can't use yum

Ah. Perhaps if you copied the core dump onto another machine with matching qemu and debuginfo
packages installed, you should be able to get more detail.
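
For illustration, a minimal sketch of doing that on a second machine, assuming it has the same qemu-kvm-ev build installed together with its debuginfo (on CentOS 7 that would typically come from debuginfo-install, which is part of yum-utils). The function name backtrace_core and the GDB override variable are hypothetical; the gdb arguments are the ones already used in this thread.

```shell
# Hypothetical helper: run the gdb batch command from this thread against
# one core file. Set GDB to something else (e.g. echo) for a dry run.
backtrace_core() {
    qemu_bin=$1    # path to the matching qemu binary
    core_file=$2   # core dump copied over from the node
    "${GDB:-gdb}" --batch "$qemu_bin" "$core_file" \
        -ex "set pagination off" \
        -ex "thread apply all bt full"
}

# Example (not run here):
#   backtrace_core /usr/libexec/qemu-kvm core.52281.1447709011.dump > bt-full.txt
```

The backtrace is only meaningful if the binary and libraries on the second machine are the same builds as on the node that produced the core.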

> @ Does this part always look the same in your backtraces?
> Most are the same; I found one that is a little bit different:
> Thread 1 (Thread 0x7f378a0d7c00 (LWP 6658)):
> #0  0x00007f3785d18353 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #1  0x00007f3785d186b0 in tcmalloc::ThreadCache::Scavenge() () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #2  0x00007f3785d27057 in tc_free () from /lib64/libtcmalloc.so.4
> No symbol table info available.
> #3  0x00007f37885e858f in g_free () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #4  0x00007f37885fec89 in g_slice_free1 () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #5  0x00007f378a1f232e in virtio_blk_rw_complete ()
> No symbol table info available.
> #6  0x00007f378a39f1ae in bdrv_co_em_bh ()
> No symbol table info available.
> #7  0x00007f378a398394 in aio_bh_poll ()
> No symbol table info available.
> #8  0x00007f378a3a7409 in aio_dispatch_clients ()
> No symbol table info available.
> #9  0x00007f378a39820e in aio_ctx_dispatch ()
> No symbol table info available.
> #10 0x00007f37885e299a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> No symbol table info available.
> #11 0x00007f378a3a6288 in main_loop_wait ()
> No symbol table info available.
> #12 0x00007f378a1a5a4e in main ()
> No symbol table info available.
> 

OK, that's a bit different but interesting....
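
To answer the "does it always look the same" question across many dumps, one option is to tally the function in frame #0 of each saved backtrace. A rough sketch, assuming the gdb output for each dump was saved to a text file; the name top_frames is hypothetical, and the sed pattern only handles frame lines shaped like the ones quoted above. Note that with "thread apply all" output it counts frame #0 of every thread, not just the crashing one.

```shell
# Hypothetical helper: read gdb backtrace text on stdin (or from the files
# given as arguments) and count how often each function is in frame #0.
top_frames() {
    # Frame lines look like "#0  0x00007f... in FUNC (args) () from LIB";
    # the address part is treated as optional. Keep only FUNC.
    sed -n 's/^#0  *\(0x[0-9a-f]* in \)\{0,1\}\([^ (]*\).*/\2/p' "$@" |
        sort | uniq -c | sort -rn
}

# Example (not run here):
#   top_frames bt-*.txt
```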

> @  1) Was there anything nasty in the /var/log/libvirt/qemu/yourvmname.log ?
> No nothing abnormal
> 
> @  2) Did you hit any IO errors and need to tell the VM to continue after a problem?
> oVirt tells me "no Storage space error", which I think means something like the disk is growing too fast. I use snapshots, so under heavy writes the disk has to grow a lot.
> Sometimes the VM is paused and resumed by oVirt. Sometimes the VM stays offline.

OK, that's interesting, because you may be hitting the following bug:
http://lists.nongnu.org/archive/html/qemu-block/2015-11/msg00585.html

The fix for it coincidentally got accepted today; it's related to error cases with
werror=stop/rerror=stop, which you are using.

Do you think you're only hitting these crashes on VMs that have been paused because of these space errors?
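
One way to check this would be to line up the crash times with the pause/resume events that oVirt logged. A small sketch, assuming the core files follow the core.PID.EPOCH.dump naming seen in this thread and that the second number really is a Unix timestamp (an assumption, not confirmed here). core_pid and core_epoch are hypothetical helper names, and "date -d @N" is GNU date syntax.

```shell
# Hypothetical helpers for names like core.52281.1447709011.dump.
core_pid() {
    name=${1##*/}         # strip any directory part
    name=${name#core.}    # drop the "core." prefix
    printf '%s\n' "${name%%.*}"
}

core_epoch() {
    name=${1##*/}
    name=${name#core.*.}  # drop "core.PID."
    printf '%s\n' "${name%%.*}"
}

# Example (not run here): print PID and UTC crash time for each dump.
#   for f in core.*.dump; do
#       printf '%s pid=%s ' "$f" "$(core_pid "$f")"
#       date -u -d "@$(core_epoch "$f")" '+%Y-%m-%d %H:%M:%S UTC'
#   done
```

The resulting times can then be compared by hand against the timestamps of the "no Storage space error" pauses.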

>      disk emulation and see if the problem goes away - e.g. virtio-scsi would be a good one to try.
> 
> Ok will try that and report

Thanks,

Dave

> 
> Thx Christian
> 
> 
> -----Original Message-----
> From: Dr. David Alan Gilbert [mailto:dgilbert@redhat.com]
> Sent: Tuesday, 17 November 2015 10:59
> To: Grundmann, Christian <Christian.Grundmann@fabasoft.com>
> Cc: 'qemu-devel@nongnu.org' <qemu-devel@nongnu.org>; stefanha@redhat.com
> Subject: Re: [Qemu-devel] WG: [ovirt-users] Segmentation fault in libtcmalloc
> 
> * Grundmann, Christian (Christian.Grundmann@fabasoft.com) wrote:
> > Hi,
> > Dan sent me over to you,
> > please let me know if I can provide additional information
> 
> Hi Christian,
>   Thanks for reporting this,
> 
> > Softwareversions:
> > ovirt-node-iso-3.6-0.999.201510221942.el7.centos.iso
> > 
> > qemu-img-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64
> > qemu-kvm-tools-ev-2.3.0-29.1.el7.x86_64
> > ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
> > kernel-3.10.0-229.14.1.el7.x86_64
> > gperftools-libs-2.4-7.el7.x86_64
> > 
> > Commandline:
> > /usr/libexec/qemu-kvm
> > -name myvmname
> > -S
> > -machine rhel6.5.0,accel=kvm,usb=off
> > -cpu Westmere
> > -m 7168
> > -realtime mlock=off
> > -smp 2,maxcpus=16,sockets=16,cores=1,threads=1
> > -uuid 5b6b8899-5a9d-4c07-a6aa-6171527ad319
> > -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=3.6-0.999.201510221942.el7.centos,serial=30343536-3138-5A43-4A34-323630303253,uuid=5b6b8899-5a9d-4c07-a6aa-6171527ad319
> > -nographic
> > -no-user-config
> > -nodefaults
> > -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/myvmname.monitor,server,nowait
> > -mon chardev=charmonitor,id=monitor,mode=control
> > -rtc base=2015-11-15T20:04:35,driftfix=slew
> > -global kvm-pit.lost_tick_policy=discard
> > -no-hpet
> > -no-shutdown
> > -boot strict=on
> > -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
> > -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4
> > -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> > -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> > -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> > -drive file=/rhev/data-center/00000002-0002-0002-0002-0000000000e2/5df61b84-8746-4460-b148-65cc0eb8d29c/images/8202b81d-6191-495f-8c9d-7d90baffaecf/d7665e07-1786-4051-aa26-0a3e1c9d2574,if=none,id=drive-virtio-disk0,format=qcow2,serial=8202b81d-6191-495f-8c9d-7d90baffaecf,cache=none,werror=stop,rerror=stop,aio=native
> > -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> > -netdev tap,fd=39,id=hostnet0,vhost=on,vhostfd=65
> > -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:83:a2:0e,bus=pci.0,addr=0x3
> > -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/5b6b8899-5a9d-4c07-a6aa-6171527ad319.com.redhat.rhevm.vdsm,server,nowait
> > -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> > -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/5b6b8899-5a9d-4c07-a6aa-6171527ad319.org.qemu.guest_agent.0,server,nowait
> > -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> > -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
> > -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> > -msg timestamp=on
> > 
> > Stack Trace:
> > 
> > gdb --batch /usr/libexec/qemu-kvm core.14750.1447544080.dump -ex "set pagination off" -ex "thread apply all bt"
> 
> Can you please use a 'thread apply all bt full'   the full gives a little more info.
> Also, if you've not already got it installed can you please install the debuginfo package for qemu, it gives a lot more information in backtraces.
> 
> > Thread 1 (Thread 0x7fa8b16afc00 (LWP 14750)):
> > #0  0x00007fa8ad2febe1 in tc_malloc () from /lib64/libtcmalloc.so.4
> > #1  0x00007fa8b186b489 in malloc_and_trace ()
> > #2  0x00007fa8afbc047f in g_malloc () from /lib64/libglib-2.0.so.0
> > #3  0x00007fa8afbd666e in g_slice_alloc () from /lib64/libglib-2.0.so.0
> > #4  0x00007fa8b17cbffd in virtio_blk_handle_output ()
> > #5  0x00007fa8b197e6b6 in qemu_iohandler_poll ()
> > #6  0x00007fa8b197e296 in main_loop_wait ()
> > #7  0x00007fa8b177da4e in main ()
> 
> Does this part always look the same in your backtraces?
> The segfault in tc_malloc is probably due to a heap corruption, or double free or similar - although it can be a bit tricky to find out what did it, since the corruption might have happened a bit before the place it crashed.
> 
> Some other ideas:
>   1) Was there anything nasty in the /var/log/libvirt/qemu/yourvmname.log ?
>   2) Did you hit any IO errors and need to tell the VM to continue after a problem?
>   3) If this is pretty repeatable, then it would be interesting to try changing to a different
>      disk emulation and see if the problem goes away - e.g. virtio-scsi would be a good one to try.
> 
> Dave
> > 
> > 
> > Thx Christian
> > 
> > -----Original Message-----
> > From: Dan Kenigsberg [mailto:danken@redhat.com]
> > Sent: Friday, 13 November 2015 20:00
> > To: Grundmann, Christian <Christian.Grundmann@fabasoft.com>
> > Cc: 'users@ovirt.org' <users@ovirt.org>
> > Subject: Re: [ovirt-users] Segmentation fault in libtcmalloc
> > 
> > On Fri, Nov 13, 2015 at 07:56:14AM +0000, Grundmann, Christian wrote:
> > > Hi,
> > > > I am using "ovirt-node-iso-3.6-0.999.201510221942.el7.centos.iso"
> > > > (is there something better to use?) for the nodes, and have random
> > > > crashes of VMs. The dumps are always the same:
> > > 
> > > > gdb --batch /usr/libexec/qemu-kvm core.45902.1447199164.dump
> > > > [Thread debugging using libthread_db enabled]
> > > > Using host libthread_db library "/lib64/libthread_db.so.1".
> > > > Core was generated by `/usr/libexec/qemu-kvm -name vmname -S -machine rhel6.5.0,accel=kvm,usb=o'.
> > > > Program terminated with signal 11, Segmentation fault.
> > > > #0  0x00007f0c559c4353 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
> > > 
> > > 
> > > > I didn't have the problem with 3.5 el6 nodes, so I don't know if it's
> > > > CentOS 7 or 3.6.
> > 
> > Due to the low-level nature of the problem, I'd guess it's a qemu or /lib64/libtcmalloc malloc bug, and not directly related to ovirt.
> > 
> > Please report the precise version of qemu,kernel,libvirt and gperftools-libs to qemu-devel mailing list and the complete stack trace and qemu command line, if possible.
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

