Message-ID: <4B0537EB.4000909@siemens.com>
Date: Thu, 19 Nov 2009 13:19:55 +0100
From: Jan Kiszka
Subject: [Qemu-devel] Endless loop in qcow2_alloc_cluster_offset
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, kvm

Hi,

I just managed to push a qemu-kvm process (git rev. b496fe3431) into an
endless loop in qcow2_alloc_cluster_offset, namely over
QLIST_FOREACH(old_alloc, &s->cluster_allocs, next_in_flight):

(gdb) bt
#0  0x000000000048614b in qcow2_alloc_cluster_offset (bs=0xc4e1d0,
    offset=7417184256, n_start=0, n_end=16, num=0xcb351c, m=0xcb3568)
    at /data/qemu-kvm/block/qcow2-cluster.c:750
#1  0x00000000004828d0 in qcow_aio_write_cb (opaque=0xcb34d0, ret=0)
    at /data/qemu-kvm/block/qcow2.c:587
#2  0x0000000000482a44 in qcow_aio_writev (bs=<value optimized out>,
    sector_num=<value optimized out>, qiov=<value optimized out>,
    nb_sectors=<value optimized out>, cb=<value optimized out>,
    opaque=<value optimized out>) at /data/qemu-kvm/block/qcow2.c:645
#3  0x0000000000470e89 in bdrv_aio_writev (bs=0xc4e1d0, sector_num=2,
    qiov=0x7f48a9010ed0, nb_sectors=16, cb=0x470d20 <bdrv_rw_em_cb>,
    opaque=0x7f48a9010f0c) at /data/qemu-kvm/block.c:1362
#4  0x0000000000472991 in bdrv_write_em (bs=0xc4e1d0, sector_num=14486688,
    buf=0xd67200 "H\a", nb_sectors=16) at /data/qemu-kvm/block.c:1736
#5  0x0000000000435581 in ide_sector_write (s=0xc92650)
    at /data/qemu-kvm/hw/ide/core.c:622
#6  0x0000000000425fc2 in kvm_handle_io (env=<value optimized out>)
    at /data/qemu-kvm/kvm-all.c:553
#7  kvm_run (env=<value optimized out>) at /data/qemu-kvm/qemu-kvm.c:964
#8  0x0000000000426049 in kvm_cpu_exec (env=0x1000)
    at /data/qemu-kvm/qemu-kvm.c:1651
#9  0x000000000042627d in kvm_main_loop_cpu (_env=<value optimized out>)
    at /data/qemu-kvm/qemu-kvm.c:1893
#10 ap_main_loop (_env=<value optimized out>)
    at /data/qemu-kvm/qemu-kvm.c:1943
#11 0x00007f48ae89d070 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f48abf0711d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()

(gdb) print ((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
$5 = (struct QCowL2Meta *) 0xcb3568
(gdb) print *((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
$6 = {offset = 7417176064, n_start = 0, nb_available = 16, nb_clusters = 0,
  depends_on = 0xcb3568, dependent_requests = {lh_first = 0x0},
  next_in_flight = {le_next = 0xcb3568, le_prev = 0xc4ebd8}}

So next == first: the only element's le_next points back at the element
itself, and QLIST_FOREACH can never advance past it. Is something
fiddling with cluster_allocs concurrently, e.g. some signal handler? Or
what else could cause this list corruption? Would it be enough to move
to QLIST_FOREACH_SAFE? (I've appended paraphrased sketches of the
relevant macros and a minimal repro idea at the end of this mail.)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
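
For reference, here is roughly what the two iteration macros from
qemu-queue.h expand to. This is a paraphrased sketch from memory, not
the verbatim header:

    #define QLIST_FOREACH(var, head, field)                         \
        for ((var) = (head)->lh_first;                              \
             (var);                                                 \
             (var) = (var)->field.le_next)

    #define QLIST_FOREACH_SAFE(var, head, field, next_var)          \
        for ((var) = (head)->lh_first;                              \
             (var) && ((next_var) = (var)->field.le_next, 1);       \
             (var) = (next_var))

With le_next pointing back at the element itself, as in the $6 dump
above, plain QLIST_FOREACH reloads the same node on every iteration.
The _SAFE variant only caches le_next before the loop body runs, which
guards against removing the current element, so at first glance a
self-referential le_next would seem to spin it just the same; hence my
question above.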
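
And a minimal stand-alone repro of one way such a self-loop can arise:
inserting the same element into a QLIST head twice. The macros below
are simplified stand-ins for the qemu-queue.h ones, and struct l2meta
is a hypothetical stripped-down QCowL2Meta, so this compiles outside
the QEMU tree:

    #include <stdio.h>

    /* Simplified QLIST macros, same shape as the BSD-style ones in
     * qemu-queue.h (paraphrased, not verbatim). */
    #define QLIST_HEAD(name, type)                                  \
        struct name { struct type *lh_first; }
    #define QLIST_ENTRY(type)                                       \
        struct { struct type *le_next, **le_prev; }
    #define QLIST_INSERT_HEAD(head, elm, field) do {                \
            if (((elm)->field.le_next = (head)->lh_first) != NULL)  \
                (head)->lh_first->field.le_prev =                   \
                    &(elm)->field.le_next;                          \
            (head)->lh_first = (elm);                               \
            (elm)->field.le_prev = &(head)->lh_first;               \
        } while (0)
    #define QLIST_FOREACH(var, head, field)                         \
        for ((var) = (head)->lh_first; (var);                       \
             (var) = (var)->field.le_next)

    /* Hypothetical stand-in for QCowL2Meta; only the link matters. */
    struct l2meta {
        QLIST_ENTRY(l2meta) next_in_flight;
    };

    int main(void)
    {
        QLIST_HEAD(, l2meta) cluster_allocs = { NULL };
        struct l2meta m;
        struct l2meta *it;
        int i = 0;

        /* Inserting the same element twice corrupts the list: the
         * second insert makes m's le_next point back at m itself. */
        QLIST_INSERT_HEAD(&cluster_allocs, &m, next_in_flight);
        QLIST_INSERT_HEAD(&cluster_allocs, &m, next_in_flight);

        /* Matches the $6 dump: le_next == the element itself,
         * le_prev == &head->lh_first. */
        printf("le_next == &m: %d\n", m.next_in_flight.le_next == &m);

        /* Without the cap this loop never terminates, just like
         * qcow2_alloc_cluster_offset above. */
        QLIST_FOREACH(it, &cluster_allocs, next_in_flight) {
            if (++i > 5) {
                printf("still looping after %d iterations\n", i);
                break;
            }
        }
        return 0;
    }

So if anything in the qcow2 write path can end up queuing the same
L2Meta on cluster_allocs twice, that alone would explain the picture
above, without any concurrent fiddling.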