From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53945) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XdEQN-0000KC-7k for qemu-devel@nongnu.org; Sun, 12 Oct 2014 04:22:52 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XdEQE-0003kM-5d for qemu-devel@nongnu.org; Sun, 12 Oct 2014 04:22:43 -0400 Received: from mail-pd0-x22a.google.com ([2607:f8b0:400e:c02::22a]:43000) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XdEQD-0003k6-U5 for qemu-devel@nongnu.org; Sun, 12 Oct 2014 04:22:34 -0400 Received: by mail-pd0-f170.google.com with SMTP id p10so4051021pdj.29 for ; Sun, 12 Oct 2014 01:22:32 -0700 (PDT) Message-ID: <543A3A4E.8020607@gmail.com> Date: Sun, 12 Oct 2014 16:22:38 +0800 From: Zhang Haoyu MIME-Version: 1.0 References: <201410111514227991260@sangfor.com> <20141012073432.GA3739@noname.redhat.com> In-Reply-To: <20141012073432.GA3739@noname.redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] qcow2: fix double-free of Qcow2DiscardRegion in qcow2_process_discards List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf , Zhang Haoyu Cc: qemu-devel , Stefan Hajnoczi , kvm On 2014-10-12 15:34, Kevin Wolf wrote: > Am 11.10.2014 um 09:14 hat Zhang Haoyu geschrieben: >> In qcow2_update_snapshot_refcount -> qcow2_process_discards() -> bdrv_discard() >> may free the Qcow2DiscardRegion which is referenced by "next" pointer in >> qcow2_process_discards() now, in next iteration, d = next, so g_free(d) >> will double-free this Qcow2DiscardRegion. >> >> qcow2_snapshot_delete >> |- qcow2_update_snapshot_refcount >> |-- qcow2_process_discards >> |--- bdrv_discard >> |---- aio_poll >> |----- aio_dispatch >> |------ bdrv_co_io_em_complete >> |------- qemu_coroutine_enter(co->coroutine, NULL); <=== coroutine entry is bdrv_co_do_rw >> |--- g_free(d) <== free first Qcow2DiscardRegion is okay >> |--- d = next; <== this set is done in QTAILQ_FOREACH_SAFE() macro. >> |--- g_free(d); <== double-free will happen if during previous iteration, bdrv_discard had free this object. > Do you have a reproducer for this or did code review lead you to this? This problem can be reproduced with loop of savevm -> delvm -> savem -> delvm ..., about 4 hours. When I delete the vm snapshot, qemu crashed with a core file, I debug the core file and find the double-free and the stack. So I add a breakpoint at g_free(d);, and find that indeed a double-free happened, twice free with the same address. And only the first discard region have not happened with double-free. > > At the moment I can't see how bdrv_discard(bs->file) could ever free a > Qcow2DiscardRegion of bs, as it's working on a completely different > BlockDriverState (which usually won't even be a qcow2 one). I think the "aio_context" in bdrv_discard -> aio_poll(aio_context, true) is the qemu_aio_context, no matter the bs or bs->file passed to bdrv_discard, so aio_poll(aio_context) will poll all of the aio. > >> bdrv_co_do_rw >> |- bdrv_co_do_writev >> |-- bdrv_co_do_pwritev >> |--- bdrv_aligned_pwritev >> |---- qcow2_co_writev >> |----- qcow2_alloc_cluster_link_l2 >> |------ qcow2_free_any_clusters >> |------- qcow2_free_clusters >> |-------- update_refcount >> |--------- qcow2_process_discards >> |---------- g_free(d) <== In next iteration, this Qcow2DiscardRegion will be double-free. > This shouldn't happen in a nested call either, as s->lock can't be taken > recursively. Could you detail how s->lock prevent that, above stack is from the gdb, when I add a breakpoint in g_free(d). Thanks, Zhang Haoyu > > Kevin > >