From: Max Reitz
Date: Thu, 10 Oct 2013 14:54:34 +0200
Subject: Re: [Qemu-devel] [PATCH 1/2] qcow2: Undo leaked allocations in co_writev
To: Kevin Wolf
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi

On 2013-10-10 14:26, Kevin Wolf wrote:
> On 2013-10-10 at 10:52, Max Reitz wrote:
>> If the write request spans more than one L2 table,
>> qcow2_alloc_cluster_offset cannot handle the required allocations
>> atomically. This results in leaks if it allocated new clusters in any
>> but the last L2 table touched and an error occurs in qcow2_co_writev
>> before having established the L2 link. These non-atomic allocations
>> were, however, indeed successful and are therefore given to the caller
>> in the L2Meta list.
>>
>> If an error occurs in qcow2_co_writev and the L2Meta list is unwound,
>> all its remaining entries are clusters whose L2 links were not yet
>> established. Thus, all allocations in that list should be undone.
>>
>> Signed-off-by: Max Reitz
>> ---
>>  block/qcow2.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index b2489fb..6bedd5d 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -1017,6 +1017,13 @@ fail:
>>      while (l2meta != NULL) {
>>          QCowL2Meta *next;
>>
>> +        /* Undo all leaked allocations */
>> +        if (l2meta->nb_clusters != 0) {
>> +            qcow2_free_clusters(bs, l2meta->alloc_offset,
>> +                                l2meta->nb_clusters << s->cluster_bits,
>> +                                QCOW2_DISCARD_ALWAYS);
>> +        }
>> +
>>          if (l2meta->nb_clusters != 0) {
>>              QLIST_REMOVE(l2meta, next_in_flight);
>>          }
> This feels a bit risky.
>
> I think currently it does work, because qcow2_alloc_cluster_link_l2()
> can only return an error when it didn't update the L2 entry in the
> cache yet, but adding any error condition between that point and the
> L2Meta unwinding would result in corruption. I'm unsure, but perhaps a
> cluster leak is the lesser evil. Did you consider this? Do other
> people have an opinion on it?

Which error conditions can occur between qcow2_alloc_cluster_link_l2
and the L2Meta unwinding?
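
To make the control flow explicit, here is an abbreviated sketch of the
two loops with this patch applied (simplified from block/qcow2.c for
illustration only; the coroutine locking, the data write itself and the
dependent-request wakeup are omitted, so don't take it verbatim):

    /* Success path: link each allocation into its L2 table. */
    while (l2meta != NULL) {
        QCowL2Meta *next;

        ret = qcow2_alloc_cluster_link_l2(bs, l2meta);
        if (ret < 0) {
            goto fail;  /* l2meta still heads the not-yet-linked entries */
        }

        /* The L2 link is established, so this request must never be
         * unwound again: take it off the in-flight list right away. */
        if (l2meta->nb_clusters != 0) {
            QLIST_REMOVE(l2meta, next_in_flight);
        }

        next = l2meta->next;
        g_free(l2meta);
        l2meta = next;  /* successfully linked entries are consumed here */
    }

    fail:
    while (l2meta != NULL) {
        QCowL2Meta *next;

        /* Undo all leaked allocations: every entry still in the list
         * has no L2 link yet, so freeing its clusters is safe. */
        if (l2meta->nb_clusters != 0) {
            qcow2_free_clusters(bs, l2meta->alloc_offset,
                                l2meta->nb_clusters << s->cluster_bits,
                                QCOW2_DISCARD_ALWAYS);
            QLIST_REMOVE(l2meta, next_in_flight);
        }

        next = l2meta->next;
        g_free(l2meta);
        l2meta = next;
    }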
If all qcow2_alloc_cluster_link_l2 calls are successful, the list is
empty and the while loop either goes into another iteration or the
function returns successfully, without any further need to unwind the
list. If some call fails, all previous (successful) calls have already
been removed from the list; therefore, the unwinding only affects
L2Meta requests whose call to qcow2_alloc_cluster_link_l2 failed (or
for which that function was never called at all).

If by "currently" you meant that this will turn out bad as soon as a
new error condition is introduced between a successful call to
qcow2_alloc_cluster_link_l2 and the removal of the L2Meta request from
the list: yes, that's true, of course. However, as you've said, there
currently is no such condition, and I don't see why one should be
introduced. The sole purpose of the list seems (to me) to be to execute
qcow2_alloc_cluster_link_l2 on each of its elements. Thus, as soon as
qcow2_alloc_cluster_link_l2 has succeeded, the corresponding request
should be removed from the list.

So, if you agree that it currently works fine, I would not consider
this patch risky. If it is applied and anything in the future
introduces a "goto fail" between qcow2_alloc_cluster_link_l2 and
"l2meta = next", that change would simply have to make sure that
qcow2_free_clusters is not called in this case. In the (probably very
unlikely) case that all my previous assumptions and conclusions are
true, I'd just add a comment to the qcow2_alloc_cluster_link_l2 loop
pointing out this trap ("If you add a goto fail here, make sure to pay
attention" or something along these lines).

> Also, shouldn't it be QCOW2_DISCARD_OTHER?

I'm always unsure about the discard flags. ;-)

I try to follow the rule of "use the specific type (or 'other') for
freeing 'out of the blue', but use 'always' if it's just a very recent
allocation that is being undone again". I'd gladly accept better
recommendations. ;-)

Max