Subject: Re: Avoid copying unallocated clusters during full backup
From: John Snow
To: Leo Luan, Eric Blake
Cc: Vladimir Sementsov-Ogievskiy, qemu-devel@nongnu.org, Qemu-block, Max Reitz
Date: Fri, 17 Apr 2020 20:34:43 -0400
Message-ID: <93052f9b-6539-0d4a-c922-fff7618c542d@redhat.com>
References: <7c722a98-29ab-ba65-2f19-088628ce8f00@redhat.com>

On 4/17/20 6:57 PM, Leo Luan wrote:
> On Fri, Apr 17, 2020 at 1:24 PM Eric Blake wrote:
>
> On 4/17/20 3:11 PM, John Snow
> wrote:
>
> >> +
> >> +    if (s->sync_mode == MIRROR_SYNC_MODE_FULL &&
> >> +        s->bcs->target->bs->drv != NULL &&
> >> +        strncmp(s->bcs->target->bs->drv->format_name, "qcow2", 5) == 0 &&
> >> +        s->bcs->source->bs->backing_file[0] == '\0')
> >
> > This isn't going to suffice upstream; the backup job can't be performing
> > format introspection to determine behavior on the fly.
>
> Agreed.  The idea is right (we NEED to make backup operations smarter
> based on knowledge about both source and destination block status), but
> the implementation is not (a check for strncmp("qcow2") is not ideal).
>
>
> I see/agree that using strncmp("qcow2") is not general enough for the
> upstream.  Would changing it to bdrv_unallocated_blocks_are_zero() suffice?
>

I don't know, to be really honest with you. Vladimir reworked the backup
code recently and Virtuozzo et al have shown a very aggressive interest
in optimizing the backup loop. I haven't really worked on that code
since their rewrite.

Dropping unallocated regions from the backup manifest is one strategy,
but I think there will be cases where we won't be able to treat it like
"TOP", but may still have unallocated regions we don't want to copy (for
example, we have a backing file which is itself unallocated).

I'm interested in a more general purpose mechanism for efficient copying.
I think that, instead of the backup job itself doing this in backup.c by
populating the copy manifest, it's also appropriate to try to copy every
last block and have the backup loop implementation decide it doesn't
actually need to copy that block.

That way, the copy optimizations can be shared by any implementation
that needs to do efficient copying, and we can avoid special format and
graph-inspection code in the backup job main interface code.
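To make the comparison concrete, here is a minimal toy sketch in C of the two places the skip decision could live: in a manifest the job fills up front, or inside the copy call itself. None of this is QEMU code. ToyImage, can_skip(), backup_with_manifest(), smart_copy() and backup_with_smart_copy() are all hypothetical names, and the model assumes an unallocated source cluster reads as zeroes (no backing file) and that the caller somehow knows whether the target already reads as zeroes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_CLUSTERS 8

/* Toy stand-in for a source image: true means the cluster is allocated.
 * Assumption: unallocated clusters read as zeroes (no backing file). */
typedef struct ToyImage {
    bool allocated[NB_CLUSTERS];
} ToyImage;

/* A cluster may be skipped only if the source has nothing there AND the
 * target already reads as zeroes, so no explicit zeroes are needed. */
bool can_skip(const ToyImage *src, size_t cluster, bool target_reads_zero)
{
    assert(cluster < NB_CLUSTERS);
    return !src->allocated[cluster] && target_reads_zero;
}

/* Strategy 1: the job pre-scans the image, fills a copy manifest, and then
 * copies exactly what the manifest lists. Returns the clusters copied. */
size_t backup_with_manifest(const ToyImage *src, bool target_reads_zero)
{
    bool manifest[NB_CLUSTERS];
    size_t copied = 0;

    for (size_t i = 0; i < NB_CLUSTERS; i++) {
        manifest[i] = !can_skip(src, i, target_reads_zero);
    }
    for (size_t i = 0; i < NB_CLUSTERS; i++) {
        if (manifest[i]) {
            copied++; /* a real job would read and write the cluster here */
        }
    }
    return copied;
}

/* Strategy 2: a hypothetical smart_copy() that degrades into a no-op when
 * the cluster can be skipped. Returns true if data would be written. */
bool smart_copy(const ToyImage *src, size_t cluster, bool target_reads_zero)
{
    if (can_skip(src, cluster, target_reads_zero)) {
        return false; /* nothing to do for this cluster */
    }
    return true; /* a real implementation would read and write here */
}

/* The job asks for every cluster; the copy layer decides what to skip. */
size_t backup_with_smart_copy(const ToyImage *src, bool target_reads_zero)
{
    size_t copied = 0;

    for (size_t i = 0; i < NB_CLUSTERS; i++) {
        if (smart_copy(src, i, target_reads_zero)) {
            copied++;
        }
    }
    return copied;
}
```

Both loops visit every cluster and end up copying the same set; the only difference in this sketch is that in the second variant the skip logic sits behind the copy call, where other callers could reuse it.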
To be clear, I see these as identical amounts of work:

- backup job runs a loop to inspect every cluster to see if it is
  allocated or not, and modifies its cluster backup manifest accordingly

- backup job loops through the entire block and calls a smart_copy()
  function that might degrade into a no-op if the right conditions are
  met (source is unallocated, explicit zeroes are not needed on the
  destination)

Either way, you're looping and interrogating the disk, but in one case
the efficiencies go deeper than *just* the backup code.

I think Vladimir has put a lot of work into making the backup code
highly optimized, so I would consult with him to find out where the best
place to put new optimizations is, if any -- he'll know!

--js

>
> >
> > I think what you're really after is something like
> > bdrv_unallocated_blocks_are_zero().
>
> The fact that qemu-img already has a lot of optimizations makes me
> wonder what we can salvage from there into reusable code that both
> qemu-img and block backup can share, so that we're not reimplementing
> block status handling in multiple places.
>
>
> A general fix reusing some existing code would be great.  When will it
> appear in the upstream?  We are hoping to avoid needing to use a private
> branch if possible.
>
>
> > So the basic premise is that if you are copying a qcow2 file and the
> > unallocated portions as defined by the qcow2 metadata are zero, it's
> > safe to skip those, so you can treat it like SYNC_MODE_TOP.
> >
> > I think you *also* have to know if the *source* needs those regions
> > explicitly zeroed, and it's not always safe to just skip them at the
> > manifest level.
> >
> > I thought there was code that handled this to some extent already, but I
> > don't know.
> > I think Vladimir has worked on it recently and can probably
> > let you know where I am mistaken :)
>
> Yes, I'm hoping Vladimir (or his other buddies at Virtuozzo) can chime
> in.  Meanwhile, I'm working on v2 of some patches that will improve
> qemu's ability to tell if a destination qcow2 file already reads as all
> zeroes, and we already have bdrv_block_status() for telling which
> portions of a source image already read as all zeroes (whether or not it
> is due to not being allocated, the goal here is that we should NOT have
> to copy anything that reads as zero on the source over to the
> destination if the destination already starts life as reading all zero).
>
>
> Can the eventual/optimal solution allow unallocated clusters to be
> skipped entirely in the backup loop and make the detection of allocated
> zeroes an option, not forcing the backup thread to loop through a
> potentially huge empty virtual disk?
>

I mean, using the TOP code is doing the same thing, really: it's looking
at allocation status and marking those blocks as "already copied", more
or less.

>
> And if nothing else, qemu 5.0 just added 'qemu-img convert
> --target-is-zero' as a last-ditch means of telling qemu to assume the
> destination reads as all zeroes, even if it cannot quickly prove it; we
> probably want to add a similar knob into the QMP commands for initiating
> block backup, for the same reasons.
>
>
> This seems a good way of assuring the status of the target file.
>
> Thanks!
>