From: Anthony Liguori
Date: Wed, 23 Feb 2011 09:47:50 -0600
To: Avi Kivity
Cc: Kevin Wolf, Chunqiang Tang, qemu-devel@nongnu.org,
 Markus Armbruster, Stefan Hajnoczi
Subject: Re: [Qemu-devel] Re: Strategic decision: COW format
Message-ID: <4D652C26.2010304@codemonkey.ws>
In-Reply-To: <4D652984.90401@redhat.com>
References: <4D5BC467.4070804@redhat.com> <4D5E4271.80501@redhat.com>
 <4D5E8031.5020402@codemonkey.ws> <4D637A20.9020307@redhat.com>
 <4D650F10.3060900@redhat.com> <4D651858.9040106@codemonkey.ws>
 <4D651BD2.3040500@redhat.com> <4D6527F4.2010101@codemonkey.ws>
 <4D652984.90401@redhat.com>
List-Id: qemu-devel.nongnu.org

On 02/23/2011 09:36 AM, Avi Kivity wrote:
> On 02/23/2011 05:29 PM, Anthony Liguori wrote:
>>
>>>> existed, what about snapshots? Are we okay having a feature in a
>>>> prominent format that isn't going to meet users' expectations?
>>>>
>>>> Is there any hope that an image with 1000 or 10000 snapshots is
>>>> going to have even reasonable performance in qcow2?
>>> Is there any hope for backing file chains of 1000 files or more? I
>>> haven't tried it out, but in theory I'd expect that internal snapshots
>>> could cope better with it than external ones because internal snapshots
>>> don't have to go through the whole chain all the time.
>>
>> I don't think there's a user expectation of backing file chains of
>> 1000 files performing well. However, I've talked to a number of
>> customers that have been interested in using internal snapshots for
>> checkpointing, which would involve a large number of snapshots.
>>
>> In fact, Fabrice originally added qcow2 because he was interested in
>> doing reverse debugging. The idea of internal snapshots was to store
>> a high number of checkpoints to allow reverse debugging to be optimized.
>
> I don't see how that works, since the memory image is duplicated for
> each snapshot. So thousands of snapshots = terabytes of storage, and
> hours of creating the snapshots.

Fabrice wanted to use CoW as a mechanism to deduplicate the memory
contents with the on-disk state, specifically to address this problem.
For the longest time, there was a comment in the savevm code along
these lines. It might still be there.

I think the lack of on-disk hashes was a critical missing bit to make
this feature really work well.

> Migrate-to-file with block live migration, or even better, something
> based on Kemari would be a lot faster.
>
>> I think the way snapshot metadata is stored makes this not realistic
>> since it's stored in more or less a linear array. I think to
>> really support a high number of snapshots, you'd want to store a hash
>> with each block that has a refcount > 1. I think you quickly
>> end up reinventing btrfs in the process, though.
>
> Can you elaborate? What's the problem with a linear array of
> snapshots (say up to 10,000 snapshots)?

Lots of things. The array will start to consume quite a bit of
contiguous space as it gets larger, which means it needs to be
relocated. Deleting a snapshot is a far more expensive operation than
it needs to be.
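
To sketch what I mean (simplified C, not the actual qcow2 on-disk
structures; the second struct and all of its field names are purely
hypothetical):

#include <stdint.h>

/* Today (roughly): snapshot metadata lives in one contiguous on-disk
 * table. Growing the table past its current allocation means copying
 * it somewhere else, and deleting a snapshot means touching the
 * refcount of every cluster the snapshot references, so the cost
 * scales with the size of the image rather than with the snapshot. */
struct snapshot_table_entry {
    uint64_t l1_table_offset;   /* snapshot's private copy of the L1 table */
    uint32_t l1_size;
    uint64_t vm_state_size;     /* saved memory image, if any */
    char     id_str[16];
    char     name[256];
};

/* Hypothetical alternative: keep a content hash next to any cluster
 * whose refcount is > 1, so shared data can be found and released
 * without walking the whole snapshot array. Names are made up. */
struct shared_cluster_entry {
    uint64_t cluster_offset;
    uint16_t refcount;
    uint8_t  content_hash[32];  /* e.g. SHA-256 of the cluster data */
};

Something like the second structure is also what you'd need to dedup
the memory image against the on-disk state.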

Regards,

Anthony Liguori