Date: Wed, 3 Jul 2013 10:01:53 +0200
From: Stefan Hajnoczi
Message-ID: <20130703080153.GC16585@stefanha-thinkpad.muc.redhat.com>
In-Reply-To: <20130702212355.GB4985@irqsave.net>
Subject: Re: [Qemu-devel] [RFC V8 01/24] qcow2: Add journal specification.
To: Benoît Canet
Cc: kwolf@redhat.com, qemu-devel@nongnu.org

On Tue, Jul 02, 2013 at 11:23:56PM +0200, Benoît Canet wrote:
> > > +QCOW2 can use one or more instance of a metadata journal.
> >
> > s/instance/instances/
> >
> > Is there a reason to use multiple journals rather than a single journal
> > for all entry types? The single journal area avoids seeks.
>
> Here are the main reasons for this:
>
> For deduplication, some patterns like cycles of insertion/deletion could
> leave the hash table almost empty while filling the journal.
>
> If the journal is full and the hash table is empty, a packing operation
> is started.
>
> Basically a new journal is created and only the entries present in the
> hash table are reinserted.
>
> This is why I want to keep the deduplication journal apart from the
> regular qcow2 journal: to avoid interference between a pack operation
> and regular qcow2 journal entries.
>
> The other thing is that freezing the log store would need a replay of
> regular qcow2 entries, as it triggers a reset of the journal.
>
> Also, since deduplication will not work on spinning disks, I discarded
> the seek time factor.
>
> Maybe committing the dedup journal in erase-block-sized chunks would be
> a good idea to reduce random writes to the SSD.
>
> The additional reason for having multiple journals is that the SILT
> paper proposes a mode where a prefix of the hash is used to dispatch
> insertions into multiple stores, and that is easier to do with multiple
> journals.

It sounds like the journal is more than just a data integrity mechanism.
It's an integral part of your dedup algorithm and you plan to carefully
manage it while rebuilding some of the other dedup data structures.

Does this mean the journal forms the first-stage data structure for
deduplication?  Dedup records will accumulate in the journal until it
becomes time to convert them in bulk into a more compact representation?

When I read this specification I was thinking of a journal purely for
logging operations.  You could use a commit record to mark previous
records applied.  Upon startup, qcow2 would inspect uncommitted records
and deal with them.
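To make that concrete, here is a rough sketch of what I mean by a commit
record and startup replay.  The structure layout and names below are made
up purely for illustration; they are not the on-disk format proposed in
this series.

/* Illustrative only: a journal where a commit record marks everything
 * before it as applied, and startup replays only the uncommitted tail. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum journal_entry_type {
    JOURNAL_ENTRY_DEDUP  = 1,   /* e.g. a hash -> cluster mapping */
    JOURNAL_ENTRY_COMMIT = 2,   /* all previous entries have been applied */
};

struct journal_entry {
    uint32_t type;              /* one of journal_entry_type */
    uint32_t size;              /* payload size in bytes */
    uint8_t  payload[32];       /* fixed-size payload keeps the sketch simple */
};

struct journal {
    struct journal_entry *entries;
    size_t nb_entries;
};

/* Startup replay: find the last commit record, then reapply only the
 * entries that follow it (the ones not yet known to be applied). */
static void journal_replay(struct journal *j,
                           void (*apply)(const struct journal_entry *))
{
    size_t start = 0;

    for (size_t i = 0; i < j->nb_entries; i++) {
        if (j->entries[i].type == JOURNAL_ENTRY_COMMIT) {
            start = i + 1;      /* everything up to here is already applied */
        }
    }

    for (size_t i = start; i < j->nb_entries; i++) {
        if (j->entries[i].type != JOURNAL_ENTRY_COMMIT) {
            apply(&j->entries[i]);
        }
    }
}

static void apply_dedup_entry(const struct journal_entry *e)
{
    /* A real implementation would reinsert the mapping into the in-memory
     * hash table; here we only report what would be replayed. */
    printf("replaying dedup entry, %u payload bytes\n", (unsigned)e->size);
}

int main(void)
{
    struct journal_entry entries[] = {
        { .type = JOURNAL_ENTRY_DEDUP,  .size = 32 },
        { .type = JOURNAL_ENTRY_COMMIT, .size = 0  },
        { .type = JOURNAL_ENTRY_DEDUP,  .size = 32 },  /* uncommitted */
    };
    struct journal j = { .entries = entries, .nb_entries = 3 };

    journal_replay(&j, apply_dedup_entry);
    return 0;
}

The point is only that a commit record lets startup skip everything that
is already applied and replay just the tail, which seems compatible with
treating the journal as the first-stage dedup store as well.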
We just need to figure out how to define a good interface so that the
journal can be used in a general way but also covers dedup's specific
needs.

Stefan