Date: Thu, 5 Sep 2013 11:24:40 +0200
From: Stefan Hajnoczi
Message-ID: <20130905092440.GB12293@stefanha-thinkpad.redhat.com>
References: <1378215952-7151-1-git-send-email-kwolf@redhat.com> <20130904080352.GA8031@stefanha-thinkpad.redhat.com> <20130904093950.GB3562@dhcp-200-207.str.redhat.com> <20130904095523.GC5054@irqsave.net>
In-Reply-To: <20130904095523.GC5054@irqsave.net>
Subject: Re: [Qemu-devel] [RFC] qcow2 journalling draft
To: Benoît Canet
Cc: Kevin Wolf, jcody@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com

On Wed, Sep 04, 2013 at 11:55:23AM +0200, Benoît Canet wrote:
> > > I'm not sure if multiple journals will work in practice. Doesn't this
> > > re-introduce the need to order update steps and flush between them?
> >
> > This is a question for Benoît, who made this requirement. I asked him
> > the same a while ago and apparently his explanation made some sense to
> > me, or I would have remembered that I don't want it. ;-)
>
> The reason behind the multiple journal requirement is that if a block gets
> created and deleted in a cyclic way, it can generate cyclic insertion/deletion
> journal entries. The journal could easily fill up if this pathological
> corner case happens. When it does, the dedup code repacks the journal by
> writing only the non-redundant information into a new journal and then
> uses the new one. This would not be easy to do if non-dedup journal
> entries were present in the journal, hence the multiple journal
> requirement.
>
> The deduplication also needs two journals because when the first one is
> frozen it takes some time to write the hash table to disk, and new entries
> must be stored somewhere in the meantime. The code cannot block.
>
> > It might have something to do with the fact that deduplication uses the
> > journal more as a kind of cache for hash values that can be dropped and
> > rebuilt after a crash.
>
> For dedupe the journal is more of a "resume after exit" tool.

I'm not sure anymore whether dedupe needs the same kind of "journal" as a
metadata journal for qcow2. Since you have a dirty flag to discard the
"journal" on crash, the journal is not used for data integrity.

That makes me wonder whether the metadata journal is the right structure
for dedupe. Maybe your original proposal was fine for dedupe and we just
misinterpreted it because we thought it needed to be a safe journal.

Stefan
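
P.S.: To make sure we mean the same thing, here are rough sketches of
how I picture the pieces described above. Every type and function name
in these sketches is invented for illustration; this is not the actual
dedup code. First, the non-blocking freeze-and-switch between the two
journals:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Invented types, for illustration only */
typedef struct JournalEntry {
    uint64_t hash;        /* dedup hash of the cluster */
    uint64_t cluster;     /* cluster offset it refers to */
    bool is_delete;       /* insertion vs. deletion entry */
} JournalEntry;

typedef struct DedupJournal {
    JournalEntry entries[1024];
    size_t len;
    bool frozen;
} DedupJournal;

typedef struct DedupState {
    DedupJournal journals[2];
    int active;           /* journal currently taking new entries */
} DedupState;

/* Stub: in real code this would run asynchronously */
static void start_hash_table_writeback(DedupJournal *j)
{
    /* ... flush j->entries into the on-disk hash table ... */
    j->len = 0;
    j->frozen = false;
}

/* Append without blocking: when the active journal fills up, freeze
 * it, kick off the slow hash table writeback, and immediately switch
 * new entries to the other journal. */
static void dedup_journal_append(DedupState *s, JournalEntry e)
{
    DedupJournal *j = &s->journals[s->active];

    if (j->len == sizeof(j->entries) / sizeof(j->entries[0])) {
        j->frozen = true;
        start_hash_table_writeback(j);
        s->active = !s->active;
        j = &s->journals[s->active];
        assert(!j->frozen);   /* previous writeback must be done by now */
    }
    j->entries[j->len++] = e;
}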
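
Second, the repack that collapses cyclic insert/delete entries for the
same block, keeping only the non-redundant information (reusing the
invented types from the first sketch; quadratic for brevity):

/* Copy only the last surviving entry per cluster into a fresh journal.
 * A block that was created and deleted repeatedly collapses into one
 * entry, or none if its final state is "deleted". */
static void dedup_journal_repack(const DedupJournal *old, DedupJournal *fresh)
{
    fresh->len = 0;
    for (size_t i = 0; i < old->len; i++) {
        bool superseded = false;
        for (size_t k = i + 1; k < old->len; k++) {
            if (old->entries[k].cluster == old->entries[i].cluster) {
                superseded = true;   /* a later entry for this cluster wins */
                break;
            }
        }
        if (!superseded && !old->entries[i].is_delete) {
            fresh->entries[fresh->len++] = old->entries[i];
        }
    }
}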
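
And finally the "resume after exit" semantics: since the journal is only
a cache of hash values, an unclean shutdown can simply drop it instead
of replaying it (again purely illustrative):

/* On open: if the dirty flag says we did not shut down cleanly,
 * discard the journals instead of replaying them. The hash table can
 * be rebuilt, so no data integrity depends on these entries. */
static void dedup_open(DedupState *s, bool clean_shutdown)
{
    if (!clean_shutdown) {
        s->journals[0].len = 0;
        s->journals[1].len = 0;
    }
    s->journals[0].frozen = false;
    s->journals[1].frozen = false;
    s->active = 0;
}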