From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Benoît Canet" <benoit.canet@irqsave.net>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC V8 01/24] qcow2: Add journal specification.
Date: Wed, 3 Jul 2013 10:01:53 +0200
Message-ID: <20130703080153.GC16585@stefanha-thinkpad.muc.redhat.com>
In-Reply-To: <20130702212355.GB4985@irqsave.net>

On Tue, Jul 02, 2013 at 11:23:56PM +0200, Benoît Canet wrote:
> > > +QCOW2 can use one or more instance of a metadata journal.
> > 
> > s/instance/instances/
> > 
> > Is there a reason to use multiple journals rather than a single journal
> > for all entry types?  The single journal area avoids seeks.
> 
> Here are the main reasons for this:
> 
> For deduplication, some patterns, such as cycles of insertion/deletion, could
> leave the hash table almost empty while filling the journal.
> 
> If the journal is full and the hash table is empty a packing operation is
> started.
> 
> Basically, a new journal is created and only the entries present in the hash
> table are reinserted.
> 
> This is why I want to keep the deduplication journal apart from the regular
> qcow2 journal: to avoid interference between a pack operation and regular
> qcow2 journal entries.
> 
> The other thing is that freezing the log store would need a replay of regular
> qcow2 entries, as it triggers a reset of the journal.
> 
> Also, since deduplication will not work on spinning disks, I discarded the
> seek time factor.
> 
> Maybe committing the dedupe journal in erase-block-sized chunks would be a
> good idea to reduce random writes to the SSD.
> 
> An additional reason for having multiple journals is that the SILT paper
> proposes a mode where a prefix of the hash is used to dispatch insertions to
> multiple stores, and that is easier to do with multiple journals.
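
For readers unfamiliar with the prefix dispatch mentioned in the quoted
text, here is a rough sketch of the idea, not code from the patch series.
The store count, the use of SHA-256 as a stand-in hash, and the names
NUM_STORES and store_index are all assumptions made for illustration.

```python
import hashlib

# Hypothetical number of stores; assumed to be a power of two so that a
# fixed number of leading hash bits selects a store.
NUM_STORES = 4
PREFIX_BITS = NUM_STORES.bit_length() - 1  # 2 bits for 4 stores


def store_index(digest: bytes) -> int:
    """Select a store from the top PREFIX_BITS bits of the digest."""
    return digest[0] >> (8 - PREFIX_BITS)


# One in-memory list per store stands in for one journal per store.
stores = [[] for _ in range(NUM_STORES)]

clusters = [b"cluster-a", b"cluster-b", b"cluster-c", b"cluster-d"]
for data in clusters:
    # SHA-256 is only a stand-in for whatever hash the dedup code uses.
    digest = hashlib.sha256(data).digest()
    stores[store_index(digest)].append(digest)
```

Each insertion touches exactly one store, so each store can keep its own
journal without coordinating with the others.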

It sounds like the journal is more than just a data integrity mechanism.
It's an integral part of your dedup algorithm and you plan to carefully
manage it while rebuilding some of the other dedup data structures.

Does this mean the journal forms the first-stage data structure for
deduplication?  Will dedup records accumulate in the journal until it is
time to convert them in bulk into a more compact representation?

When I read this specification I was thinking of a journal purely for
logging operations.  You could use a commit record to mark previous
records as applied.  Upon startup, qcow2 would inspect uncommitted
records and deal with them.
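
To make the commit-record idea concrete, here is a minimal sketch of how
startup could find the records that still need handling.  The Record type,
the "op"/"commit" kinds, and the uncommitted() helper are hypothetical and
are not part of the proposed specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    kind: str          # "op" for a logged operation, "commit" for a marker
    payload: str = ""  # free-form description, for illustration only


def uncommitted(journal: List[Record]) -> List[Record]:
    """Return the operations logged after the last commit record."""
    pending: List[Record] = []
    for rec in journal:
        if rec.kind == "commit":
            # Everything logged before this marker has been applied.
            pending.clear()
        else:
            pending.append(rec)
    return pending


journal = [
    Record("op", "allocate cluster 5"),
    Record("op", "update L2 entry"),
    Record("commit"),
    Record("op", "update refcount 9"),
]

# Only the operation after the commit marker is left to deal with.
pending = uncommitted(journal)
```

What "deal with them" means (replay or discard) is a policy decision that
this sketch leaves open.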

We just need to figure out how to define a good interface so that the
journal can be used in a general way but also for dedup's specific
needs.

Stefan
