Linux Btrfs filesystem development
From: Valerie Aurora Henson <vaurora@redhat.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Ray Van Dolson <rayvd@bludgeon.org>, linux-btrfs@vger.kernel.org
Subject: Re: Data-deduplication?
Date: Fri, 17 Oct 2008 14:24:04 -0400
Message-ID: <20081017182404.GF16946@shell>
In-Reply-To: <1224185449.6938.81.camel@think.oraclecorp.com>

On Thu, Oct 16, 2008 at 03:30:49PM -0400, Chris Mason wrote:
> On Thu, 2008-10-16 at 15:25 -0400, Valerie Aurora Henson wrote:
> > 
> > Both deduplication and compression have an interesting side effect in
> > which a write to a previously "allocated" block can return ENOSPC.
> > This is even more exciting when you factor in mmap.  Any thoughts on
> > how to handle this?
> 
> Unfortunately we'll have a number of places where ENOSPC will jump in
> where people don't expect it, and this includes any COW overwrite of an
> existing extent.  The old extent isn't freed until snapshot deletion
> time, which won't happen until after the current transaction commits.
> 
> Another example is fallocate.  The extent will have a little flag that
> says I'm a preallocated extent, which is how we'll know we're allowed to
> overwrite it directly instead of doing COW.
> 
> But, to write to the fallocated extent, we'll have to clear the flag.
> So, we'll have to cow the block that holds the file extent pointer,
> which means we can enospc.

I'm sure you know this, but for the peanut gallery: You can avoid some
of these purely copy-on-write ENOSPC cases.  Any operation
where the space used afterwards is less than or equal to the space
used before - like in your fallocate case - can avoid ENOSPC as long
as you reserve a certain amount of space on the fs and break down the
changes into small enough groups.  Most file systems don't let you
fill up beyond 90-95% anyway because performance goes to hell.  You
also need to do this so you can delete when your file system is full.
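
To make the reserve idea concrete, here is a rough sketch in Python.  It
is a toy block counter, not Btrfs code: the ToyFs class, its method names,
and the block counts are all made up for illustration, and it assumes the
old blocks of a batch can be released before the next batch starts (i.e.
each batch commits on its own).

```python
import errno
import os


class ToyFs:
    """Toy block accounting (hypothetical, not Btrfs internals)."""

    def __init__(self, total_blocks, reserve_blocks):
        if reserve_blocks <= 0 or reserve_blocks >= total_blocks:
            raise ValueError("reserve must be between 1 and total - 1 blocks")
        self.total = total_blocks
        self.reserve = reserve_blocks
        self.used = 0

    def alloc(self, n, may_use_reserve=False):
        # Ordinary writes stop at total - reserve.  Only operations that do
        # not grow net space use are allowed to borrow from the reserve.
        limit = self.total if may_use_reserve else self.total - self.reserve
        if self.used + n > limit:
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        self.used += n

    def free(self, n):
        assert n <= self.used
        self.used -= n

    def cow_overwrite(self, nblocks):
        """Rewrite nblocks already-allocated blocks copy-on-write.

        Net space use is unchanged, so the work is split into batches no
        larger than the reserve: write the new copies of one batch, then
        release the old copies before starting the next batch.
        Returns the peak number of blocks in use along the way.
        """
        peak = self.used
        done = 0
        while done < nblocks:
            batch = min(self.reserve, nblocks - done)
            self.alloc(batch, may_use_reserve=True)  # new copies
            peak = max(peak, self.used)
            self.free(batch)                         # old copies
            done += batch
        return peak


if __name__ == "__main__":
    fs = ToyFs(total_blocks=100, reserve_blocks=5)
    fs.alloc(95)                   # "full" as far as ordinary writes go
    print(fs.cow_overwrite(40))    # peak usage while rewriting 40 blocks
    print(fs.used)                 # usage afterwards
```

With the file system filled to its ordinary limit, a further plain alloc()
raises ENOSPC, while cow_overwrite() still goes through because no batch
ever needs more than the reserve.  Delete works the same way in this
model: it may dip into the reserve for its metadata updates because it
ends up using less space than it started with.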

In general, it'd be nice to say that if your app can't handle surprise
ENOSPC, then if you run without snapshots, compression, or data dedup,
we guarantee you'll only get ENOSPC in the "normal" cases.  What do
you think?

-VAL

Thread overview: 14+ messages
2008-10-12  2:06 Data-deduplication? Ray Van Dolson
2008-10-13  8:52 ` Data-deduplication? Andi Kleen
2008-10-15 13:39   ` Data-deduplication? Avi Kivity
2008-10-15 14:15     ` Data-deduplication? Andi Kleen
2008-10-15 14:43       ` Data-deduplication? Miguel Sousa Filipe
2008-10-15 15:00         ` Data-deduplication? Andi Kleen
2008-10-15 17:49       ` Data-deduplication? Avi Kivity
2008-10-13 11:02 ` Data-deduplication? Chris Mason
2008-10-16 19:25   ` Data-deduplication? Valerie Aurora Henson
2008-10-16 19:30     ` Data-deduplication? Chris Mason
2008-10-17 18:24       ` Valerie Aurora Henson [this message]
2008-10-20  0:16         ` Data-deduplication? Chris Mason
2008-10-21 20:33           ` Data-deduplication? Valerie Aurora Henson
2008-10-17 20:10     ` Data-deduplication? Christoph Hellwig
