Re:Re: Re: compression btrfs

Linux Btrfs filesystem development

From: yiletian <lonat_front@163.com>
To: "Josef Bacik" <jbacik@fusionio.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re:Re: Re: compression btrfs
Date: Wed, 27 Mar 2013 02:18:45 +0800 (CST)
Message-ID: <794fecb2.396.13da7ec79cb.Coremail.lonat_front@163.com>
In-Reply-To: <20130326180357.GC28030@localhost.localdomain>

I think the biggest problem is how to reclaim the space when the extent is a compressed one.
In that case, we may need to read and decompress the data in the extent, and then recompress only the still-valid range to generate a new extent.
Would this process be a performance killer?
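To make the cost concrete, here is a minimal sketch of that read/decompress/recompress step. It uses Python's zlib module on an in-memory buffer rather than any btrfs code; the function name rewrite_valid_range and the sample data are hypothetical, chosen only for illustration.

```python
import zlib

KIB = 1024

def rewrite_valid_range(compressed_extent, start, length):
    """Decompress a whole extent and recompress only [start, start + length)."""
    data = zlib.decompress(compressed_extent)   # the entire extent is inflated
    valid = data[start:start + length]          # keep only the still-referenced bytes
    return zlib.compress(valid)                 # payload for the new, smaller extent

# Hypothetical, highly repetitive 256 KiB of file data.
original = bytes((i * 7) % 251 for i in range(256 * KIB))
old_extent = zlib.compress(original)

# Only the first 16 KiB is still referenced after the overwrite.
new_extent = rewrite_valid_range(old_extent, 0, 16 * KIB)
print(len(old_extent), len(new_extent))
```

In this sketch the work is driven by the size of the whole old extent, not by the size of the range that survives, which is the source of the performance concern.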
At 2013-03-27 02:03:57, "Josef Bacik" <jbacik@fusionio.com> wrote:
>On Tue, Mar 26, 2013 at 10:27:34AM -0600, yiletian wrote:
>> Yes, I use compress-force=zlib for my partition.
>> 
>> Consider this scenario.
>> 
>> We first write a file of 256KB. Assume all the data compresses to 128KB;
>> btrfs creates an extent item in the extent tree to record the 128KB disk range (named E),
>> and btrfs also creates a single file extent that records the disk range of E.
>> 
>> Then we overwrite from 16KB to the end of the file, a write of 240KB.
>> Btrfs will create a new file extent for the overwritten range.
>> That is, the file now has two file extents: the first records the first 16KB and the second records the remaining 240KB.
>> 
>> Then we are in a dilemma:
>> 1. The first one occupies only a disk range of 16KB, but the entire E is reserved for it. This is because the __btrfs_drop_extents function does not decrease the number of back refs of E.
>> 2. Because the overwritten range is large enough, compress_file_range does not call btrfs_add_inode_defrag to kick off a defrag of the file automatically.
>> 
>> Given this dilemma, how can btrfs reclaim the (at least) 112KB disk range recorded in E?
>> 
>
>Oh yeah welcome to btrfs, you must be new here ;).  So yeah this is the way it
>works, until we overwrite the entire extent we don't reclaim any of the space.
>This includes the "prealloc an 8 gig vm image and then random write inside of
>it" workload, where you could end up using up to 16GB in the worst case.  What
>we could do to fix this is, instead of splitting the file extents and then
>inc'ing the ref of the original extent, split the extent ref as well, so that
>we can reclaim this space.  It's on my list of things to do down the road, but
>it keeps getting supplanted by other priorities.  Thanks,
>
>Josef
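
The accounting described in the quoted exchange can be illustrated with a toy model. This is only a sketch under simplifying assumptions: Extent, FileExtent and ToyFile are hypothetical names, not btrfs structures, a split is modelled as dropping one reference and adding one per surviving piece, and the compressed sizes of the overwrites (120KB and 8KB) are made up.

```python
from dataclasses import dataclass

KIB = 1024

@dataclass
class Extent:
    disk_bytes: int   # space the extent occupies on disk
    refs: int = 0     # number of file extents pointing at it

@dataclass
class FileExtent:
    start: int        # logical offset in the file
    length: int       # logical length
    extent: Extent

class ToyFile:
    def __init__(self):
        self.file_extents = []
        self.extents = []

    def _ref(self, start, length, extent):
        extent.refs += 1
        return FileExtent(start, length, extent)

    def write(self, start, length, disk_bytes):
        """Overwrite [start, start + length) with a new extent of disk_bytes."""
        end = start + length
        kept = []
        for fe in self.file_extents:
            fe_end = fe.start + fe.length
            if fe_end <= start or fe.start >= end:
                kept.append(fe)            # untouched by this write
                continue
            fe.extent.refs -= 1            # drop the old reference...
            if fe.start < start:           # ...and re-reference surviving pieces
                kept.append(self._ref(fe.start, start - fe.start, fe.extent))
            if fe_end > end:
                kept.append(self._ref(end, fe_end - end, fe.extent))
        new_extent = Extent(disk_bytes)
        self.extents.append(new_extent)
        kept.append(self._ref(start, length, new_extent))
        self.file_extents = sorted(kept, key=lambda fe: fe.start)

    def disk_usage(self):
        # An extent is freed as a whole, and only once nothing references it.
        return sum(e.disk_bytes for e in self.extents if e.refs > 0)

f = ToyFile()
f.write(0, 256 * KIB, 128 * KIB)          # 256KB file stored in 128KB extent E
f.write(16 * KIB, 240 * KIB, 120 * KIB)   # overwrite from 16KB to end of file
after_partial = f.disk_usage()            # E is still fully pinned
f.write(0, 16 * KIB, 8 * KIB)             # overwrite the last piece of E
after_full = f.disk_usage()               # E is finally released
print(after_partial // KIB, after_full // KIB)
```

With these made-up sizes, the model holds 248KB on disk after the partial overwrite, even though only 16KB of E is still visible in the file, and drops to 128KB once the final 16KB is overwritten. Splitting the extent ref itself, as suggested in the quoted reply, would correspond to shrinking disk_bytes at the time of the split instead of waiting for refs to reach zero.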

Thread overview: 4+ messages
2013-03-26  4:03 compression btrfs lonat_front
2013-03-26 13:14 ` Josef Bacik
     [not found]   ` <57473f27.23ac0.13da786afad.Coremail.lonat_front@163.com>
2013-03-26 18:03     ` Josef Bacik
2013-03-26 18:18       ` yiletian [this message]