From: Nikolay Borisov <nborisov@suse.com>
To: Hamish Moffatt <hamish-btrfs@moffatt.email>,
Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: new database files not compressed
Date: Mon, 31 Aug 2020 12:25:30 +0300
Message-ID: <1ba6d793-30c5-39fc-3b6f-46fee70e5dd8@suse.com>
In-Reply-To: <baadab71-61a7-704e-86f7-3607895df663@moffatt.email>
On 31.08.20 11:53, Hamish Moffatt wrote:
> On 31/8/20 1:47 pm, Zygo Blaxell wrote:
>> On Sun, Aug 30, 2020 at 07:35:59PM +1000, Hamish Moffatt wrote:
>>> I am trying to store Firebird database files compressed on btrfs.
>>> Although I have mounted the file system with -o compress-force, new
>>> files created by Firebird are not being compressed according to
>>> compsize. If I copy them, or use btrfs filesystem defrag, they
>>> compress well.
>>>
>>> Other files seem to be compressed automatically OK. Why are the Firebird
>>> files different?
>> If it is writing single 4K blocks with fsync() between writes, or writing
>> 4K blocks to discontiguous file offsets, then the extents will be 4K
>> and there can be no compression.
>>
>> Allocation is in 4K blocks (with default mkfs options on popular CPUs).
>> To save any space, compression must reduce the size of an extent by at
>> least 4K. A 4K extent can't be compressed because even a single bit of
>> compressed output would round the extent size back up to 4K, resulting
>> in no size reduction on disk.
>>
>> 8K extents can be compressed if the compression ratio is 50% or higher,
>> 12K extents can be compressed if the ratio is at least 33%, 16K extents
>> can be compressed if the ratio is at least 25%, and so on. Larger writes
>> are better for compression.
>>
>> Defrag and copies are able to compress because they write contiguously up
>> to the maximum compressed extent size of 128K; however, after defrag,
>> small random writes will not release the large contiguous extents
>> and total space usage reported by compsize can reach over 100% of the
>> original uncompressed file size. With nodatacow (and no compression)
>> the disk usage of the database remains stable at 100% of the file size.
>>
>> With defrag and compression the disk usage varies over time, from the
>> best compressed size up to (size_of_compressed_database +
>> uncompressed_file_size).
>> e.g. if you have a 50% compression ratio on a 1MB database then the disk
>> usage varies from 512K immediately after defrag to a maximum of 1502K
>> in the worst case (out of every 32 blocks, 31 are written in separate
>> transactions, which leaves references in the file to all of the
>> compressed extents, and adds 31 uncompressed 4K extents for each
>> compressed extent).
>> This means that if you want to keep a database compressed with a 4K
>> database page size, you have to run defrag frequently.
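[The worst case above can be modelled numerically. A rough sketch under the stated assumptions: eight 128K compressed extents, a 50% ratio, and 31 of every 32 blocks later overwritten; this simple model lands at ~1504K, in the same ballpark as the 1502K quoted:

```python
BLOCK = 4096               # one 4K allocation block
EXTENT = 128 * 1024        # maximum btrfs compressed extent size

file_size = 1024 * 1024    # 1MB database
ratio = 0.50               # assumed 50% compression ratio

extents = file_size // EXTENT      # 8 compressed 128K extents
best = int(file_size * ratio)      # 512K on disk right after defrag

# Worst case: in each extent, 31 of its 32 blocks are overwritten by
# separate uncompressed 4K writes; one surviving reference per extent
# keeps the whole compressed extent pinned on disk.
worst = best + extents * 31 * BLOCK

print(best // 1024, "K ->", worst // 1024, "K")   # 512 K -> 1504 K
```
]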
>>
>> Another way to get compression is to increase the database page size.
>> Sizes up to 128K are useful--128K is the maximum btrfs compressed extent
>> size, and increasing the database page size higher will have no further
>> compression benefit. Most databases I've encountered max out at 64K
>> pages, but even 64K gives some compression.
>
> Understood. Thanks for this explanation.
>
> Perhaps I'm missing something more fundamental, because I don't seem to
> get compression even if I create a file full of zeroes with dd:
> $ sudo mount -O compress-force=zstd /dev/sdb /mnt/test
> $ cd /mnt/test/db
> $ dd if=/dev/zero of=zero bs=16k count=1024
> 1024+0 records in
> 1024+0 records out
> 16777216 bytes (17 MB, 16 MiB) copied, 0.0154404 s, 1.1 GB/s
> $ sudo compsize zero
> Type       Perc     Disk Usage   Uncompressed Referenced
> TOTAL     100%      16M          16M          16M
> none      100%      16M          16M          16M
> $ sudo btrfs fi defrag -czstd zero
> $ sudo compsize zero
> Type       Perc     Disk Usage   Uncompressed Referenced
> TOTAL       3%      512K         16M          16M
> zstd        3%      512K         16M          16M
>
> I tried my Firebird tests with a 16k database page size and didn't
> see any compression there either.
Doing the following test:

root@ubuntu18:~# mount -O compress-force=zstd /dev/vdc /media/scratch/
root@ubuntu18:~# rm -rf /media/scratch/zero
root@ubuntu18:~# dd if=/dev/zero of=/media/scratch/zero bs=16k count=1024
root@ubuntu18:~# sync
root@ubuntu18:~# btrfs inspect-internal dump-tree -t5 /dev/vdc

results in:
	item 6 key (259 EXTENT_DATA 0) itemoff 15816 itemsize 53
		generation 12 type 1 (regular)
		extent data disk byte 315621376 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 3 (zstd)
	item 7 key (259 EXTENT_DATA 131072) itemoff 15763 itemsize 53
		generation 12 type 1 (regular)
		extent data disk byte 315625472 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 3 (zstd)
	item 8 key (259 EXTENT_DATA 262144) itemoff 15710 itemsize 53
		generation 12 type 1 (regular)
		extent data disk byte 315629568 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 3 (zstd)
I.e. a bunch of 128k extents, each of which in fact takes only 4k on
disk. Whereas if I write the same file without the compress-force
mount option I get:
	item 138 key (260 EXTENT_DATA 0) itemoff 8787 itemsize 53
		generation 14 type 1 (regular)
		extent data disk byte 298844160 nr 16777216
		extent data offset 0 nr 16777216 ram 16777216
		extent compression 0 (none)
I.e. a single extent, 16M in size. So instead of using the compsize
utility, can you dump the state of the filesystem with the btrfs
inspect-internal command shown above?
>
> Hamish