public inbox for linux-btrfs@vger.kernel.org
From: Nikolay Borisov <nborisov@suse.com>
To: Hamish Moffatt <hamish-btrfs@moffatt.email>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: new database files not compressed
Date: Wed, 2 Sep 2020 08:57:36 +0300
Message-ID: <03ec55ee-5cf3-54fa-1a81-abc93006ca7b@suse.com>
In-Reply-To: <d0399ea6-f198-b58f-8b34-f8ba95ef400f@moffatt.email>



On 2.09.20 at 3:32, Hamish Moffatt wrote:
> On 1/9/20 6:55 pm, Hamish Moffatt wrote:
>> On 1/9/20 3:15 pm, Nikolay Borisov wrote:
>>>
>>> On 1.09.20 at 2:50, Hamish Moffatt wrote:
>>>> On 31/8/20 10:57 pm, Nikolay Borisov wrote:
>>>>> This means the data being passed to btrfs is not compressible, i.e.
>>>>> after compression the data is not smaller than the original input
>>>>> data.
>>>> It is though - if I copy it, or run defrag, it compresses very well:
>>>>
>>>>
>>> As Zygo explained, with 16k writes you'd need at least 25% compression
>>> in order for btrfs to deem it useful. If Firebird's 16k writes are not
>>> 25% compressible then it won't compress. It also depends on whether it
>>> issues fsync after every write to ensure consistency, which would not
>>> allow more data to accumulate.
> 
> I've been able to reproduce this with a trivial test program which
> mimics the I/O behaviour of Firebird.
> 
> It is calling fallocate() to set up a bunch of blocks and then writing
> them with pwrite(). It seems to be the fallocate() step which is
> preventing compression.
> 
> Here is my trivial test program which just writes zeroes to a file. The
> output file does not get compressed by btrfs.

Ah yes, this makes sense: fallocate creates PREALLOC extents, which are
NOCOW (they are essentially empty, so it makes no sense to CoW them).
Writes into them therefore go through a different path, which doesn't
perform compression.

> 
> 
> #define _GNU_SOURCE
> 
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <string.h>
> 
> #define BLOCK_SIZE 16384
> 
> int main(void)
> {
>     /* Start from a fresh file on every run. */
>     unlink("fill");
>     int fd = open("fill", O_RDWR | O_CREAT | O_EXCL, 0666);
>     if (fd < 0) {
>         perror("open");
>         return 1;
>     }
> 
>     char buf[BLOCK_SIZE];
>     memset(buf, 0, BLOCK_SIZE);
> 
>     for (int count = 0; count < 256; ++count) {
>         /* Preallocate 8 blocks at a time, then fill them one by one. */
>         if (count % 8 == 0)
>             fallocate(fd, 0, count * BLOCK_SIZE, 8 * BLOCK_SIZE);
>         pwrite(fd, buf, BLOCK_SIZE, count * BLOCK_SIZE);
>     }
> 
>     close(fd);
> 
>     return 0;
> }
> 
> 
> Hamish
> 
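
To check that it really is the fallocate() step, the quoted program can be
turned into a side-by-side comparison. What follows is only a rough sketch,
not something from the thread: the helper name write_blocks(), its
use_fallocate parameter and the output file names are made up for
illustration, and a failing fallocate() is treated as non-fatal on the
assumption that some filesystems do not support it.

```c
#define _GNU_SOURCE

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK             16384   /* 16 KiB, same block size as the quoted program */
#define NBLOCKS         256     /* total blocks written */
#define PREALLOC_BLOCKS 8       /* blocks preallocated per fallocate() call */

/*
 * Hypothetical helper (not from the thread): write NBLOCKS zero-filled
 * blocks to `path`. When use_fallocate is non-zero, preallocate
 * PREALLOC_BLOCKS blocks at a time before writing into them, as the quoted
 * program does. Returns 0 on success, -1 on error.
 */
static int write_blocks(const char *path, int use_fallocate)
{
    char buf[BLK];
    int fd;

    unlink(path);
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0666);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    memset(buf, 0, sizeof(buf));

    for (int i = 0; i < NBLOCKS; ++i) {
        off_t off = (off_t)i * BLK;

        if (use_fallocate && i % PREALLOC_BLOCKS == 0) {
            /* Assumption: a failed preallocation is only reported, not fatal. */
            if (fallocate(fd, 0, off, (off_t)PREALLOC_BLOCKS * BLK) != 0)
                perror("fallocate");
        }

        if (pwrite(fd, buf, BLK, off) != BLK) {
            perror("pwrite");
            close(fd);
            return -1;
        }
    }

    return close(fd);
}
```

Calling write_blocks("fill-prealloc", 1) and write_blocks("fill-plain", 0)
on a btrfs mount with compression enabled, then inspecting both files with
a tool such as compsize, should show whether only the file written without
fallocate() ends up compressed.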

Thread overview: 31+ messages
2020-08-30  9:35 new database files not compressed Hamish Moffatt
2020-08-31  2:20 ` Eric Wong
2020-08-31  2:44   ` Hamish Moffatt
2020-08-31  3:15   ` A L
2020-08-31  3:47 ` Zygo Blaxell
2020-08-31  8:53   ` Hamish Moffatt
2020-08-31  9:25     ` Nikolay Borisov
2020-08-31 10:40       ` Hamish Moffatt
2020-08-31 10:47         ` Nikolay Borisov
2020-08-31 12:56           ` Hamish Moffatt
2020-08-31 11:15     ` Roman Mamedov
2020-08-31 12:54       ` Hamish Moffatt
2020-08-31 12:57         ` Nikolay Borisov
2020-08-31 23:50           ` Hamish Moffatt
2020-09-01  5:15             ` Nikolay Borisov
2020-09-01  8:55               ` Hamish Moffatt
2020-09-02  0:32                 ` Hamish Moffatt
2020-09-02  5:57                   ` Nikolay Borisov [this message]
2020-09-02  6:05                     ` Hamish Moffatt
2020-09-02  6:10                       ` Nikolay Borisov
2020-09-02  9:57                     ` A L
2020-09-02 10:09                       ` Nikolay Borisov
2020-09-03 15:04                         ` A L
2020-09-02 16:16                       ` Zygo Blaxell
2020-09-03 12:53                         ` Hamish Moffatt
2020-09-03 19:44                           ` Zygo Blaxell
2020-09-04  8:07                             ` Hamish Moffatt
2020-09-05  4:07                               ` Zygo Blaxell
2020-09-03 15:03                         ` A L
2020-09-03 21:52                           ` Zygo Blaxell
2020-09-01  1:43 ` Chris Murphy
