* [PATCH 0/3] Btrfs: compression fixes
@ 2017-05-19 13:38 Timofey Titovets
2017-05-19 13:38 ` [PATCH 1/3] Btrfs: lzo.c pr_debug() deflate->lzo Timofey Titovets
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Timofey Titovets @ 2017-05-19 13:38 UTC (permalink / raw)
To: linux-btrfs; +Cc: Timofey Titovets
First patch fix copy paste typo in debug message in lzo.c, lzo is not deflate
Second and third patches force btrfs not compress data if compression will not free at least one PAGE_SIZE
Because it's useless in term of storage and read data from disk, as a result productivity suffers
Timofey Titovets (3):
Btrfs: lzo.c pr_debug() deflate->lzo
Btrfs: lzo compression must free at least PAGE_SIZE
Btrfs: zlib compression must free at least PAGE_SIZE
fs/btrfs/lzo.c | 4 ++--
fs/btrfs/zlib.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
--
2.13.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] Btrfs: lzo.c pr_debug() deflate->lzo
2017-05-19 13:38 [PATCH 0/3] Btrfs: compression fixes Timofey Titovets
@ 2017-05-19 13:38 ` Timofey Titovets
2017-05-19 13:38 ` [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE Timofey Titovets
2017-05-19 13:38 ` [PATCH 3/3] Btrfs: zlib " Timofey Titovets
2 siblings, 0 replies; 9+ messages in thread
From: Timofey Titovets @ 2017-05-19 13:38 UTC (permalink / raw)
To: linux-btrfs; +Cc: Timofey Titovets
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
fs/btrfs/lzo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index f48c8c14..bd0b0938 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -141,7 +141,7 @@ static int lzo_compress_pages(struct list_head *ws,
ret = lzo1x_1_compress(data_in, in_len, workspace->cbuf,
&out_len, workspace->mem);
if (ret != LZO_E_OK) {
- pr_debug("BTRFS: deflate in loop returned %d\n",
+ pr_debug("BTRFS: lzo in loop returned %d\n",
ret);
ret = -EIO;
goto out;
--
2.13.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-19 13:38 [PATCH 0/3] Btrfs: compression fixes Timofey Titovets
2017-05-19 13:38 ` [PATCH 1/3] Btrfs: lzo.c pr_debug() deflate->lzo Timofey Titovets
@ 2017-05-19 13:38 ` Timofey Titovets
2017-05-19 14:17 ` Lionel Bouton
2017-05-19 13:38 ` [PATCH 3/3] Btrfs: zlib " Timofey Titovets
2 siblings, 1 reply; 9+ messages in thread
From: Timofey Titovets @ 2017-05-19 13:38 UTC (permalink / raw)
To: linux-btrfs; +Cc: Timofey Titovets
If data compression didn't free at least one PAGE_SIZE, it useless to store that compressed extent
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
fs/btrfs/lzo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index bd0b0938..637ef1b0 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -207,7 +207,7 @@ static int lzo_compress_pages(struct list_head *ws,
}
/* we're making it bigger, give up */
- if (tot_in > 8192 && tot_in < tot_out) {
+ if (tot_in > 8192 && tot_in < tot_out + PAGE_SIZE) {
ret = -E2BIG;
goto out;
}
--
2.13.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-19 13:38 ` [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE Timofey Titovets
@ 2017-05-19 14:17 ` Lionel Bouton
2017-05-19 20:19 ` Lionel Bouton
0 siblings, 1 reply; 9+ messages in thread
From: Lionel Bouton @ 2017-05-19 14:17 UTC (permalink / raw)
To: Timofey Titovets, linux-btrfs
Hi,
Le 19/05/2017 à 15:38, Timofey Titovets a écrit :
> If data compression didn't free at least one PAGE_SIZE, it useless to store that compressed extent
>
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
> fs/btrfs/lzo.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
> index bd0b0938..637ef1b0 100644
> --- a/fs/btrfs/lzo.c
> +++ b/fs/btrfs/lzo.c
> @@ -207,7 +207,7 @@ static int lzo_compress_pages(struct list_head *ws,
> }
>
> /* we're making it bigger, give up */
> - if (tot_in > 8192 && tot_in < tot_out) {
> + if (tot_in > 8192 && tot_in < tot_out + PAGE_SIZE) {
> ret = -E2BIG;
> goto out;
> }
I'm not familiar with this code but I was surprised by the test : you
would expect compression having a benefit when you are freeing an actual
page not reducing data by a page size. So unless I don't understand the
context shouldn't it be something like :
if (tot_in > 8192 && ((tot_in % PAGE_SIZE) <= (tot_out % PAGE_SIZE))
but looking at the code I see that this is in a while loop and there's
another test just after the loop in the existing code :
if (tot_out > tot_in)
goto out;
There's a couple of things I don't understand but isn't this designed to
stream data in small chunks through compression before writing it in the
end ? So isn't this later test the proper location to detect if
compression was beneficial ?
You might not save a page early on in the while loop working on a subset
of the data to compress but after enough data being processed you could
save a page. It seems odd that your modification could abort compression
early on although the same condition would become true after enough loops.
Isn't what you want something like :
if (tot_out % PAGE_SIZE >= tot_in % PAGE_SIZE)
goto out;
after the loop ?
The >= instead of > would avoid decompression in the case where the
compressed data is smaller but uses the same space on disk.
Best regards,
Lionel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-19 14:17 ` Lionel Bouton
@ 2017-05-19 20:19 ` Lionel Bouton
2017-05-19 21:15 ` Timofey Titovets
0 siblings, 1 reply; 9+ messages in thread
From: Lionel Bouton @ 2017-05-19 20:19 UTC (permalink / raw)
To: Timofey Titovets, linux-btrfs
Le 19/05/2017 à 16:17, Lionel Bouton a écrit :
> Hi,
>
> Le 19/05/2017 à 15:38, Timofey Titovets a écrit :
>> If data compression didn't free at least one PAGE_SIZE, it useless to store that compressed extent
>>
>> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
>> ---
>> fs/btrfs/lzo.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
>> index bd0b0938..637ef1b0 100644
>> --- a/fs/btrfs/lzo.c
>> +++ b/fs/btrfs/lzo.c
>> @@ -207,7 +207,7 @@ static int lzo_compress_pages(struct list_head *ws,
>> }
>>
>> /* we're making it bigger, give up */
>> - if (tot_in > 8192 && tot_in < tot_out) {
>> + if (tot_in > 8192 && tot_in < tot_out + PAGE_SIZE) {
>> ret = -E2BIG;
>> goto out;
>> }
> I'm not familiar with this code but I was surprised by the test : you
> would expect compression having a benefit when you are freeing an actual
> page not reducing data by a page size. So unless I don't understand the
> context shouldn't it be something like :
>
> if (tot_in > 8192 && ((tot_in % PAGE_SIZE) <= (tot_out % PAGE_SIZE))
>
> but looking at the code I see that this is in a while loop and there's
> another test just after the loop in the existing code :
>
> if (tot_out > tot_in)
> goto out;
>
> There's a couple of things I don't understand but isn't this designed to
> stream data in small chunks through compression before writing it in the
> end ? So isn't this later test the proper location to detect if
> compression was beneficial ?
>
> You might not save a page early on in the while loop working on a subset
> of the data to compress but after enough data being processed you could
> save a page. It seems odd that your modification could abort compression
> early on although the same condition would become true after enough loops.
>
> Isn't what you want something like :
>
> if (tot_out % PAGE_SIZE >= tot_in % PAGE_SIZE)
> goto out;
>
> after the loop ?
> The >= instead of > would avoid decompression in the case where the
> compressed data is smaller but uses the same space on disk.
I was too focused on other problems and having a fresh look at what I
wrote I'm embarrassed by what I read.
Used pages for a given amount of data should be (amount / PAGE_SIZE) +
((amount % PAGE_SIZE) == 0 ? 0 : 1) this seems enough of a common thing
to compute that the kernel might have a macro defined for this.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-19 20:19 ` Lionel Bouton
@ 2017-05-19 21:15 ` Timofey Titovets
2017-05-20 15:47 ` Lionel Bouton
0 siblings, 1 reply; 9+ messages in thread
From: Timofey Titovets @ 2017-05-19 21:15 UTC (permalink / raw)
To: Lionel Bouton; +Cc: linux-btrfs
2017-05-19 23:19 GMT+03:00 Lionel Bouton <lionel-subscription@bouton.name>:
> I was too focused on other problems and having a fresh look at what I
> wrote I'm embarrassed by what I read.
> Used pages for a given amount of data should be (amount / PAGE_SIZE) +
> ((amount % PAGE_SIZE) == 0 ? 0 : 1) this seems enough of a common thing
> to compute that the kernel might have a macro defined for this.
If i understand the code correctly, the logic of comparing the size of
input/output by bytes is enough (IMHO) for skipping the compression of
non-compressible data, and as btrfs uses PAGE_SIZE as a data cluster
size (and if i understand correctly it's minimum IO size), the logic
of that can be improved in case when compressed data use 126978 <
compressed_size < 131072.
The easiest way to improve that case, i think, is making the
compressed size bigger by PAGE_SIZE.
JFYI:
Once I've read on the list that btrfs don't compress data, if data are
less then PAGE_SIZE because it's useless (i didn't find that in the
code, so i think that i don't fully understand code of compress
routine)
After some time i got a idea that if btrfs determines store data
compressed or not by comparing input/output size of data (i.e. if
compressed size is bigger in compare to input data, don't store
compressed version), this logic can be improved by also checking if
compression profit is more then PAGE_SIZE, because if it's not,
compressed data will pass check and comsume the same amount of space
and as a result will not show any gain.
--
Have a nice day,
Timofey.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-19 21:15 ` Timofey Titovets
@ 2017-05-20 15:47 ` Lionel Bouton
2017-05-20 16:23 ` Timofey Titovets
0 siblings, 1 reply; 9+ messages in thread
From: Lionel Bouton @ 2017-05-20 15:47 UTC (permalink / raw)
To: Timofey Titovets; +Cc: linux-btrfs
Le 19/05/2017 à 23:15, Timofey Titovets a écrit :
> 2017-05-19 23:19 GMT+03:00 Lionel Bouton
> <lionel-subscription@bouton.name>:
>> I was too focused on other problems and having a fresh look at what I
>> wrote I'm embarrassed by what I read. Used pages for a given amount
>> of data should be (amount / PAGE_SIZE) + ((amount % PAGE_SIZE) == 0 ?
>> 0 : 1) this seems enough of a common thing to compute that the kernel
>> might have a macro defined for this.
> If i understand the code correctly, the logic of comparing the size of
> input/output by bytes is enough (IMHO)
As I suspected I missed context : the name of the function makes it
clear it is supposed to work on whole pages so you are right about the
comparison.
What I'm still unsure is if the test is at the right spot. The inner
loop seems to work on at most
in_len = min(len, PAGE_SIZE)
chunks of data,
for example on anything with len >= 4xPAGE_SIZE and PAGE_SIZE=4096 it
seems to me there's a problem.
if·(tot_in·>·8192·&&·tot_in·<·tot_out·+·PAGE_SIZE)
tot_in > 8192 is true starting at the 3rd page being processedin my example
If the 3 first pages don't manage to free one full page (ie the function
only reaches at best a 2/3 compression ratio) the modified second
condition is true and the compression is aborted. This even if
continuing the compression would end up in freeing one page (tot_out is
expected to rise slower than tot_in on compressible data so the
difference could rise and reach a full PAGE_SIZE).
Am I still confused by something ?
Best regards,
Lionel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE
2017-05-20 15:47 ` Lionel Bouton
@ 2017-05-20 16:23 ` Timofey Titovets
0 siblings, 0 replies; 9+ messages in thread
From: Timofey Titovets @ 2017-05-20 16:23 UTC (permalink / raw)
To: Lionel Bouton; +Cc: linux-btrfs
2017-05-20 18:47 GMT+03:00 Lionel Bouton <lionel-subscription@bouton.name>:
> Le 19/05/2017 à 23:15, Timofey Titovets a écrit :
>> 2017-05-19 23:19 GMT+03:00 Lionel Bouton
>> <lionel-subscription@bouton.name>:
>>> I was too focused on other problems and having a fresh look at what I
>>> wrote I'm embarrassed by what I read. Used pages for a given amount
>>> of data should be (amount / PAGE_SIZE) + ((amount % PAGE_SIZE) == 0 ?
>>> 0 : 1) this seems enough of a common thing to compute that the kernel
>>> might have a macro defined for this.
>> If i understand the code correctly, the logic of comparing the size of
>> input/output by bytes is enough (IMHO)
>
> As I suspected I missed context : the name of the function makes it
> clear it is supposed to work on whole pages so you are right about the
> comparison.
>
> What I'm still unsure is if the test is at the right spot. The inner
> loop seems to work on at most
> in_len = min(len, PAGE_SIZE)
> chunks of data,
> for example on anything with len >= 4xPAGE_SIZE and PAGE_SIZE=4096 it
> seems to me there's a problem.
>
> if·(tot_in·>·8192·&&·tot_in·<·tot_out·+·PAGE_SIZE)
>
> tot_in > 8192 is true starting at the 3rd page being processedin my example
>
> If the 3 first pages don't manage to free one full page (ie the function
> only reaches at best a 2/3 compression ratio) the modified second
> condition is true and the compression is aborted. This even if
> continuing the compression would end up in freeing one page (tot_out is
> expected to rise slower than tot_in on compressible data so the
> difference could rise and reach a full PAGE_SIZE).
>
> Am I still confused by something ?
>
> Best regards,
>
> Lionel
You right, size must be checked after all data are already comressed,
so i will fix that and update patch set, thanks
--
Have a nice day,
Timofey.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3] Btrfs: zlib compression must free at least PAGE_SIZE
2017-05-19 13:38 [PATCH 0/3] Btrfs: compression fixes Timofey Titovets
2017-05-19 13:38 ` [PATCH 1/3] Btrfs: lzo.c pr_debug() deflate->lzo Timofey Titovets
2017-05-19 13:38 ` [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE Timofey Titovets
@ 2017-05-19 13:38 ` Timofey Titovets
2 siblings, 0 replies; 9+ messages in thread
From: Timofey Titovets @ 2017-05-19 13:38 UTC (permalink / raw)
To: linux-btrfs; +Cc: Timofey Titovets
If data compression didn't free at least one PAGE_SIZE, it useless to store that compressed extent
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
fs/btrfs/zlib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
index 135b1082..7c3c27e6 100644
--- a/fs/btrfs/zlib.c
+++ b/fs/btrfs/zlib.c
@@ -134,7 +134,7 @@ static int zlib_compress_pages(struct list_head *ws,
/* we're making it bigger, give up */
if (workspace->strm.total_in > 8192 &&
workspace->strm.total_in <
- workspace->strm.total_out) {
+ workspace->strm.total_out + PAGE_SIZE) {
ret = -E2BIG;
goto out;
}
--
2.13.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-05-20 16:24 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19 13:38 [PATCH 0/3] Btrfs: compression fixes Timofey Titovets
2017-05-19 13:38 ` [PATCH 1/3] Btrfs: lzo.c pr_debug() deflate->lzo Timofey Titovets
2017-05-19 13:38 ` [PATCH 2/3] Btrfs: lzo compression must free at least PAGE_SIZE Timofey Titovets
2017-05-19 14:17 ` Lionel Bouton
2017-05-19 20:19 ` Lionel Bouton
2017-05-19 21:15 ` Timofey Titovets
2017-05-20 15:47 ` Lionel Bouton
2017-05-20 16:23 ` Timofey Titovets
2017-05-19 13:38 ` [PATCH 3/3] Btrfs: zlib " Timofey Titovets
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).