From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f68.google.com ([74.125.82.68]:34787 "EHLO
	mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752856AbdHWA1O (ORCPT );
	Tue, 22 Aug 2017 20:27:14 -0400
Received: by mail-wm0-f68.google.com with SMTP id r187so711329wma.1
	for ; Tue, 22 Aug 2017 17:27:13 -0700 (PDT)
From: Timofey Titovets
To: linux-btrfs@vger.kernel.org
Cc: Timofey Titovets
Subject: [PATCH v5 6/6] Btrfs: heuristic add byte core set calculation
Date: Wed, 23 Aug 2017 03:26:50 +0300
Message-Id: <20170823002650.3133-7-nefelim4ag@gmail.com>
In-Reply-To: <20170823002650.3133-1-nefelim4ag@gmail.com>
References: <20170823002650.3133-1-nefelim4ag@gmail.com>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Calculate the byte core set for a data sample:
 Sort the bucket counts in decreasing order
 Count how many byte values account for 90% of the sample

If the core set is small (<=25% of byte values), the data is easily
compressible.
If the core set is large (>=80% of byte values), the data is not
compressible.

Signed-off-by: Timofey Titovets
---
 fs/btrfs/heuristic.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
index 953428fde305..14128f77d5ae 100644
--- a/fs/btrfs/heuristic.c
+++ b/fs/btrfs/heuristic.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include "compression.h"
 
 #define READ_SIZE 16
@@ -32,6 +33,8 @@
 #define MAX_INPUT_PAGES ((BTRFS_MAX_UNCOMPRESSED >> PAGE_SHIFT)+1)
 #define MAX_SAMPLE_SIZE (MAX_INPUT_PAGES*PAGE_SIZE*READ_SIZE/ITER_SHIFT)
 #define BYTE_SET_THRESHOLD 64
+#define BYTE_CORE_SET_LOW BYTE_SET_THRESHOLD
+#define BYTE_CORE_SET_HIGH 200 // ~80%
 
 struct bucket_item {
 	u32 count;
@@ -74,6 +77,45 @@ static struct list_head *heuristic_alloc_workspace(void)
 	return ERR_PTR(-ENOMEM);
 }
 
+/* For bucket sorting */
+static inline int bucket_compare(const void *lv, const void *rv)
+{
+	struct bucket_item *l = (struct bucket_item *)(lv);
+	struct bucket_item *r = (struct bucket_item *)(rv);
+
+	return r->count - l->count;
+}
+
+/*
+ * Byte Core set size
+ * How many bytes use 90% of sample
+ */
+static int byte_core_set_size(struct workspace *workspace)
+{
+	int a = 0;
+	u32 coreset_sum = 0;
+	struct bucket_item *bucket = workspace->bucket;
+	u32 core_set_threshold = workspace->sample_size*90/100;
+
+	/* Sort in reverse order */
+	sort(bucket, BUCKET_SIZE, sizeof(*bucket),
+	     &bucket_compare, NULL);
+
+	for (; a < BYTE_CORE_SET_LOW; a++)
+		coreset_sum += bucket[a].count;
+
+	if (coreset_sum > core_set_threshold)
+		return a;
+
+	for (; a < BYTE_CORE_SET_HIGH && bucket[a].count > 0; a++) {
+		coreset_sum += bucket[a].count;
+		if (coreset_sum > core_set_threshold)
+			break;
+	}
+
+	return a;
+}
+
 static int byte_set_size(const struct workspace *workspace)
 {
 	int a = 0;
@@ -161,7 +203,14 @@ static int heuristic(struct list_head *ws, struct inode *inode,
 	if (a > BYTE_SET_THRESHOLD)
 		return 2;
 
-	return 1;
+	a = byte_core_set_size(workspace);
+	if (a <= BYTE_CORE_SET_LOW)
+		return 3;
+
+	if (a >= BYTE_CORE_SET_HIGH)
+		return 0;
+
+	return 4;
 }
 
 const struct btrfs_compress_op btrfs_heuristic = {
-- 
2.14.1
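
For readers who want to try the core set calculation outside the kernel, here is a minimal userspace sketch. It is not the kernel code. It assumes a plain array of 256 counters instead of struct bucket_item, uses qsort() from the C library in place of the kernel's sort(), and does the threshold arithmetic in 64 bits. The names count_cmp_desc, core_set_size, core_set_of_buffer, CORE_SET_LOW and CORE_SET_HIGH are hypothetical and exist only for illustration. The loop structure follows byte_core_set_size() from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BUCKET_SIZE   256  /* one counter per possible byte value */
#define CORE_SET_LOW   64  /* 25% of 256, stands in for BYTE_CORE_SET_LOW */
#define CORE_SET_HIGH 200  /* about 78% of 256, stands in for BYTE_CORE_SET_HIGH */

/* qsort comparator: descending order, written without unsigned subtraction */
static int count_cmp_desc(const void *lv, const void *rv)
{
	uint32_t l = *(const uint32_t *)lv;
	uint32_t r = *(const uint32_t *)rv;

	return (r > l) - (r < l);
}

/*
 * Index at which the largest counters first cover more than 90% of the
 * sample, capped at CORE_SET_HIGH. Sorts counts[] in place.
 */
static int core_set_size(uint32_t *counts, uint32_t sample_size)
{
	uint64_t threshold = (uint64_t)sample_size * 90 / 100;
	uint64_t sum = 0;
	int a = 0;

	assert(counts != NULL);
	qsort(counts, BUCKET_SIZE, sizeof(counts[0]), count_cmp_desc);

	/* The first CORE_SET_LOW buckets are always summed in one go. */
	for (; a < CORE_SET_LOW; a++)
		sum += counts[a];
	if (sum > threshold)
		return a;

	/* Keep adding buckets until the threshold or the high mark is hit. */
	for (; a < CORE_SET_HIGH && counts[a] > 0; a++) {
		sum += counts[a];
		if (sum > threshold)
			break;
	}
	return a;
}

/* Hypothetical helper: build the histogram from a buffer, then measure it. */
static int core_set_of_buffer(const unsigned char *buf, size_t len)
{
	uint32_t counts[BUCKET_SIZE] = {0};
	size_t i;

	for (i = 0; i < len; i++)
		counts[buf[i]]++;
	return core_set_size(counts, (uint32_t)len);
}
```

With this sketch, a buffer holding a single repeated byte yields 64, which is at the low mark (the "easily compressible" case in the patch). A buffer with all 256 byte values equally frequent yields 200, which is at the high mark (the "not compressible" case).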