From: Timofey Titovets <nefelim4ag@gmail.com>
To: linux-btrfs@vger.kernel.org
Cc: Timofey Titovets <nefelim4ag@gmail.com>
Subject: [PATCH v7 6/6] Btrfs: heuristic add byte core set calculation
Date: Fri, 25 Aug 2017 12:18:45 +0300
Message-ID: <20170825091845.4120-7-nefelim4ag@gmail.com>
In-Reply-To: <20170825091845.4120-1-nefelim4ag@gmail.com>
Calculate the byte core set for the data sample:
Sort the bucket counts in decreasing order
Count how many byte values account for 90% of the sample
If the core set is small (<=25% of byte values), the data is easily compressible
If the core set is large (>=~80% of byte values), the data is not compressible
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
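Note (not part of the commit): the steps in the description above can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not the kernel code: core_set_size() and
count_cmp_desc() are hypothetical names, qsort() stands in for the kernel's
sort(), and the comparator compares values instead of subtracting them,
which is a defensive variation rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BUCKET_SIZE        256
#define BYTE_CORE_SET_LOW   64	/* 25% of the 256 byte values */
#define BYTE_CORE_SET_HIGH 200	/* ~78% of the 256 byte values */

/*
 * Order counts from largest to smallest. Comparing rather than
 * subtracting avoids wraparound if counts ever exceed INT_MAX.
 */
static int count_cmp_desc(const void *lv, const void *rv)
{
	uint32_t l = *(const uint32_t *)lv;
	uint32_t r = *(const uint32_t *)rv;

	return (r > l) - (r < l);
}

/*
 * Hypothetical helper: counts[] holds the number of occurrences of each
 * byte value in the sample and sample_size is their sum. The array is
 * sorted in place. Returns a value in [BYTE_CORE_SET_LOW,
 * BYTE_CORE_SET_HIGH], mirroring the early exits in the patch.
 */
static unsigned int core_set_size(uint32_t counts[BUCKET_SIZE],
				  uint32_t sample_size)
{
	uint32_t threshold = (uint32_t)((uint64_t)sample_size * 90 / 100);
	uint32_t sum = 0;
	unsigned int i = 0;

	assert(counts != NULL);
	qsort(counts, BUCKET_SIZE, sizeof(counts[0]), count_cmp_desc);

	/* The LOW most frequent values are always summed first. */
	for (; i < BYTE_CORE_SET_LOW; i++)
		sum += counts[i];
	if (sum > threshold)
		return i;

	/* Keep adding values until 90% is covered or HIGH is reached. */
	for (; i < BYTE_CORE_SET_HIGH && counts[i] > 0; i++) {
		sum += counts[i];
		if (sum > threshold)
			break;
	}
	return i;
}
```

With these thresholds, a result of 64 or less corresponds to the "easily
compressible" case and a result of 200 or more to the "not compressible"
case; anything in between is left undecided by this step.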
fs/btrfs/heuristic.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 50 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
index ef723e991576..df0cefa42857 100644
--- a/fs/btrfs/heuristic.c
+++ b/fs/btrfs/heuristic.c
@@ -18,6 +18,7 @@
#include <linux/pagemap.h>
#include <linux/string.h>
#include <linux/bio.h>
+#include <linux/sort.h>
#include "compression.h"
#define READ_SIZE 16
@@ -25,6 +26,8 @@
#define BUCKET_SIZE 256
#define MAX_SAMPLE_SIZE (BTRFS_MAX_UNCOMPRESSED*READ_SIZE/ITER_SHIFT)
#define BYTE_SET_THRESHOLD 64
+#define BYTE_CORE_SET_LOW BYTE_SET_THRESHOLD
+#define BYTE_CORE_SET_HIGH 200 /* ~80% */
struct bucket_item {
u32 count;
@@ -67,6 +70,45 @@ static struct list_head *heuristic_alloc_workspace(void)
return ERR_PTR(-ENOMEM);
}
+/* For bucket sorting */
+static inline int bucket_compare(const void *lv, const void *rv)
+{
+ struct bucket_item *l = (struct bucket_item *)(lv);
+ struct bucket_item *r = (struct bucket_item *)(rv);
+
+ return r->count - l->count;
+}
+
+/*
+ * Byte core set size:
+ * how many distinct byte values account for 90% of the sample
+ */
+static int byte_core_set_size(struct workspace *ws)
+{
+ u32 a = 0;
+ u32 coreset_sum = 0;
+ u32 core_set_threshold = ws->sample_size*90/100;
+ struct bucket_item *bucket = ws->bucket;
+
+ /* Sort in reverse order */
+ sort(bucket, BUCKET_SIZE, sizeof(*bucket),
+ &bucket_compare, NULL);
+
+ for (; a < BYTE_CORE_SET_LOW; a++)
+ coreset_sum += bucket[a].count;
+
+ if (coreset_sum > core_set_threshold)
+ return a;
+
+ for (; a < BYTE_CORE_SET_HIGH && bucket[a].count > 0; a++) {
+ coreset_sum += bucket[a].count;
+ if (coreset_sum > core_set_threshold)
+ break;
+ }
+
+ return a;
+}
+
static u32 byte_set_size(const struct workspace *ws)
{
u32 a = 0;
@@ -164,7 +206,14 @@ static int heuristic(struct list_head *ws, struct inode *inode,
if (a > BYTE_SET_THRESHOLD)
return 2;
- return 1;
+ a = byte_core_set_size(workspace);
+ if (a <= BYTE_CORE_SET_LOW)
+ return 3;
+
+ if (a >= BYTE_CORE_SET_HIGH)
+ return 0;
+
+ return 4;
}
const struct btrfs_compress_op btrfs_heuristic = {
--
2.14.1
Thread overview (14+ messages):
2017-08-25 9:18 [PATCH v7 0/6] Btrfs: populate heuristic with code Timofey Titovets
2017-08-25 9:18 ` [PATCH v7 1/6] Btrfs: heuristic make use compression workspaces Timofey Titovets
2017-09-27 13:12 ` David Sterba
2017-08-25 9:18 ` [PATCH v7 2/6] Btrfs: heuristic workspace add bucket and sample items Timofey Titovets
2017-09-27 13:22 ` David Sterba
2017-08-25 9:18 ` [PATCH v7 3/6] Btrfs: implement heuristic sampling logic Timofey Titovets
2017-09-27 13:38 ` David Sterba
2017-08-25 9:18 ` [PATCH v7 4/6] Btrfs: heuristic add detection of repeated data patterns Timofey Titovets
2017-09-27 13:47 ` David Sterba
2017-08-25 9:18 ` [PATCH v7 5/6] Btrfs: heuristic add byte set calculation Timofey Titovets
2017-09-27 13:50 ` David Sterba
2017-08-25 9:18 ` Timofey Titovets [this message]
2017-09-27 13:54 ` [PATCH v7 6/6] Btrfs: heuristic add byte core " David Sterba
2017-09-27 13:56 ` David Sterba