* [RFC PATCH v3 0/2] Btrfs: add compression heuristic
@ 2017-06-26 19:40 Timofey Titovets
  2017-06-26 19:40 ` [RFC PATCH v3 1/2] Btrfs: add precomputed log2() Timofey Titovets
  2017-06-26 19:40 ` [RFC PATCH v3 2/2] Btrfs: add heuristic method for make decision compress or not compress Timofey Titovets
  0 siblings, 2 replies; 3+ messages in thread
From: Timofey Titovets @ 2017-06-26 19:40 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Timofey Titovets

Today btrfs uses simple logic to decide
whether to compress data or not:
the selected compression algorithm tries to compress
the data and, if this saves some space,
the extent is stored compressed.

This is a reliable way to detect incompressible data,
but it wastes/burns CPU time on bad/incompressible
data and adds latency.

This approach also adds pressure on the
memory subsystem, as for every compressed write
btrfs needs to allocate buffer pages and
reuse a compression workspace.

This is quite efficient, but not free.

So this series tries to create a basic heuristic framework:
the heuristic code analyzes the data on the fly,
before the compression code is called,
detects incompressible data and advises skipping it.
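
In practice only the "don't compress" advice is acted on in
compress_file_range() (patch 2 below); the cost levels are kept for
possible future tuning. Roughly, simplified from the inode.c hunk:

    ret = btrfs_compress_heuristic(inode, start, end);

    /* Heuristic says: don't try to compress this */
    if (ret == COMPRESS_NONE)
        goto cont;

    /* COMPRESS_COST_{EASY,MEDIUM,HARD} all continue to btrfs_compress_pages() */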

I left descriptive comments in the code,
but I will also try to describe the logic briefly.
The heuristic has several internal layers:
1. Get sample data - it is too CPU expensive
   to analyze the whole stream, so take a
   big enough sample from the input data
   Scaling (see the sketch after this list):
   In data: 128K  64K   32K   4K
   Sample:  4096b 3072b 2048b 1024b

2. For performance reasons, and so it can be reused in the
   7th level, copy the selected data into a sample buffer
3. Count the occurrences of every byte value in the sample buffer
4. Count how many different byte values were found
   If there are not many - the data will be easily compressible
5. Count the character core set size, i.e.
   how many characters cover 90% of the input stream
   If the core set is small (1-50 different values) -
   the data is easily compressible
   If it is big (200-256) - the data probably can't be compressed
6. If the above methods fail to make a decision,
   try to compute the Shannon entropy
   If the entropy is small - the data will be easily compressible
   If not - go to the 7th level
7. Entropy can't detect repeated strings of bytes,
   so look at the data to detect repeated byte pairs:
   compute the difference between the measured frequency of byte
   pairs from the coreset and the frequency expected from the
   individual byte frequencies
   If that difference is non-zero and the entropy is not too
   high, give the compression code a try
   If the entropy is high, 7.2/8 - 8/8 (> 90%), and we find a big enough
   difference between the pair and character frequencies,
   give the compression code a try

   The 7th level is needed to decrease false negatives,
   i.e. cases where data can be compressed (like ~131072b -> ~87000b, ratio ~0.66),
   but not easily.
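
For reference, the sample sizes above follow from the constants in
heuristic.c (patch 2). A minimal sketch (READ_SIZE = 16 and the 96K
cut-off are taken from that patch, the helper name is only illustrative):

    static u64 sample_size_for(u64 input_size)
    {
            u64 offset_count;

            if (input_size >= 96 * 1024)
                    offset_count = 256;
            else
                    offset_count = 64 + input_size / 512;

            return offset_count * 16;   /* READ_SIZE bytes per sampled chunk */
    }

With these constants 128K gives 4096 bytes, 64K gives 3072,
32K gives 2048 and 4K gives roughly 1K.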

As far as I can see, this code forbids compression like:
- 131072b -> ~110000b
If the compression ratio is better than that, it allows it.

Shannon entropy uses a log2(a/b) function.
I tried to replace it with int_log2(a) - int_log2(b), but
the integer implementation of log2 shows a lack of accuracy (+-7-10%) in our case.
So I precalculated the input/output values (for fractions 1/131072 - 1/1) and created log2_lshift16();
I have already reduced that function from 1200 lines to 200
to save memory (and lose some accuracy), so with the precomputed function
I get +-0.5-2% accuracy (compared to the normal "true" floating point log2 Shannon entropy).
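
To make the fixed-point convention concrete, here is roughly what
_entropy_perc() in patch 2 does (a sketch; only log2_lshift16() and the
LOG2_* macros come from patch 1, the other names are illustrative):

    /* Entropy of the sampled bytes, as a percentage of the 8-bit maximum */
    static u64 entropy_percent(const u16 *counts, u32 nsyms, u32 sample_size)
    {
            u64 p, sum = 0;
            u32 i;

            for (i = 0; i < nsyms; i++) {
                    if (!counts[i])
                            continue;
                    /* symbol probability scaled by 2^16 (LOG2_ARG_SHIFT) */
                    p = div64_u64((u64)counts[i] * LOG2_ARG_SHIFT, sample_size);
                    /*
                     * log2_lshift16(p) is log2(p / 2^16) scaled by 2^3 and is
                     * negative; -p is also "negative", so in unsigned
                     * arithmetic the product comes out positive
                     */
                    sum += -p * log2_lshift16(p);
            }
            /* drop the 2^16 argument scale, then take percent of 8 bits */
            sum = div64_u64(sum, LOG2_ARG_SHIFT);
            return div64_u64(sum * 100, 8 * LOG2_RET_SHIFT);
    }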

Thanks.

The patches are based on the latest mainline: v4.12-rc7

P.S.
I have only run stability tests so far; everything works stably.
About performance:
a userspace implementation of this algorithm, which
iterates over the data in 128KiB blocks and mmap()s the file,
shows ~4GiB/s on in-memory (cached) data in one stream,
on an i5-4200M with DDR3.

So I expect it not to hurt compression performance.

I've also duplicated the patch set at:
https://github.com/Nefelim4ag/linux

log2_lshift16() was tested with the log2 generator at:
https://github.com/Nefelim4ag/Entropy_Calculation

P.P.S.
Sorry for my bad English and maybe for the ugly code.
I did my best, thanks.


Changes since v1:
  - Fixes for checkpatch.pl warnings/errors
  - Use div64_u64() instead of "/"
  - Make log2_lshift16() more like a binary tree, as suggested by:
    Adam Borowski <kilobyte@angband.pl>

Changes since v2:
  - Fix page read address overflow in heuristic.c
  - Make "bucket" dynamically allocated, for fix warnings about big stack.
  - Small cleanups

Timofey Titovets (2):
  Btrfs: add precomputed log2()
  Btrfs: add heuristic method for make decision compress or not compress

 fs/btrfs/Makefile        |   2 +-
 fs/btrfs/heuristic.c     | 275 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/heuristic.h     |  13 +++
 fs/btrfs/inode.c         |  37 ++++---
 fs/btrfs/log2_lshift16.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/log2_lshift16.h |  11 ++
 6 files changed, 601 insertions(+), 15 deletions(-)
 create mode 100644 fs/btrfs/heuristic.c
 create mode 100644 fs/btrfs/heuristic.h
 create mode 100644 fs/btrfs/log2_lshift16.c
 create mode 100644 fs/btrfs/log2_lshift16.h

--
2.13.1


* [RFC PATCH v3 1/2] Btrfs: add precomputed log2()
  2017-06-26 19:40 [RFC PATCH v3 0/2] Btrfs: add compression heuristic Timofey Titovets
@ 2017-06-26 19:40 ` Timofey Titovets
  2017-06-26 19:40 ` [RFC PATCH v3 2/2] Btrfs: add heuristic method for make decision compress or not compress Timofey Titovets
  1 sibling, 0 replies; 3+ messages in thread
From: Timofey Titovets @ 2017-06-26 19:40 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Timofey Titovets

The heuristic code computes the Shannon entropy in cases where
the other methods can't make a clear decision.
That calculation would need floating point, but
as floating point can't be used in the kernel,
let's just precalculate all our input/output values.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/log2_lshift16.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/log2_lshift16.h |  11 ++
 2 files changed, 289 insertions(+)
 create mode 100644 fs/btrfs/log2_lshift16.c
 create mode 100644 fs/btrfs/log2_lshift16.h

diff --git a/fs/btrfs/log2_lshift16.c b/fs/btrfs/log2_lshift16.c
new file mode 100644
index 000000000000..0d5d414b2adf
--- /dev/null
+++ b/fs/btrfs/log2_lshift16.c
@@ -0,0 +1,278 @@
+#include <linux/types.h>
+#include "log2_lshift16.h"
+
+/*
+ * Precalculated log2 values
+ * Shifting is used to avoid floating point:
+ * the fraction argument must be left shifted by 16,
+ * the returned log value is left shifted by 3
+ */
+int log2_lshift16(u64 lshift16)
+{
+	if (lshift16 < 558) {
+		if (lshift16 < 54) {
+			if (lshift16 < 13) {
+				if (lshift16 < 7) {
+					if (lshift16 < 1)
+						return -136;
+					if (lshift16 < 2)
+						return -123;
+					if (lshift16 < 3)
+						return -117;
+					if (lshift16 < 4)
+						return -113;
+					if (lshift16 < 5)
+						return -110;
+					if (lshift16 < 6)
+						return -108;
+					if (lshift16 < 7)
+						return -106;
+				} else {
+					if (lshift16 < 8)
+						return -104;
+					if (lshift16 < 9)
+						return -103;
+					if (lshift16 < 10)
+						return -102;
+					if (lshift16 < 11)
+						return -100;
+					if (lshift16 < 12)
+						return -99;
+					if (lshift16 < 13)
+						return -98;
+				}
+			} else {
+				if (lshift16 < 29) {
+					if (lshift16 < 15)
+						return -97;
+					if (lshift16 < 16)
+						return -96;
+					if (lshift16 < 17)
+						return -95;
+					if (lshift16 < 19)
+						return -94;
+					if (lshift16 < 21)
+						return -93;
+					if (lshift16 < 23)
+						return -92;
+					if (lshift16 < 25)
+						return -91;
+					if (lshift16 < 27)
+						return -90;
+					if (lshift16 < 29)
+						return -89;
+				} else {
+					if (lshift16 < 32)
+						return -88;
+					if (lshift16 < 35)
+						return -87;
+					if (lshift16 < 38)
+						return -86;
+					if (lshift16 < 41)
+						return -85;
+					if (lshift16 < 45)
+						return -84;
+					if (lshift16 < 49)
+						return -83;
+					if (lshift16 < 54)
+						return -82;
+				}
+			}
+		} else {
+			if (lshift16 < 181) {
+				if (lshift16 < 99) {
+					if (lshift16 < 59)
+						return -81;
+					if (lshift16 < 64)
+						return -80;
+					if (lshift16 < 70)
+						return -79;
+					if (lshift16 < 76)
+						return -78;
+					if (lshift16 < 83)
+						return -77;
+					if (lshift16 < 91)
+						return -76;
+					if (lshift16 < 99)
+						return -75;
+				} else {
+					if (lshift16 < 108)
+						return -74;
+					if (lshift16 < 117)
+						return -73;
+					if (lshift16 < 128)
+						return -72;
+					if (lshift16 < 140)
+						return -71;
+					if (lshift16 < 152)
+						return -70;
+					if (lshift16 < 166)
+						return -69;
+					if (lshift16 < 181)
+						return -68;
+				}
+			} else {
+				if (lshift16 < 304) {
+					if (lshift16 < 197)
+						return -67;
+					if (lshift16 < 215)
+						return -66;
+					if (lshift16 < 235)
+						return -65;
+					if (lshift16 < 256)
+						return -64;
+					if (lshift16 < 279)
+						return -63;
+					if (lshift16 < 304)
+						return -62;
+				} else {
+					if (lshift16 < 332)
+						return -61;
+					if (lshift16 < 362)
+						return -60;
+					if (lshift16 < 395)
+						return -59;
+					if (lshift16 < 431)
+						return -58;
+					if (lshift16 < 470)
+						return -57;
+					if (lshift16 < 512)
+						return -56;
+					if (lshift16 < 558)
+						return -55;
+				}
+			}
+		}
+	} else {
+		if (lshift16 < 6317) {
+			if (lshift16 < 2048) {
+				if (lshift16 < 1117) {
+					if (lshift16 < 609)
+						return -54;
+					if (lshift16 < 664)
+						return -53;
+					if (lshift16 < 724)
+						return -52;
+					if (lshift16 < 790)
+						return -51;
+					if (lshift16 < 861)
+						return -50;
+					if (lshift16 < 939)
+						return -49;
+					if (lshift16 < 1024)
+						return -48;
+					if (lshift16 < 1117)
+						return -47;
+				} else {
+					if (lshift16 < 1218)
+						return -46;
+					if (lshift16 < 1328)
+						return -45;
+					if (lshift16 < 1448)
+						return -44;
+					if (lshift16 < 1579)
+						return -43;
+					if (lshift16 < 1722)
+						return -42;
+					if (lshift16 < 1878)
+						return -41;
+					if (lshift16 < 2048)
+						return -40;
+				}
+			} else {
+				if (lshift16 < 3756) {
+					if (lshift16 < 2233)
+						return -39;
+					if (lshift16 < 2435)
+						return -38;
+					if (lshift16 < 2656)
+						return -37;
+					if (lshift16 < 2896)
+						return -36;
+					if (lshift16 < 3158)
+						return -35;
+					if (lshift16 < 3444)
+						return -34;
+					if (lshift16 < 3756)
+						return -33;
+				} else {
+					if (lshift16 < 4096)
+						return -32;
+					if (lshift16 < 4467)
+						return -31;
+					if (lshift16 < 4871)
+						return -30;
+					if (lshift16 < 5312)
+						return -29;
+					if (lshift16 < 5793)
+						return -28;
+					if (lshift16 < 6317)
+						return -27;
+				}
+			}
+		} else {
+			if (lshift16 < 21247) {
+				if (lshift16 < 11585) {
+					if (lshift16 < 6889)
+						return -26;
+					if (lshift16 < 7512)
+						return -25;
+					if (lshift16 < 8192)
+						return -24;
+					if (lshift16 < 8933)
+						return -23;
+					if (lshift16 < 9742)
+						return -22;
+					if (lshift16 < 10624)
+						return -21;
+					if (lshift16 < 11585)
+						return -20;
+				} else {
+					if (lshift16 < 12634)
+						return -19;
+					if (lshift16 < 13777)
+						return -18;
+					if (lshift16 < 15024)
+						return -17;
+					if (lshift16 < 16384)
+						return -16;
+					if (lshift16 < 17867)
+						return -15;
+					if (lshift16 < 19484)
+						return -14;
+					if (lshift16 < 21247)
+						return -13;
+				}
+			} else {
+				if (lshift16 < 35734) {
+					if (lshift16 < 23170)
+						return -12;
+					if (lshift16 < 25268)
+						return -11;
+					if (lshift16 < 27554)
+						return -10;
+					if (lshift16 < 30048)
+						return -9;
+					if (lshift16 < 32768)
+						return -8;
+					if (lshift16 < 35734)
+						return -7;
+				} else {
+					if (lshift16 < 38968)
+						return -6;
+					if (lshift16 < 42495)
+						return -5;
+					if (lshift16 < 46341)
+						return -4;
+					if (lshift16 < 50535)
+						return -3;
+					if (lshift16 < 55109)
+						return -2;
+					if (lshift16 < 60097)
+						return -1;
+				}
+			}
+		}
+	}
+	return 0;
+}
diff --git a/fs/btrfs/log2_lshift16.h b/fs/btrfs/log2_lshift16.h
new file mode 100644
index 000000000000..fba434e3984a
--- /dev/null
+++ b/fs/btrfs/log2_lshift16.h
@@ -0,0 +1,11 @@
+#include <linux/types.h>
+/*
+ * Precalculated log2 values
+ * Shifting is used to avoid floating point:
+ * the fraction argument must be left shifted by 16,
+ * the returned log value is left shifted by 3
+ */
+
+#define LOG2_ARG_SHIFT (1 << 16)
+#define LOG2_RET_SHIFT (1 << 3)
+int log2_lshift16(u64 lshift16);
--
2.13.1


* [RFC PATCH v3 2/2] Btrfs: add heuristic method for make decision compress or not compress
  2017-06-26 19:40 [RFC PATCH v3 0/2] Btrfs: add compression heuristic Timofey Titovets
  2017-06-26 19:40 ` [RFC PATCH v3 1/2] Btrfs: add precomputed log2() Timofey Titovets
@ 2017-06-26 19:40 ` Timofey Titovets
  1 sibling, 0 replies; 3+ messages in thread
From: Timofey Titovets @ 2017-06-26 19:40 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Timofey Titovets

Add a heuristic computation before compression
to avoid loading the resource-heavy compression workspace
if the data probably can't be compressed.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/Makefile    |   2 +-
 fs/btrfs/heuristic.c | 275 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/heuristic.h |  13 +++
 fs/btrfs/inode.c     |  37 ++++---
 4 files changed, 312 insertions(+), 15 deletions(-)
 create mode 100644 fs/btrfs/heuristic.c
 create mode 100644 fs/btrfs/heuristic.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 128ce17a80b0..8386095c9032 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-	   uuid-tree.o props.o hash.o free-space-tree.o
+	   uuid-tree.o props.o hash.o free-space-tree.o heuristic.o log2_lshift16.o

 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
new file mode 100644
index 000000000000..cac6f0917b59
--- /dev/null
+++ b/fs/btrfs/heuristic.c
@@ -0,0 +1,275 @@
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/fs.h>
+#include <linux/math64.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/sort.h>
+#include <linux/slab.h>
+
+#include "heuristic.h"
+/* Precalculated log2 implementation */
+#include "log2_lshift16.h"
+
+/* For the full integer Shannon entropy calculation */
+#define BUCKET_SIZE (1 << 8)
+
+struct _backet_item {
+	u8  padding;
+	u8  symbol;
+	u16 count;
+};
+
+
+/* For sorting */
+static int compare(const void *lhs, const void *rhs)
+{
+	struct _backet_item *l = (struct _backet_item *)(lhs);
+	struct _backet_item *r = (struct _backet_item *)(rhs);
+
+	return r->count - l->count;
+}
+
+/*
+ * For easily compressible data
+ * the symbol set size over the sample
+ * will be small (<= 64)
+ */
+static u32 _symbset_calc(const struct _backet_item *bucket)
+{
+	u32 a = 0;
+	u32 symbset_size = 0;
+
+	for (; a < BUCKET_SIZE && symbset_size <= 64; a++) {
+		if (bucket[a].count)
+			symbset_size++;
+	}
+	return symbset_size;
+}
+
+
+/*
+ * Try to calculate the coreset size,
+ * i.e. how many symbols cover 90% of the input data:
+ * < 50  - easily compressible data
+ * > 200 - badly compressible data
+ * For a correct & fast calculation the bucket must be reverse sorted
+ */
+static u32 _coreset_calc(const struct _backet_item *bucket,
+	const u32 sum_threshold)
+{
+	u32 a = 0;
+	u32 coreset_sum = 0;
+
+	for (a = 0; a < 201 && bucket[a].count; a++) {
+		coreset_sum += bucket[a].count;
+		if (coreset_sum > sum_threshold)
+			break;
+	}
+	return a;
+}
+
+static u64 _entropy_perc(const struct _backet_item *bucket,
+	const u32 sample_size)
+{
+	u64 a, p;
+	u64 entropy_sum = 0;
+	u64 entropy_max = LOG2_RET_SHIFT*8;
+
+	for (a = 0; a < BUCKET_SIZE && bucket[a].count > 0; a++) {
+		p = bucket[a].count;
+		p = div64_u64(p*LOG2_ARG_SHIFT, sample_size);
+		entropy_sum += -p*log2_lshift16(p);
+	}
+
+	entropy_sum = div64_u64(entropy_sum, LOG2_ARG_SHIFT);
+	return div64_u64(entropy_sum*100, entropy_max);
+}
+
+/* Pair distance from random distribution */
+static u64 _random_pairs_distribution(const struct _backet_item *bucket,
+	const u32 coreset_size, const u8 *sample, u32 sample_size)
+{
+	u32 a, b;
+	u8 pair_a[2], pair_b[2];
+	u32 pairs_count;
+	u64 sum = 0;
+	u64 buf1, buf2;
+
+	for (a = 0; a < coreset_size-1; a++) {
+		pairs_count = 0;
+		pair_a[0] = bucket[a].symbol;
+		pair_a[1] = bucket[a+1].symbol;
+		pair_b[1] = bucket[a].symbol;
+		pair_b[0] = bucket[a+1].symbol;
+		for (b = 0; b < sample_size-1; b++) {
+			/* compare byte values, not pointer addresses */
+			if (sample[b] == pair_a[0] &&
+			    sample[b+1] == pair_a[1])
+				pairs_count++;
+			else if (sample[b] == pair_b[0] &&
+				 sample[b+1] == pair_b[1])
+				pairs_count++;
+		}
+		buf1 = bucket[a].count*bucket[a+1].count;
+		buf1 = div64_u64(buf1*100000, (sample_size*sample_size));
+		buf2 = pairs_count*2*100000;
+		buf2 = div64_u64(buf2, sample_size);
+		sum += (buf1 - buf2)*(buf1 - buf2);
+	}
+
+	return div64_u64(sum, 2048);
+}
+
+/*
+ * Algorithm description
+ * 1. Get a subset of the data for fast computation
+ * 2. Scan the bucket for the symbol set size
+ *    - symbol set < 64 - data will be easily compressible, return
+ * 3. Try to compute the coreset size (count of symbols that cover 90% of input)
+ *    - reverse sort the bucket
+ *    - sum cells until the 90% threshold is reached,
+ *      incrementing the coreset size each time
+ *    - coreset_size < 50  - data will be easily compressible, return
+ *                   > 200 - data will be badly compressible, return
+ *      in general this corresponds to a compression ratio of 0.2 - 0.8
+ * 4. Compute the Shannon entropy
+ *    - the entropy only counts single bytes and can't detect repeated pairs,
+ *      so assume:
+ *        - relying on entropy alone can lead to false negatives,
+ *          so to prevent that (bad) case it's useful to "count" pairs
+ *        - entropy not too high, < 70% - easily compressible, return
+ *        - entropy high, < 90% - try counting pairs; if there is any
+ *          noticeable amount, compression is possible, return
+ *        - entropy very high, > 90% - try counting pairs; if there is a
+ *          noticeable amount, compression is possible, return
+ */
+
+#define READ_SIZE 16
+
+enum compression_advice btrfs_compress_heuristic(struct inode *inode,
+	u64 start, u64 end)
+{
+	enum compression_advice ret = COMPRESS_NONE;
+	u64 input_size = end - start;
+	u64 index = start >> PAGE_SHIFT;
+	u64 end_index = end >> PAGE_SHIFT;
+	struct page *page;
+	u64 a, b, c;
+	u64 offset_count, shift, sample_size;
+	u64 coreset_size, entropy_lvl;
+	u8 *sample;
+	u8 *input_data;
+	struct _backet_item *bucket = NULL;
+
+
+	/*
+	 * In data: 128K  64K   32K   4K
+	 * Sample:  4096b 3072b 2048b 1024b
+	 * Avoid allocating an array bigger than 4KiB
+	 */
+	if (input_size >= 96*1024)
+		offset_count = 256;
+	else
+		offset_count = 64 + input_size/512;
+
+	shift = input_size/offset_count;
+	sample_size = offset_count*READ_SIZE;
+
+	/*
+	 * Copying the data into the sample array gives a ~30% speedup;
+	 * I think it's because of memcpy optimizations and
+	 * CPU cache hits
+	 */
+	sample = kmalloc(sample_size, GFP_NOFS);
+	if (!sample)
+		goto out;
+
+	bucket = kcalloc(BUCKET_SIZE, sizeof(struct _backet_item), GFP_NOFS);
+	if (!bucket)
+		goto out;
+
+	/* Read small subset of data 1024b-4096b */
+	a = 0; b = 0;
+	while (index <= end_index) {
+		page = find_get_page(inode->i_mapping, index);
+		BUG_ON(!page); /* Pages should be in the extent_io_tree */
+		input_data = kmap(page);
+		c = 0;
+		while (c < PAGE_SIZE - READ_SIZE) {
+			if (a >= input_size - READ_SIZE)
+				break;
+			if (b >= sample_size - READ_SIZE)
+				break;
+			memcpy(&sample[b], &input_data[c], READ_SIZE);
+			c += shift;
+			a += shift;
+			b += READ_SIZE;
+		}
+		kunmap(page);
+		put_page(page);
+		index++;
+	}
+
+	for (a = 0; a < sample_size; a++)
+		bucket[sample[a]].count++;
+
+	a = _symbset_calc(bucket);
+	if (a < 64) {
+		ret = COMPRESS_COST_EASY;
+		goto out;
+	}
+
+	/*
+	 * Preset the symbol values:
+	 * for the pair computation in _random_pairs_distribution() the
+	 * code must know which symbol each bucket entry represents
+	 */
+	for (a = 0; a < BUCKET_SIZE; a++)
+		bucket[a].symbol = a;
+
+	/* Sort in reverse order */
+	sort(bucket, BUCKET_SIZE, sizeof(struct _backet_item), &compare, NULL);
+
+	coreset_size = _coreset_calc(bucket, sample_size*90/100);
+
+	if (coreset_size < 50) {
+		ret = COMPRESS_COST_EASY;
+		goto out;
+	}
+
+	if (coreset_size > 200) {
+		ret = COMPRESS_NONE;
+		goto out;
+	}
+
+	/*
+	 * Okay, the fast checks failed to detect the data type,
+	 * let's calculate the entropy
+	 */
+	entropy_lvl = _entropy_perc(bucket, sample_size);
+	if (entropy_lvl < 70) {
+		ret = COMPRESS_COST_MEDIUM;
+		goto out;
+	}
+
+	a = _random_pairs_distribution(bucket, coreset_size, sample, sample_size);
+	if (entropy_lvl < 90) {
+		if (a > 0)
+			ret = COMPRESS_COST_MEDIUM;
+		else
+			ret = COMPRESS_NONE;
+		goto out;
+	} else {
+		if (a > 10)
+			ret = COMPRESS_COST_HARD;
+		else
+			ret = COMPRESS_NONE;
+		goto out;
+	}
+
+out:
+	kfree(sample);
+	kfree(bucket);
+	return ret;
+}
diff --git a/fs/btrfs/heuristic.h b/fs/btrfs/heuristic.h
new file mode 100644
index 000000000000..f978bb3c21b1
--- /dev/null
+++ b/fs/btrfs/heuristic.h
@@ -0,0 +1,13 @@
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/fs.h>
+
+enum compression_advice {
+	COMPRESS_NONE,
+	COMPRESS_COST_EASY,
+	COMPRESS_COST_MEDIUM,
+	COMPRESS_COST_HARD
+};
+
+enum compression_advice btrfs_compress_heuristic(struct inode *inode,
+	u64 start, u64 end);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ef3c98c527c1..d5a13e3e0c6c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -60,6 +60,7 @@
 #include "props.h"
 #include "qgroup.h"
 #include "dedupe.h"
+#include "heuristic.h"

 struct btrfs_iget_args {
 	struct btrfs_key *location;
@@ -458,16 +459,16 @@ static noinline void compress_file_range(struct inode *inode,
 	unsigned long total_compressed = 0;
 	unsigned long total_in = 0;
 	int i;
-	int will_compress;
+	bool will_compress;
 	int compress_type = fs_info->compress_type;
-	int redirty = 0;
+	bool redirty = false;

 	inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
 			SZ_16K);

 	actual_end = min_t(u64, isize, end + 1);
 again:
-	will_compress = 0;
+	will_compress = false;
 	nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
 	BUILD_BUG_ON((BTRFS_MAX_COMPRESSED % PAGE_SIZE) != 0);
 	nr_pages = min_t(unsigned long, nr_pages,
@@ -510,15 +511,6 @@ static noinline void compress_file_range(struct inode *inode,
 	 */
 	if (inode_need_compress(inode)) {
 		WARN_ON(pages);
-		pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
-		if (!pages) {
-			/* just bail out to the uncompressed code */
-			goto cont;
-		}
-
-		if (BTRFS_I(inode)->force_compress)
-			compress_type = BTRFS_I(inode)->force_compress;
-
 		/*
 		 * we need to call clear_page_dirty_for_io on each
 		 * page in the range.  Otherwise applications with the file
@@ -529,7 +521,24 @@ static noinline void compress_file_range(struct inode *inode,
 		 * dirty again later on.
 		 */
 		extent_range_clear_dirty_for_io(inode, start, end);
-		redirty = 1;
+		redirty = true;
+
+		ret = btrfs_compress_heuristic(inode, start, end);
+
+		/* Heuristic says: don't try to compress this */
+		if (!ret)
+			goto cont;
+
+		pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
+		if (!pages) {
+			/* just bail out to the uncompressed code */
+			goto cont;
+		}
+
+		if (BTRFS_I(inode)->force_compress)
+			compress_type = BTRFS_I(inode)->force_compress;
+
+
 		ret = btrfs_compress_pages(compress_type,
 					   inode->i_mapping, start,
 					   pages,
@@ -552,7 +561,7 @@ static noinline void compress_file_range(struct inode *inode,
 				       PAGE_SIZE - offset);
 				kunmap_atomic(kaddr);
 			}
-			will_compress = 1;
+			will_compress = true;
 		}
 	}
 cont:
--
2.13.1
