From: Josef Bacik <josef@toxicpanda.com>
To: Timofey Titovets <nefelim4ag@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/3] Btrfs: heuristic add simple sampling logic
Date: Mon, 24 Jul 2017 10:55:26 -0400
Message-ID: <20170724145526.GD9406@destiny>
In-Reply-To: <20170724113708.18088-2-nefelim4ag@gmail.com>

On Mon, Jul 24, 2017 at 02:37:06PM +0300, Timofey Titovets wrote:
> Get small sample from input data
> and calculate byte type count for that sample
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/compression.c | 24 ++++++++++++++++++++++--
>  fs/btrfs/compression.h | 11 +++++++++++
>  2 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 63f54bd2d5bb..1501d4fe90cc 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -1068,15 +1068,35 @@ int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end)
>  	u64 index = start >> PAGE_SHIFT;
>  	u64 end_index = end >> PAGE_SHIFT;
>  	struct page *page;
> -	int ret = 1;
> +	struct heuristic_bucket_item *bucket;
> +	int a, b, ret;
> +	u8 symbol, *input_data;
> +
> +	ret = 1;
> +
> +	bucket = kcalloc(BTRFS_HEURISTIC_BUCKET_SIZE,
> +		sizeof(struct heuristic_bucket_item), GFP_NOFS);
> +
> +	if (!bucket)
> +		goto out;
>  
>  	while (index <= end_index) {
>  		page = find_get_page(inode->i_mapping, index);
> -		kmap(page);
> +		input_data = kmap(page);
> +		a = 0;
> +		while (a < PAGE_SIZE) {
> +			for (b = 0; b < BTRFS_HEURISTIC_READ_SIZE; b++) {
> +				symbol = input_data[a+b];
> +				bucket[symbol].count++;
> +			}
> +			a += BTRFS_HEURISTIC_ITERATOR_OFFSET;
> +		}
>  		kunmap(page);
>  		put_page(page);
>  		index++;
>  	}
>  
> +out:
> +	kfree(bucket);
>  	return ret;
>  }
> diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
> index d1f4eee2d0af..984943e5e1ae 100644
> --- a/fs/btrfs/compression.h
> +++ b/fs/btrfs/compression.h
> @@ -129,6 +129,17 @@ struct btrfs_compress_op {
>  extern const struct btrfs_compress_op btrfs_zlib_compress;
>  extern const struct btrfs_compress_op btrfs_lzo_compress;
>  
> +struct heuristic_bucket_item {
> +       u8  padding;
> +       u8  symbol;
> +       u16 count;
> +};
> +
> +#define BTRFS_HEURISTIC_READ_SIZE 16
> +#define BTRFS_HEURISTIC_READS_PER_PAGE 8*PAGE_SIZE/4096

I hate magic numbers. Why is this 8*PAGE_SIZE/4096?  If you want to check every
512 bytes, why not just set BTRFS_HEURISTIC_ITERATOR_OFFSET to 512?  That makes
it easier to understand what you are trying to accomplish.  Thanks,

Josef
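
To illustrate the suggestion above, here is a minimal userspace sketch of the
sampling loop with the stride written directly as 512 bytes. It is not the
patch's code: the names SAMPLE_READ_SIZE, SAMPLE_STRIDE, BUCKET_SIZE and
sample_bytes() are hypothetical stand-ins, and the bucket size of 256 is an
assumption (one counter per byte value), since the quoted patch does not show
BTRFS_HEURISTIC_BUCKET_SIZE. With a 4096-byte page, a 512-byte stride gives the
same 8 reads per page that 8*PAGE_SIZE/4096 evaluates to. Note also that a
macro body such as 8*PAGE_SIZE/4096 without surrounding parentheses can
misbehave when it is expanded inside a larger expression.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's constants (assumptions, see above). */
#define SAMPLE_READ_SIZE 16   /* bytes counted per sample */
#define SAMPLE_STRIDE    512  /* distance between sample starts, stated directly */
#define BUCKET_SIZE      256  /* assumed: one counter per possible byte value */

/*
 * Count byte values in SAMPLE_READ_SIZE-byte samples taken every
 * SAMPLE_STRIDE bytes of buf. A sample is skipped if it would run past
 * the end of the buffer. Returns the number of bytes sampled.
 */
static size_t sample_bytes(const uint8_t *buf, size_t len,
                           uint32_t bucket[BUCKET_SIZE])
{
    size_t sampled = 0;

    for (size_t off = 0; off + SAMPLE_READ_SIZE <= len; off += SAMPLE_STRIDE) {
        for (size_t i = 0; i < SAMPLE_READ_SIZE; i++) {
            bucket[buf[off + i]]++;
            sampled++;
        }
    }
    return sampled;
}
```

Calling sample_bytes() on one 4096-byte buffer takes 8 samples of 16 bytes
each, so 128 bytes are counted in total.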

Thread overview:
2017-07-24 11:37 [PATCH 0/3] Btrfs: populate heuristic with detection logic Timofey Titovets
2017-07-24 11:37 ` [PATCH 1/3] Btrfs: heuristic add simple sampling logic Timofey Titovets
2017-07-24 14:55   ` Josef Bacik [this message]
2017-07-24 11:37 ` [PATCH 2/3] Btrfs: heuristic add byte set calculation Timofey Titovets
2017-07-24 11:37 ` [PATCH 3/3] Btrfs: heuristic add byte core " Timofey Titovets
2017-07-24 15:02 ` [PATCH 0/3] Btrfs: populate heuristic with detection logic Josef Bacik
