Flexible I/O Tester development
From: Jens Axboe <axboe@kernel.dk>
To: Alket Memushaj <amemushaj@gmail.com>
Cc: fio@vger.kernel.org
Subject: Re: dedupe and replay
Date: Sat, 27 Sep 2014 08:02:03 -0600
Message-ID: <5426C35B.2010001@kernel.dk>
In-Reply-To: <5426BF5C.9090301@kernel.dk>

On 2014-09-27 07:45, Jens Axboe wrote:
> On 09/27/2014 05:19 AM, Alket Memushaj wrote:
>> thanks for the quick response - but unfortunately this patch does not
>> seem to have changed the result. I managed to get a 1.4:1 dedupe when
>> setting dedupe_percentage to 99%. The test rig I have is like this: an
>> array w/ inline dedupe @ 4K and I am the only user on it - I have a
>> single machine mounting a single lun from the array (the whole space
>> in the array is assigned to this lun). The synthetic test at
>> dedupe_percentage=80 results in a 4:1 dedupe ratio from the array. The
>> replay at 80% comes out as undedupable (ratio = 1.0:1) even with the
>> latest patch. The replay generates about 1.7 million I/Os. ~1 million
>> are 4K, 80 thousand are 8K, 70 thousand are 32K, etc... there's a
>> decent number of I/O that is not a multiple of 4K, but the vast
>> majority of the I/O is a multiple of 4K. The offset for each I/O is 4K
>> aligned in the vast majority of I/Os.
>
> (please don't top post)
>
> I'll try with a trace replay here. If you can, would be nice to get your
> trace.

So I ran a quick test here. The original job file:

[dedupe]
filename=test
bssplit=4k/58:8k/5:16k/5:32k/5:64k/5:128k/5
rw=write
size=1g
dedupe_percentage=80
write_iolog=dedupe.log

Ran this and checked how well 'test' dedupes:

axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51955
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80
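
As a sanity check on those numbers, here is a minimal sketch of the arithmetic, assuming the factor is simply total extents divided by unique extents and the percentage is the share of extents that are duplicates. The function name dedupe_stats is made up for illustration and is not part of fio:

```python
def dedupe_stats(extents, unique):
    """Return (dedupe factor, dedupe percentage) for a set of extents.

    Assumption: factor = total / unique, and the percentage is the
    fraction of extents that duplicate an earlier one, times 100.
    """
    factor = extents / unique
    percentage = 100.0 * (1.0 - unique / extents)
    return factor, percentage


# Numbers taken from the t/dedupe output above.
factor, pct = dedupe_stats(262144, 51955)
print(f"factor={factor:.2f} percentage={pct:.0f}")
```

With 262144 extents over a 1073741824-byte file, each extent is 4096 bytes, which matches the 4K dedupe granularity discussed earlier in the thread.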

So far, so good: we get the expected results (1:4 unique to duplicate, or
80%). The dedupe.log looks fine, so let's replay that with:

[dedupe]
filename=test
rw=write
read_iolog=dedupe.log
size=1g
dedupe_percentage=80
write_iolog=dedupe.log

Ran this, and then checked how well that dedupes:

axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51953
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80

Pretty much identical. So it seems to work fine for me, generating 
appropriately random buffers.
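
For anyone who wants to cross-check a file without t/dedupe, the same kind of count can be approximated with a short script. This is only a rough sketch under stated assumptions: it hashes fixed 4096-byte blocks with SHA-256 and counts distinct digests, which is not necessarily how t/dedupe does it internally. The names count_extents and make_sample are hypothetical helpers:

```python
import hashlib
import os
import tempfile


def count_extents(path, block_size=4096):
    """Return (total blocks, unique blocks) for the file at path.

    Assumption: fixed-size blocks, with SHA-256 as the fingerprint.
    A short trailing block is counted as its own extent.
    """
    seen = set()
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


def make_sample(blocks):
    """Write the given byte blocks to a temp file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for b in blocks:
            f.write(b)
    return path


# Small demo: four 4K blocks, only two of them distinct.
sample = make_sample([b"a" * 4096, b"a" * 4096, b"b" * 4096, b"a" * 4096])
total, unique = count_extents(sample)
print(f"Extents={total}, Unique extents={unique}")
os.remove(sample)
```

Running count_extents on the 'test' file before and after the replay should give comparable unique counts if the replay generates buffers the same way.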

I might need your trace to figure out what's going on for you.

-- 
Jens Axboe



Thread overview: 6+ messages
2014-09-26 18:30 dedupe and replay Alket Memushaj
2014-09-26 19:08 ` Jens Axboe
2014-09-26 19:38   ` Jens Axboe
2014-09-27 11:19     ` Alket Memushaj
2014-09-27 13:45       ` Jens Axboe
2014-09-27 14:02         ` Jens Axboe [this message]
