From: Jens Axboe <axboe@kernel.dk>
To: Alket Memushaj <amemushaj@gmail.com>
Cc: fio@vger.kernel.org
Subject: Re: dedupe and replay
Date: Sat, 27 Sep 2014 08:02:03 -0600
Message-ID: <5426C35B.2010001@kernel.dk>
In-Reply-To: <5426BF5C.9090301@kernel.dk>
On 2014-09-27 07:45, Jens Axboe wrote:
> On 09/27/2014 05:19 AM, Alket Memushaj wrote:
>> thanks for the quick response - but unfortunately this patch does not
>> seem to have changed the result. I managed to get a 1.4:1 dedupe when
>> setting dedupe_percentage to 99%. The test rig I have is like this: an
>> array w/ inline dedupe @ 4K and I am the only user on it - I have a
>> single machine mounting a single lun from the array (the whole space
>> in the array is assigned to this lun). The synthetic test at
>> dedupe_percentage=80 results in a 4:1 dedupe ratio from the array. The
>> replay at 80% comes out as undedupable (ratio = 1.0:1) even with the
>> latest patch. The replay generates about 1.7 million I/Os. ~1 million
>> are 4K, 80 thousand are 8K, 70 thousand are 32K, etc... there's a
>> decent number of I/O that is not a multiple of 4K, but the vast
>> majority of the I/O is a multiple of 4K. The offset for each I/O is 4K
>> aligned in the vast majority of I/Os.
>
> (please don't top post)
>
> I'll try with a trace replay here. If you can, would be nice to get your
> trace.
So ran a quick test here. The original file:
[dedupe]
filename=test
bssplit=4k/58:8k/5:16k/5:32k/5:64k/5:128k/5
rw=write
size=1g
dedupe_percentage=80
write_iolog=dedupe.log
Ran this and checked how well 'test' dedupes:
axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51955
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80
So far, so good, we get the expected results (1:4 unique to duplicate
extents, or 80%). The dedupe.log looks fine, let's replay that with:
[dedupe]
filename=test
rw=write
read_iolog=dedupe.log
size=1g
dedupe_percentage=80
write_iolog=dedupe.log
Ran this, and then checked how well that dedupes:
axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51953
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80
Pretty much identical. So it seems to work fine for me, generating
appropriately random buffers.
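For anyone wanting to run the same kind of check on their own file, the
measurement above can be approximated with a few lines of Python. This is
only a rough sketch, not how t/dedupe is actually implemented: the function
names are made up for illustration, SHA-256 is an arbitrary choice of hash,
and the 4096-byte block size is inferred from the output above
(1073741824 bytes / 262144 extents).

```python
import hashlib


def dedupe_factor(path, block_size=4096):
    """Return (extents, unique_extents, factor) for fixed-size blocks.

    Hypothetical helper: hashes every block_size chunk of the file and
    counts how many distinct hashes were seen.
    """
    seen = set()
    extents = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            extents += 1
            # Assumption: any collision-resistant hash is good enough here.
            seen.add(hashlib.sha256(block).digest())
    unique = len(seen)
    factor = extents / unique if unique else 0.0
    return extents, unique, factor


def expected_factor(dedupe_percentage):
    """Factor expected if dedupe_percentage of the blocks are duplicates."""
    return 100.0 / (100.0 - dedupe_percentage)


if __name__ == "__main__":
    import sys

    extents, unique, factor = dedupe_factor(sys.argv[1])
    print(f"Extents={extents}, Unique extents={unique}")
    print(f"De-dupe factor: {factor:.2f}")
```

With dedupe_percentage=80, expected_factor() gives 5, which lines up with
the 5.05 reported in both runs above.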
I might need your trace to figure out what's going on for you.
--
Jens Axboe