From: Jens Axboe <axboe@kernel.dk>
To: Alket Memushaj <amemushaj@gmail.com>
Cc: fio@vger.kernel.org
Subject: Re: dedupe and replay
Date: Sat, 27 Sep 2014 08:02:03 -0600
Message-ID: <5426C35B.2010001@kernel.dk>
In-Reply-To: <5426BF5C.9090301@kernel.dk>
On 2014-09-27 07:45, Jens Axboe wrote:
> On 09/27/2014 05:19 AM, Alket Memushaj wrote:
>> thanks for the quick response - but unfortunately this patch does not
>> seem to have changed the result. I managed to get a 1.4:1 dedupe when
>> setting dedupe_percentage to 99%. The test rig I have is like this: an
>> array w/ inline dedupe @ 4K and I am the only user on it - I have a
>> single machine mounting a single lun from the array (the whole space
>> in the array is assigned to this lun). The synthetic test at
>> dedupe_percentage=80 results in a 4:1 dedupe ratio from the array. The
>> replay at 80% comes out as undedupable (ratio = 1.0:1) even with the
>> latest patch. The replay generates about 1.7 million I/Os. ~1 million
>> are 4K, 80 thousand are 8K, 70 thousand are 32K, etc... there's a
>> decent number of I/O that is not a multiple of 4K, but the vast
>> majority of the I/O is a multiple of 4K. The offset for each I/O is 4K
>> aligned in the vast majority of I/Os.
>
> (please don't top post)
>
> I'll try with a trace replay here. If you can, would be nice to get your
> trace.
So I ran a quick test here. The original job file:
[dedupe]
filename=test
bssplit=4k/58:8k/5:16k/5:32k/5:64k/5:128k/5
rw=write
size=1g
dedupe_percentage=80
write_iolog=dedupe.log
Ran this and checked how well 'test' dedupes:
axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51955
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80
So far, so good: we get the expected result (1 unique extent per 4
duplicates, or 80%). The dedupe.log looks fine, so let's replay that with:
[dedupe]
filename=test
rw=write
read_iolog=dedupe.log
size=1g
dedupe_percentage=80
write_iolog=dedupe.log
Ran this, and then checked how well that dedupes:
axboe@nelson:/home/axboe/git/fio $ t/dedupe test
Will check <test>, size <1073741824>, using 8 threads
Threads(8): 262144 items processed
Extents=262144, Unique extents=51953
De-dupe factor: 5.05
Fio setting: dedupe_percentage=80
Pretty much identical. So it seems to work fine for me, generating
appropriately random buffers.
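For anyone who wants to sanity-check these numbers without t/dedupe, here
is a rough Python sketch of the idea: split the data into fixed-size
blocks, hash each block, and compare the total count with the number of
distinct hashes. This is not fio's implementation. The names dedupe_stats
and stats_from_counts are made up for illustration, and the 4K block size
and SHA-256 hash are assumptions on my part.

```python
import hashlib


def dedupe_stats(data: bytes, block_size: int = 4096):
    """Return (extents, unique, factor, dedupe_pct) for fixed-size blocks.

    Illustrative only: block size and hash function are assumptions,
    not a description of what t/dedupe does internally.
    """
    seen = set()
    extents = 0
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        # Identical blocks produce identical digests, so the set size
        # is the number of unique extents.
        seen.add(hashlib.sha256(block).digest())
        extents += 1
    unique = len(seen)
    factor, pct = stats_from_counts(extents, unique) if unique else (0.0, 0.0)
    return extents, unique, factor, pct


def stats_from_counts(extents: int, unique: int):
    """Turn extent counts into (dedupe factor, percent of duplicate extents)."""
    factor = extents / unique
    pct = 100.0 * (1.0 - unique / extents)
    return factor, pct


if __name__ == "__main__":
    # Counts reported for the first run above.
    factor, pct = stats_from_counts(262144, 51955)
    print(f"De-dupe factor: {factor:.2f}, duplicate extents: {pct:.0f}%")
```

Fed the counts from the first run (262144 extents, 51955 unique), the
arithmetic gives a factor of about 5.05 and about 80% duplicate extents,
which lines up with the dedupe_percentage=80 setting.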
I might need your trace to figure out what's going on for you.
--
Jens Axboe