* Running fio with buffer_compress_percentage=0 and scramble_buffers=1 produces high-dedupe data
From: Bryan Gurney @ 2017-04-26 20:19 UTC (permalink / raw)
To: fio
Hello,
I found an issue with a number of fio versions (2.2.8, 2.11, and
fio-2.19-14-g306f) where a configuration with both
"buffer_compress_percentage=0" and "scramble_buffers=1" results in data
buffer content with very low compressibility, but very high dedupability.
In a fio test run, I was using the "buffer_compress_percentage" and
"dedupe_percentage" parameters to alter the compressibility and
dedupability of the data buffers. I wanted to create a "control"
configuration that would produce random, scrambled buffer content that
would result in no dedupe, and no compression. Working backward from my
other configurations, I constructed the configuration below, with the
following intentions:
- Set compression to 0 percent, which should match fio's default buffer
pattern.
- Remove the "dedupe_percentage" line, and leave "scramble_buffers=1" to
prevent dedupe, since the default fio behavior is to reuse buffers.
[globals]
bs=4096
rw=write
name=write_1G_control_scrambled
numjobs=1
size=1g
norandommap
randrepeat=1
group_reporting
unlink=0
direct=1
iodepth=128
iodepth_batch_complete=16
iodepth_batch_submit=16
ioengine=libaio
scramble_buffers=1
buffer_compress_percentage=0
buffer_compress_chunk=4096
[thread1]
filename=/dev/sdc
The result of the write was 1 GB of data, which exhibited nearly 100%
dedupe, but was almost incompressible. On examination with "hexdump -C",
the resulting data does not exhibit the "buffer modifications"
characteristic of the scramble_buffers option.
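
For readers who want to reproduce this kind of measurement, here is a minimal sketch in Python. It is not the tool used in this report (the thread does not say which tool produced the dedupe and compression figures), and the function name analyze_blocks is hypothetical. It assumes fixed 4096-byte blocks to match bs=4096, treats two blocks as duplicates when their SHA-256 digests match, and compresses each block on its own with zlib so that repeats across blocks are not counted as compression.

```python
import hashlib
import os
import zlib


def analyze_blocks(data, block_size=4096):
    """Return (dedupe_fraction, compression_savings) for fixed-size blocks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 0.0, 0.0
    # Blocks with the same digest are counted as duplicates of each other.
    unique = {hashlib.sha256(block).digest() for block in blocks}
    dedupe_fraction = 1.0 - len(unique) / len(blocks)
    # Compress block by block; a block that grows is stored uncompressed.
    stored = sum(min(len(zlib.compress(block)), len(block)) for block in blocks)
    compression_savings = 1.0 - stored / len(data)
    return dedupe_fraction, compression_savings


# Synthetic stand-ins for the two outcomes described in this report:
# one random 4 KiB buffer written 256 times without modification ...
reused = os.urandom(4096) * 256
# ... versus 256 independently random 4 KiB buffers.
fresh = os.urandom(4096 * 256)

print("reused buffer: dedupe=%.3f compress=%.3f" % analyze_blocks(reused))
print("fresh buffers: dedupe=%.3f compress=%.3f" % analyze_blocks(fresh))
```

The reused-buffer case shows the same signature as the result above: nearly every block is a duplicate, yet no single block compresses. To check real output, the same function could be given bytes read back from the target device, which typically requires root privileges.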
I wondered if this was related to the existence of the
"buffer_compress_percentage=0" and "buffer_compress_chunk=4096" lines, so
I removed those two lines, resulting in the following configuration:
[globals]
bs=4096
rw=write
name=write_1G_control_scrambled
numjobs=1
size=1g
norandommap
randrepeat=1
group_reporting
unlink=0
direct=1
iodepth=128
iodepth_batch_complete=16
iodepth_batch_submit=16
ioengine=libaio
scramble_buffers=1
[thread1]
filename=/dev/sdc
The result of this write was 1 GB of data, with 0% dedupe and 0%
compression. On examination with "hexdump -C", the resulting data
exhibits the "buffer modifications" characteristic of the scramble_buffers
option.
The behavior above seems to suggest that the "buffer_compress"
functionality is mutually exclusive with the "scramble_buffers=1" setting.
I performed some tests with various non-zero values of
"buffer_compress_percentage", and the resulting data was not dedupable
(which would be consistent with the behavior of "scramble_buffers=1"), but
the data pattern suggests that the scramble_buffers algorithm is not
being used. By contrast, when buffer_compress_percentage is set to zero,
the resulting data is almost incompressible, but exhibits a high rate of
dedupe. This is despite the intent of the configuration: buffer content
with 0% compression, scrambled to avoid dedupe.
Thanks,
Bryan Gurney
* Re: Running fio with buffer_compress_percentage=0 and scramble_buffers=1 produces high-dedupe data
From: Sitsofe Wheeler @ 2017-04-27 21:29 UTC (permalink / raw)
To: Bryan Gurney; +Cc: fio@vger.kernel.org
Hi,
On 26 April 2017 at 21:19, Bryan Gurney <bgurney@permabit.com> wrote:
>
> I performed some tests for various non-zero values of
> "buffer_compress_percentage", and the resulting data was not dedupable
> (which would be consistent with the behavior of "scramble_buffers=1", but
> the data pattern seems to suggest that the algorithm used in
> scramble_buffers is not being used. Comparing this to when
> buffer_compress_percentage is set to zero, the resulting data is almost
> incompressible, but exhibits a high frequency of dedupe. This is despite
> the intentions of the user's configuration for buffer data content of 0%
> compression, and scrambled to avoid dedupe.
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-buffer-compress-percentage
alludes to buffer_compress_percentage=0 actually turning the option
off rather than setting a compression ratio of 0%. Here's a link to
the code that would normally react to the option:
https://github.com/axboe/fio/blob/fio-2.19/io_u.c#L2045 .
--
Sitsofe | http://sucs.org/~sits/
* RE: Running fio with buffer_compress_percentage=0 and scramble_buffers=1 produces high-dedupe data
From: Bryan Gurney @ 2017-04-28 16:57 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: fio
> Hi,
>
> On 26 April 2017 at 21:19, Bryan Gurney <bgurney@permabit.com> wrote:
> >
> > I performed some tests for various non-zero values of
> > "buffer_compress_percentage", and the resulting data was not dedupable
> > (which would be consistent with the behavior of "scramble_buffers=1",
> > but
> > the data pattern seems to suggest that the algorithm used in
> > scramble_buffers is not being used. Comparing this to when
> > buffer_compress_percentage is set to zero, the resulting data is almost
> > incompressible, but exhibits a high frequency of dedupe. This is
> > despite
> > the intentions of the user's configuration for buffer data content of 0%
> > compression, and scrambled to avoid dedupe.
>
> http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-buffer-compress-percentage
> alludes to buffer_compress_percentage=0 actually turning the option
> off rather than setting a compression ratio of 0%. Here's a link to
> the code that would normally react to the option:
> https://github.com/axboe/fio/blob/fio-2.19/io_u.c#L2045 .
>
> --
> Sitsofe | http://sucs.org/~sits/
>
Hi Sitsofe,
Thanks, I actually didn't know about that part in fill_io_buffer(), which
ends up disabling compression whenever the buffer_compress_percentage
value is 0.
However, the part that caught my attention was also in io_u.c, inside the
function "struct io_u *get_io_u(struct thread_data *td)", starting here:
https://github.com/axboe/fio/blob/fio-2.19/io_u.c#L1644
    if (io_u->ddir == DDIR_WRITE) {
        if (td->flags & TD_F_REFILL_BUFFERS) {
            io_u_fill_buffer(td, io_u,
                td->o.min_bs[DDIR_WRITE],
                io_u->buflen);
        } else if ((td->flags & TD_F_SCRAMBLE_BUFFERS) &&
                   !(td->flags & TD_F_COMPRESS))
            do_scramble = 1;
        if (td->flags & TD_F_VER_NONE) {
            populate_verify_io_u(td, io_u);
            do_scramble = 0;
        }
    } else if (io_u->ddir == DDIR_READ) {
    ...
Note that "do_scramble" is only set to 1 if (td->flags &
TD_F_SCRAMBLE_BUFFERS) is true, AND (td->flags & TD_F_COMPRESS) is false.
If I'm reading this correctly, the TD_F_ flags indicate whether the
specified option was passed to fio by the user or script.
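
To make the gating explicit, here is a small Python model of the write branch quoted above. It is only a sketch: the TdFlag enum and the should_scramble helper are hypothetical names that mirror the TD_F_ flags in the excerpt, and nothing here calls into fio. The idea that passing buffer_compress_percentage=0 still sets TD_F_COMPRESS is the reading proposed in this message, not something the model verifies.

```python
from enum import Flag, auto


class TdFlag(Flag):
    """Hypothetical stand-ins for the TD_F_ flags in the quoted excerpt."""
    REFILL_BUFFERS = auto()
    SCRAMBLE_BUFFERS = auto()
    COMPRESS = auto()
    VER_NONE = auto()


def should_scramble(flags):
    """Mirror how do_scramble is decided in the DDIR_WRITE branch."""
    do_scramble = False
    if TdFlag.REFILL_BUFFERS in flags:
        # The excerpt refills the buffer here and never considers scrambling.
        pass
    elif TdFlag.SCRAMBLE_BUFFERS in flags and TdFlag.COMPRESS not in flags:
        do_scramble = True
    if TdFlag.VER_NONE in flags:
        # populate_verify_io_u() runs and scrambling is switched off.
        do_scramble = False
    return do_scramble


# Second job file in this thread: scramble_buffers=1 only.
print(should_scramble(TdFlag.SCRAMBLE_BUFFERS))
# First job file, assuming buffer_compress_percentage=0 sets TD_F_COMPRESS.
print(should_scramble(TdFlag.SCRAMBLE_BUFFERS | TdFlag.COMPRESS))
```

Under that assumption the first job file never scrambles. If fill_io_buffer() also skips compression for a value of 0, neither mechanism touches the buffer, which would match the high-dedupe, low-compression result.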
Thanks,
Bryan