* Running fio with buffer_compress_percentage=0 and scramble_buffers=1 produces high-dedupe data
@ 2017-04-26 20:19 Bryan Gurney
  2017-04-27 21:29 ` Sitsofe Wheeler
  0 siblings, 1 reply; 3+ messages in thread
From: Bryan Gurney @ 2017-04-26 20:19 UTC (permalink / raw)
  To: fio

Hello,

I found an issue with a number of fio versions (2.2.8, 2.11, and
fio-2.19-14-g306f) where a configuration with both
"buffer_compress_percentage=0" and "scramble_buffers=1" results in data
buffer content with very low compressibility, but very high dedupability.

In a fio test run, I was using the "buffer_compress_percentage" and
"dedupe_percentage" parameters to alter the compressibility and
dedupability of the data buffers.  I wanted to create a "control"
configuration that would produce random, scrambled buffer content that
would result in no dedupe, and no compression.  Working backward from my
other configurations, I constructed the configuration below, with the
following intentions:

- Set compression to 0 percent, which should match fio's default buffer
pattern.
- Remove the "dedupe_percentage" line, and leave "scramble_buffers=1" to
prevent dedupe, since the default fio behavior is to reuse buffers.

[globals]

bs=4096
rw=write
name=write_1G_control_scrambled
numjobs=1
size=1g
norandommap
randrepeat=1
group_reporting
unlink=0
direct=1
iodepth=128
iodepth_batch_complete=16
iodepth_batch_submit=16
ioengine=libaio
scramble_buffers=1
buffer_compress_percentage=0
buffer_compress_chunk=4096

[thread1]
filename=/dev/sdc

The result of the write was 1 GB of data, which exhibited nearly 100%
dedupe, but was almost incompressible.  On examination with "hexdump -C",
the resulting data does not exhibit the "buffer modifications"
characteristic of the scramble_buffers option.
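
The message does not say which tool measured dedupe and compression. As a
rough way to check both properties on a sample of the written data, here is
a minimal sketch that hashes fixed-size blocks to estimate dedupe and runs
zlib over each block to estimate compressibility. The function name
analyze_blocks and the choice of SHA-256 and zlib are assumptions made for
illustration, not what any particular dedupe or compression layer does.

```python
import hashlib
import os
import zlib

BLOCK_SIZE = 4096  # matches bs=4096 in the job file


def analyze_blocks(data, block_size=BLOCK_SIZE):
    """Return (dedupe_pct, compress_pct) for data split into fixed-size blocks.

    Hypothetical helper: dedupe is estimated from duplicate block hashes,
    compression from per-block zlib output size.
    """
    seen = set()
    total = 0
    raw_bytes = 0
    compressed_bytes = 0
    for offset in range(0, len(data) - block_size + 1, block_size):
        block = data[offset:offset + block_size]
        seen.add(hashlib.sha256(block).digest())
        total += 1
        raw_bytes += block_size
        # Cap at block_size: a block that does not shrink counts as stored raw.
        compressed_bytes += min(len(zlib.compress(block)), block_size)
    if total == 0:
        return 0.0, 0.0
    dedupe_pct = 100.0 * (total - len(seen)) / total
    compress_pct = 100.0 * (raw_bytes - compressed_bytes) / raw_bytes
    return dedupe_pct, compress_pct


if __name__ == "__main__":
    # One random block repeated: incompressible per block, but highly dedupable.
    repeated = os.urandom(BLOCK_SIZE) * 256
    dedupe, compress = analyze_blocks(repeated)
    print(f"dedupe={dedupe:.1f}% compress={compress:.1f}%")
```

To apply it to the test device, read a sample first, for example
data = open("/dev/sdc", "rb").read(64 * 1024 * 1024), and pass that to
analyze_blocks.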

I wondered if this was related to the existence of the
"buffer_compress_percentage=0" and "buffer_compress_chunk=4096" lines, so
I removed those two lines, resulting in the following configuration:

[globals]

bs=4096
rw=write
name=write_1G_control_scrambled
numjobs=1
size=1g
norandommap
randrepeat=1
group_reporting
unlink=0
direct=1
iodepth=128
iodepth_batch_complete=16
iodepth_batch_submit=16
ioengine=libaio
scramble_buffers=1

[thread1]
filename=/dev/sdc

The result of this write was 1 GB of data, with 0% dedupe and 0%
compression.  On examination with "hexdump -C", the resulting data
exhibits the "buffer modifications" characteristic of the scramble_buffers
option.
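
The "hexdump -C" inspection can be approximated programmatically by
counting how many bytes differ between adjacent blocks. The sketch below is
only an illustration (the helper names are hypothetical): a count of 0
means two blocks are identical and therefore dedupable, while a non-zero
count shows that something in the block changed between writes.

```python
def differing_offsets(block_a, block_b):
    """Return the byte offsets at which two equal-length blocks differ."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    return [i for i, (a, b) in enumerate(zip(block_a, block_b)) if a != b]


def summarize_adjacent_diffs(data, block_size=4096):
    """For each pair of adjacent blocks, count how many bytes differ."""
    blocks = [data[i:i + block_size]
              for i in range(0, len(data) - block_size + 1, block_size)]
    return [len(differing_offsets(a, b)) for a, b in zip(blocks, blocks[1:])]
```

Running summarize_adjacent_diffs on a sample from each test run gives a
quick side-by-side view of whether successive buffers were modified at all.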

The behavior above seems to suggest that the "buffer_compress"
functionality and the "scramble_buffers=1" setting are mutually exclusive.

I performed some tests for various non-zero values of
"buffer_compress_percentage", and the resulting data was not dedupable
(which would be consistent with the behavior of "scramble_buffers=1"), but
the data pattern suggests that the scramble_buffers algorithm is not being
used.  By contrast, when buffer_compress_percentage is set to zero, the
resulting data is almost incompressible, but exhibits a high rate of
dedupe.  This contradicts the intent of the configuration, which was
buffer content with 0% compression, scrambled to avoid dedupe.
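
The combination of "incompressible" and "highly dedupable" is what one
would expect if a single random buffer were written repeatedly without
modification. The toy model below illustrates that point only. It is not
fio's buffer-fill or scramble code, and the 8-byte per-block stamp is a
made-up stand-in for "some small per-buffer modification".

```python
import os
import struct
import zlib

BLOCK_SIZE = 4096


def reused_buffer_stream(num_blocks):
    """Toy model: one random buffer written unchanged for every block."""
    buf = os.urandom(BLOCK_SIZE)
    return [buf for _ in range(num_blocks)]


def stamped_buffer_stream(num_blocks):
    """Toy model: same random buffer, with a unique 8-byte stamp per block.

    The stamp is hypothetical; it only stands in for a small modification
    that makes every block distinct.
    """
    base = os.urandom(BLOCK_SIZE)
    return [struct.pack("<Q", i) + base[8:] for i in range(num_blocks)]


def unique_fraction(blocks):
    """Fraction of blocks that are distinct (1.0 means nothing dedupes)."""
    return len(set(blocks)) / len(blocks)


def mean_block_ratio(blocks):
    """Mean compressed/raw size ratio per block (about 1.0 or more means incompressible)."""
    return sum(len(zlib.compress(b)) / len(b) for b in blocks) / len(blocks)


if __name__ == "__main__":
    for name, make in (("reused", reused_buffer_stream),
                       ("stamped", stamped_buffer_stream)):
        blocks = make(256)
        print(name, unique_fraction(blocks), round(mean_block_ratio(blocks), 3))
```

In this model both streams are essentially incompressible block by block,
but only the stamped stream avoids duplicate blocks, which mirrors the
difference between the two test runs described above.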


Thanks,

Bryan Gurney
