Random distribution: zoned argument


* Random distribution: zoned argument
@ 2017-11-20 17:17 Phillip Chen
  2017-11-21  1:57 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Phillip Chen @ 2017-11-20 17:17 UTC (permalink / raw)
  To: fio

[-- Attachment #1: Type: text/plain, Size: 3228 bytes --]

Hello,
I'm a test engineer at Seagate and we're using FIO to gather some
performance data. This email has two parts: a bug report and a feature
request.
The bug that I'm seeing is that when using the
random_distribution=zoned argument, the zone order is not honored. So
using zoned:18/90:7/5:75/5 will not weight IO towards the end of the
disk but rather towards the beginning. Using zoned:75/5:7/5:18/90
apparently gives the same distribution, but also not the correct
distribution. I've attached a python3.6 script that shows this
behaviour. Here is the histogram information from running the two
zoned arguments as described above:

Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
--thread --filename=/dev/sdc --runtime=30 --readwrite=randread
--iodepth=1 --random_distribution=zoned:18/90:7/5:75/5 --norandommap
--output-format=terse
histogram bins = [2302, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
32, 27, 49, 34, 24, 36, 26, 184]
histogram percents = [75.99867943215582, 0.8253549026081215,
0.8253549026081215, 0.9904258831297458, 1.0894684714427203,
1.188511059755695, 0.8583690987124464, 1.0564542753383954,
0.9574116870254209, 1.2215252558600198, 0.6932981181908221,
0.6932981181908221, 1.0564542753383954, 0.8913832948167713,
1.6176956091119181, 1.1224826675470452, 0.7923407065037966,
1.188511059755695, 0.8583690987124464, 6.074612083195774]

Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
--thread --filename=/dev/sdc --runtime=30 --readwrite=randread
--iodepth=1 --random_distribution=zoned:75/5:7/5:18/90
--norandommap --output-format=terse
histogram bins = [2306, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
32, 27, 49, 34, 24, 36, 26, 184]
histogram percents = [76.03033300362677, 0.8242664029014177,
0.8242664029014177, 0.9891196834817013, 1.0880316518298714,
1.1869436201780414, 0.8572370590174745, 1.0550609957138146,
0.9561490273656446, 1.2199142762940982, 0.6923837784371909,
0.6923837784371909, 1.0550609957138146, 0.8902077151335311,
1.6155621496867787, 1.1210023079459281, 0.7912957467853611,
1.1869436201780414, 0.8572370590174745, 6.0666007253544345]
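
For comparison, here is a minimal sketch (not fio code) of the per-bin
percentages each spec nominally describes if the zones are applied in the
order written. The helper names are hypothetical. It assumes each entry is
access-percentage/size-percentage and uses the same 20 equal-width bins as
the attached script:

```python
def parse_zoned(spec):
    """Parse 'zoned:A/S:A/S...' into a list of (access_pct, size_pct) tuples."""
    prefix, _, body = spec.partition(":")
    if prefix != "zoned":
        raise ValueError("expected a 'zoned:' spec, got " + repr(spec))
    zones = []
    for entry in body.split(":"):
        access, size = entry.split("/")
        zones.append((float(access), float(size)))
    return zones


def expected_histogram(spec, num_bins=20):
    """Percentage of I/O expected in each bin, zones laid out in the order given."""
    bin_width = 100.0 / num_bins
    bins = [0.0] * num_bins
    zone_start = 0.0
    for access, size in parse_zoned(spec):
        zone_end = zone_start + size
        if size > 0:
            for ind in range(num_bins):
                lo = ind * bin_width
                hi = lo + bin_width
                # Share of this zone's I/O that falls inside bin [lo, hi)
                overlap = max(0.0, min(hi, zone_end) - max(lo, zone_start))
                bins[ind] += access * overlap / size
        zone_start = zone_end
    return bins


for spec in ("zoned:18/90:7/5:75/5", "zoned:75/5:7/5:18/90"):
    print(spec, [round(b, 2) for b in expected_histogram(spec)])
```

Under these assumptions the first spec puts 75% of the I/O in the last bin
and the second puts 75% in the first bin, so the two runs should not produce
the same histogram.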

To run the script, use the -h flag to see usage, but at a minimum
you'll need to give the device handle to run on as the first argument
(the workload only does reads). The random_distribution argument is
set at the top of the file.

Here is my environment information:
# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)
# uname -r
3.10.0-514.21.1.el7.x86_64
I used fio-3.2-13-g40e5f which was the newest version I could see as of today.

As for the feature request:
I am trying to adapt our current FIO job files for FLEX testing which
is a new protocol we announced recently
(http://blog.seagate.com/intelligent/new-flex-dynamic-recording-method-redefines-data-center-hard-drive/)
that has some requirements on where writes/reads are allowed. I would
like to have better control where random reads and writes are going
using the zoned random_distribution setting using sector numbers
rather than capacity percentages. Would that be a possible feature to
add? Or is there an existing way to randomly read/write to
non-contiguous zones on the disk with varying sizes?

Thank you,
Phillip Chen

[-- Attachment #2: fio_zoned.py --]
[-- Type: text/plain, Size: 4476 bytes --]

import re
import subprocess
import time
import sys
import math
import argparse

# Weight heavily towards the last 5% of the drive
dist_str = "zoned:18/90:7/5:75/5"

# Weights are in descending order, as in the example -- this seems to be the only way that works
# dist_str = "zoned:75/5:7/5:18/90"

# Weighted heavily towards the middle of the drive
# dist_str = "zoned:5/45:90/10:5/45"


arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("drive_handle", help = "Drive handle to test")
arg_parser.add_argument("-rt", "--runtime", default = 30, help = "Time to run workload")
arg_parser.add_argument("-sbp", "--save_block_parse", action = "store_true",
                        help = "Save blockparse output to blkparse_output.txt if flag is set")
arg_parser.add_argument("-fp", "--fio_path", default = "fio",
                        help = "The path to the FIO executable to run")
args = arg_parser.parse_args()

dev_handle = args.drive_handle

blktrace = subprocess.Popen(["blktrace", dev_handle, "-o", "-"], stdout = subprocess.PIPE,
                            stderr = subprocess.PIPE)
# blktrace needs a little time to get set up
time.sleep(1)

# Start FIO job
fio_string = (args.fio_path + " --name=rand_reads --ioengine=libaio --direct=1 --exitall "
              "--thread --filename=" + dev_handle + " --runtime=" + str(args.runtime) +
              " --readwrite=randread --iodepth=1 --random_distribution=" + dist_str +
              " --norandommap --output-format=terse")
print("Running " + fio_string)
cmd_ret = subprocess.run(fio_string.split(' '), stdout = subprocess.PIPE, stderr = subprocess.PIPE)

if cmd_ret.stderr != b"":
    print("FIO errors:")
    print(cmd_ret.stderr.decode(sys.stderr.encoding))
print("FIO stdout:")
print(cmd_ret.stdout.decode(sys.stdout.encoding))

# Terminate is how blktrace expects to end, don't use kill or you'll lose commands near the end
blktrace.terminate()
try:
    stdout, stderr = blktrace.communicate(timeout = 20)
except subprocess.TimeoutExpired:
    blktrace.kill()
    stdout, stderr = blktrace.communicate()
print("blktrace errors:")
print(stderr)
# This will give you the raw blktrace output
# print(stdout)
blkparse_format_str = '%D %2c %8s %5T.%9t %5p %2a %3d command = %C sectors = %S\n'
blkparse_ret = subprocess.run(["blkparse", "-i", "-", "-f", blkparse_format_str, "-a", "issue"],
                              input = stdout, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
print("blkparse errors:")
print(blkparse_ret.stderr)
# print(blkparse_ret.stdout)
blkparse_str = blkparse_ret.stdout.decode(sys.stdout.encoding)

if args.save_block_parse:
    with open("blkparse_output.txt", 'w') as output_file:
        output_file.write(blkparse_str)

# Parse blktrace result into bins
blkline_re = re.compile(r"(\d+,\d+)\s+(\d+)\s+(\d+)\s+(?P<timestamp>\d+\.\d+)\s+(\d+)\s+D\s+(R|W)"
                        r"\s+command = fio\s+sectors = (?P<sector>\d+)")
total_ios = 0
avg_lba = 0
max_lba = 0
min_lba = None
# Parse out the sectors from the blocktrace output to get some preliminary statistics
match_iter = blkline_re.finditer(blkparse_str)
for match_obj in match_iter:
    sector_num = int(match_obj.groupdict()["sector"])
    total_ios += 1
    avg_lba += sector_num
    if min_lba is None or sector_num < min_lba:
        min_lba = sector_num
    if sector_num > max_lba:
        max_lba = sector_num

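print("total IOs = " + str(total_ios))
if total_ios == 0:
    # Without any parsed I/Os the statistics below would divide by zero
    sys.exit("No fio I/Os found in the blkparse output")
print("avg: {:.2f}, min: {}, max: {}".format(avg_lba / total_ios, min_lba, max_lba))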
hist_num = 20
hist_bins = [0] * hist_num
hist_div = max_lba / hist_num
hist_edges = []
for ind in range(hist_num):
    hist_edges.append(hist_div * (ind + 1))

# Sort the data into a histogram
match_iter = blkline_re.finditer(blkparse_str)
for match_obj in match_iter:
    sector_num = int(match_obj.groupdict()["sector"])
    hist_ind = math.floor(sector_num / hist_div)
    if hist_ind == hist_num:
        hist_ind -= 1
    hist_bins[hist_ind] += 1
    # print("{}: bin {}".format(sector_num, hist_ind))

hist_perc = []
for hist_bin in hist_bins:
    hist_perc.append(100 * hist_bin / total_ios)

print("histogram bins = " + str(hist_bins))
print("histogram percents = " + str(hist_perc))
print("histogram edges = " + str(hist_edges))
# print FIO version and distribution string
print(dist_str)
cmd_ret = subprocess.run([args.fio_path, "-v"])


* Re: Random distribution: zoned argument
  2017-11-20 17:17 Random distribution: zoned argument Phillip Chen
@ 2017-11-21  1:57 ` Jens Axboe
  2017-11-27 23:18   ` Phillip Chen
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-21  1:57 UTC (permalink / raw)
  To: Phillip Chen, fio

On 11/20/2017 10:17 AM, Phillip Chen wrote:
> Hello,
> I'm a test engineer at Seagate and we're using FIO to gather some
> performance data. This email has two parts: a bug report and a feature
> request.
> The bug that I'm seeing is that when using the
> random_distribution=zoned argument, the zone order is not honored. So
> using zoned:18/90:7/5:75/5 will not weight IO towards the end of the
> disk but rather towards the beginning. Using zoned:75/5:7/5:18/90
> apparently gives the same distribution, but also not the correct
> distribution. I've attached a python3.6 script that shows this
> behaviour. Here is the histogram information from running the two
> zoned arguments as described above:
> 
> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
> --iodepth=1 --random_distribution=zoned:18/90:7/5:75/5 --norandommap
> --output-format=terse
> histogram bins = [2302, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
> 32, 27, 49, 34, 24, 36, 26, 184]
> histogram percents = [75.99867943215582, 0.8253549026081215,
> 0.8253549026081215, 0.9904258831297458, 1.0894684714427203,
> 1.188511059755695, 0.8583690987124464, 1.0564542753383954,
> 0.9574116870254209, 1.2215252558600198, 0.6932981181908221,
> 0.6932981181908221, 1.0564542753383954, 0.8913832948167713,
> 1.6176956091119181, 1.1224826675470452, 0.7923407065037966,
> 1.188511059755695, 0.8583690987124464, 6.074612083195774]
> 
> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
> --iodepth=1 --random_distribution=zoned:75/5:7/5:18/90
> --norandommap --output-format=terse
> histogram bins = [2306, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
> 32, 27, 49, 34, 24, 36, 26, 184]
> histogram percents = [76.03033300362677, 0.8242664029014177,
> 0.8242664029014177, 0.9891196834817013, 1.0880316518298714,
> 1.1869436201780414, 0.8572370590174745, 1.0550609957138146,
> 0.9561490273656446, 1.2199142762940982, 0.6923837784371909,
> 0.6923837784371909, 1.0550609957138146, 0.8902077151335311,
> 1.6155621496867787, 1.1210023079459281, 0.7912957467853611,
> 1.1869436201780414, 0.8572370590174745, 6.0666007253544345]
> 
> To run the script, use the -h flag to see usage, but at a minimum
> you'll need to give the device handle to run on as the first argument
> (the workload only does reads). The random_distribution argument is
> set at the top of the file.

I'll take a look at this tomorrow, that does seem very fishy.

> Here is my environment information:
> # cat /etc/centos-release
> CentOS Linux release 7.3.1611 (Core)
> # uname -r
> 3.10.0-514.21.1.el7.x86_64
> I used fio-3.2-13-g40e5f which was the newest version I could see as of today.
> 
> As for the feature request:
> I am trying to adapt our current FIO job files for FLEX testing which
> is a new protocol we announced recently
> (http://blog.seagate.com/intelligent/new-flex-dynamic-recording-method-redefines-data-center-hard-drive/)
> that has some requirements on where writes/reads are allowed. I would
> like to have better control where random reads and writes are going
> using the zoned random_distribution setting using sector numbers
> rather than capacity percentages. Would that be a possible feature to
> add? Or is there an existing way to randomly read/write to
> non-contiguous zones on the disk with varying sizes?

The best way to request something like that is to come up with a logical
way to describe it. That's usually the hardest part, implementing it is
usually not that hard. This is especially important since it has to be
intuitively easy to use for the user, not requiring them to pore over
man pages too much.

For yours, the zoned setup already supports split ranges for
reads/writes/trims. The change seems to be that you want to give the
zones in absolute sizes instead. It'd be the easiest to extend the
zoning to allow sizes instead.

-- 
Jens Axboe




* Re: Random distribution: zoned argument
  2017-11-21  1:57 ` Jens Axboe
@ 2017-11-27 23:18   ` Phillip Chen
  2017-11-29 19:33     ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Phillip Chen @ 2017-11-27 23:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

I agree that changing the zoned random distribution parameter to allow
absolute sizes would be an elegant way to address my use case. I'm not
sure if it would be easier to add that functionality to the zoned
distribution or create a new distribution named something like
zoned_sectors. I'd also like the ability to have zones that no I/O
will fall into. I think the easiest way to do that would be to allow 0
as a valid distribution percentage (I.E. something like
random_distribution=zoned:10/10:0/50:30/20:8/30:2/40). Currently it
seems I can specify 0 as a distribution percentage, but it doesn't
create a zero I/O zone like I would like it to. Although that might be
addressed just by making the zoned distribution randomize I/O as
expected.
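
A small sketch for sanity-checking a spec before running it. The function
name is hypothetical, and it assumes that access percentages and size
percentages are each meant to total 100, as they do in the specs from the
original report. How fio treats a spec that does not total 100 is not
covered here:

```python
def zone_totals(spec):
    """Return (sum of access percentages, sum of size percentages) for a zoned spec."""
    prefix, _, body = spec.partition(":")
    if prefix != "zoned":
        raise ValueError("expected a 'zoned:' spec, got " + repr(spec))
    access_total = 0
    size_total = 0
    for entry in body.split(":"):
        access, size = entry.split("/")
        access_total += int(access)
        size_total += int(size)
    return access_total, size_total


for spec in ("zoned:18/90:7/5:75/5", "zoned:10/10:0/50:30/20:8/30:2/40"):
    access_total, size_total = zone_totals(spec)
    print("{}: access sums to {}, size sums to {}".format(spec, access_total, size_total))
```

The example string above is only meant to show the 0 entry. As written, its
access percentages sum to 50 and its sizes to 150, so it would need
adjusting before use.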
Thanks for looking into this,
Phillip Chen


* Re: Random distribution: zoned argument
  2017-11-27 23:18   ` Phillip Chen
@ 2017-11-29 19:33     ` Jens Axboe
  2017-11-29 19:37       ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-29 19:33 UTC (permalink / raw)
  To: Phillip Chen; +Cc: fio

We can add empty zones, that's fine, if that doesn't already work.

For your case, can you try and kill this line:

        qsort(o->zone_split[ddir], o->zone_split_nr[ddir], sizeof(struct zone_split), zone_cmp);

in options.c:zone_split_ddir(), I have a feeling that's screwing us over.
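
A sketch of why a sort would explain the report. The body of zone_cmp is not
shown in this thread, so this assumes it orders zones by access percentage,
highest first. The Python helper names are hypothetical:

```python
def parse_zoned(spec):
    """Parse 'zoned:A/S:A/S...' into a list of (access_pct, size_pct) tuples."""
    prefix, _, body = spec.partition(":")
    if prefix != "zoned":
        raise ValueError("expected a 'zoned:' spec, got " + repr(spec))
    zones = []
    for entry in body.split(":"):
        access, size = entry.split("/")
        zones.append((int(access), int(size)))
    return zones


def sorted_by_access(zones):
    # Assumption: mimics a comparator that puts the highest access percentage first
    return sorted(zones, key=lambda zone: zone[0], reverse=True)


for spec in ("zoned:18/90:7/5:75/5", "zoned:75/5:7/5:18/90"):
    print(spec, "->", sorted_by_access(parse_zoned(spec)))
```

Both specs collapse to the same zone order: 75% of accesses to the first 5%,
18% spread over the next 90%, and 7% to the last 5%. That is consistent with
the reported histograms (about 76% in the first bin, roughly 1% in each
middle bin, about 6% in the last bin).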




-- 
Jens Axboe




* Re: Random distribution: zoned argument
  2017-11-29 19:33     ` Jens Axboe
@ 2017-11-29 19:37       ` Jens Axboe
  2017-11-29 20:39         ` Phillip Chen
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-29 19:37 UTC (permalink / raw)
  To: Phillip Chen; +Cc: fio

OK, I checked that fix, it seems to do the trick. On top of that,
fio does support empty zones. If I run your script and change
the distribution to:

dist_str = "zoned:50/5:0/90:50/5"

to say "50% of access to the first 5% of the drive, nothing to
the middle 90% of the drive, and 50% to the last 5% of the drive",
I get:

histogram percents = [50.02041783178578, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 49.97958216821422]

which looks pretty spot on to me.
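
A rough model of what that spec describes, for comparison with the measured
histogram. This is not fio's generator. It is a sketch that picks a zone by
access weight and then a uniform position inside the zone, with a
hypothetical function name and a fixed seed:

```python
import random


def simulate(zones, num_ios=100000, num_bins=20, seed=0):
    """Return the percentage of simulated I/Os landing in each equal-width bin.

    zones is a list of (access_pct, size_pct) tuples laid out back to back.
    """
    rng = random.Random(seed)
    # Start of each zone, as a percentage of the device
    starts = []
    pos = 0.0
    for _, size in zones:
        starts.append(pos)
        pos += size
    weights = [access for access, _ in zones]
    bin_width = 100.0 / num_bins
    bins = [0] * num_bins
    for _ in range(num_ios):
        # Choose a zone by access weight, then a uniform position inside it
        idx = rng.choices(range(len(zones)), weights=weights)[0]
        offset = starts[idx] + rng.random() * zones[idx][1]
        # Clamp so an offset that rounds up to 100.0 stays in the last bin
        bins[min(int(offset / bin_width), num_bins - 1)] += 1
    return [100.0 * count / num_ios for count in bins]


percents = simulate([(50, 5), (0, 90), (50, 5)])
print([round(p, 2) for p in percents])
```

In this model the zero-weight middle zone receives no I/O at all, and the
first and last bins each end up close to 50%.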

I'm going to commit the fix.



-- 
Jens Axboe




* Re: Random distribution: zoned argument
  2017-11-29 19:37       ` Jens Axboe
@ 2017-11-29 20:39         ` Phillip Chen
  2017-11-29 20:58           ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Phillip Chen @ 2017-11-29 20:39 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

Looks good, thanks for fixing this! If you find the time to add
support to use block addresses in addition to percentage for
specifying the zones, I think it would be quite useful. Thanks for the
quick update,
Phillip Chen

On Wed, Nov 29, 2017 at 12:37 PM, Jens Axboe <axboe@kernel.dk> wrote:
> OK, I checked that fix, it seems to do the trick. On top of that,
> fio does support empty zones. If I run your script and change
> the distribution to:
>
> dist_str = "zoned:50/5:0/90:50/5"
>
> to say "50% of access to the first 5% of the drive, nothing to
> the middle 90% of the drive, and 50% to the last 5% of the drive",
> I get:
>
> histogram percents = [50.02041783178578, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 49.97958216821422]
>
> which looks pretty spot on to me.
>
> I'm going to commit the fix.
>
> On 11/29/2017 12:33 PM, Jens Axboe wrote:
>> We can add empty zones, that's fine, if that doesn't already work.
>>
>> For your case, can you try and kill this line:
>>
>>         qsort(o->zone_split[ddir], o->zone_split_nr[ddir], sizeof(struct zone_split), zone_cmp);
>>
>> in options.c:zone_split_ddir(), I have a feeling that's screwing us over.
>>
>>
>> On 11/27/2017 04:18 PM, Phillip Chen wrote:
>>> I agree that changing the zoned random distribution parameter to allow
>>> absolute sizes would be an elegant way to address my use case. I'm not
>>> sure if it would be easier to add that functionality to the zoned
>>> distribution or create a new distribution named something like
>>> zoned_sectors. I'd also like the ability to have zones that no I/O
>>> will fall into. I think the easiest way to do that would be to allow 0
>>> as a valid distribution percentage (I.E. something like
>>> random_distribution=zoned:10/10:0/50:30/20:8/30:2/40). Currently it
>>> seems I can specify 0 as a distribution percentage, but it doesn't
>>> create a zero I/O zone like I would like it to. Although that might be
>>> addressed just by making the zoned distribution randomize I/O as
>>> expected.
>>> Thanks for looking into this,
>>> Phillip Chen
>>>
>>> On Mon, Nov 20, 2017 at 6:57 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 11/20/2017 10:17 AM, Phillip Chen wrote:
>>>>> Hello,
>>>>> I'm a test engineer at Seagate and we're using FIO to gather some
>>>>> performance data. This email has two parts: a bug report and a feature
>>>>> request.
>>>>> The bug that I'm seeing is that when using the
>>>>> random_distribution=zoned argument, the zone order is not honored. So
>>>>> using zoned:18/90:7/5:75/5 will not weight IO towards the end of the
>>>>> disk but rather towards the beginning. Using zoned:75/5:7/5:18/90
>>>>> apparently gives the same distribution, but also not the correct
>>>>> distribution. I've attached a python3.6 script that shows this
>>>>> behaviour. Here is the histogram information from running the two
>>>>> zoned arguments as described above:
>>>>>
>>>>> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
>>>>> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
>>>>> --iodepth=1 --random_distribution=zoned:18/90:7/5:75/5 --norandommap
>>>>> --output-format=terse
>>>>> histogram bins = [2302, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
>>>>> 32, 27, 49, 34, 24, 36, 26, 184]
>>>>> histogram percents = [75.99867943215582, 0.8253549026081215,
>>>>> 0.8253549026081215, 0.9904258831297458, 1.0894684714427203,
>>>>> 1.188511059755695, 0.8583690987124464, 1.0564542753383954,
>>>>> 0.9574116870254209, 1.2215252558600198, 0.6932981181908221,
>>>>> 0.6932981181908221, 1.0564542753383954, 0.8913832948167713,
>>>>> 1.6176956091119181, 1.1224826675470452, 0.7923407065037966,
>>>>> 1.188511059755695, 0.8583690987124464, 6.074612083195774]
>>>>>
>>>>> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
>>>>> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
>>>>> --iodepth=1 --random_distribution=zoned:75/5:7/5:18/90
>>>>> --norandommap --output-format=terse
>>>>> histogram bins = [2306, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
>>>>> 32, 27, 49, 34, 24, 36, 26, 184]
>>>>> histogram percents = [76.03033300362677, 0.8242664029014177,
>>>>> 0.8242664029014177, 0.9891196834817013, 1.0880316518298714,
>>>>> 1.1869436201780414, 0.8572370590174745, 1.0550609957138146,
>>>>> 0.9561490273656446, 1.2199142762940982, 0.6923837784371909,
>>>>> 0.6923837784371909, 1.0550609957138146, 0.8902077151335311,
>>>>> 1.6155621496867787, 1.1210023079459281, 0.7912957467853611,
>>>>> 1.1869436201780414, 0.8572370590174745, 6.0666007253544345]
>>>>>
>>>>> To run the script, use the -h flag to see usage, but at a minimum
>>>>> you'll need to give the device handle to run on as the first argument
>>>>> (the workload only does reads). The random_distribution argument is
>>>>> set at the top of the file.
>>>>
>>>> I'll take a look at this tomorrow, that does seem very fishy.
>>>>
>>>>> Here is my environment information:
>>>>> # cat /etc/centos-release
>>>>> CentOS Linux release 7.3.1611 (Core)
>>>>> # uname -r
>>>>> 3.10.0-514.21.1.el7.x86_64
>>>>> I used fio-3.2-13-g40e5f which was the newest version I could see as of today.
>>>>>
>>>>> As for the feature request:
>>>>> I am trying to adapt our current FIO job files for FLEX testing which
>>>>> is a new protocol we announced recently
>>>>> (http://blog.seagate.com/intelligent/new-flex-dynamic-recording-method-redefines-data-center-hard-drive/)
>>>>> that has some requirements on where writes/reads are allowed. I would
>>>>> like to have better control where random reads and writes are going
>>>>> using the zoned random_distribution setting using sector numbers
>>>>> rather than capacity percentages. Would that be a possible feature to
>>>>> add? Or is there an existing way to randomly read/write to
>>>>> non-contiguous zones on the disk with varying sizes?
>>>>
>>>> The best way to request something like that is to come up with a logical
>>>> way to describe it. That's usually the hardest part, implementing it is
>>>> usually not that hard. This is especially important since it has to be
>>>> intuitively easy to use for the user, not requiring them to pore too
>>>> much over man pages.
>>>>
>>>> For yours, the zoned setup already supports split ranges for
>>>> reads/writes/trims. The change seems to be that you want to give the
>>>> zones in absolute sizes instead. It'd be the easiest to extend the
>>>> zoning to allow sizes instead.
>>>>
>>>> --
>>>> Jens Axboe
>>>>
>>
>>
>
>
> --
> Jens Axboe
>



* Re: Random distribution: zoned argument
  2017-11-29 20:39         ` Phillip Chen
@ 2017-11-29 20:58           ` Jens Axboe
  2017-11-30  1:37             ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-29 20:58 UTC (permalink / raw)
  To: Phillip Chen; +Cc: fio

I'll take a look at adding absolute zoning, should be pretty
trivial.


On 11/29/2017 01:39 PM, Phillip Chen wrote:
> Looks good, thanks for fixing this! If you find the time to add
> support to use block addresses in addition to percentage for
> specifying the zones, I think it would be quite useful. Thanks for the
> quick update,
> Phillip Chen
> 
> On Wed, Nov 29, 2017 at 12:37 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> OK, I checked that fix, it seems to do the trick. On top of that,
>> fio does support empty zones. If I run your script and change
>> the distribution to:
>>
>> dist_str = "zoned:50/5:0/90:50/5"
>>
>> to say "50% of access to the first 5% of the drive, nothing to
>> the middle 90% of the drive, and 50% to the last 5% of the drive",
>> I get:
>>
>> histogram percents = [50.02041783178578, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 49.97958216821422]
>>
>> which looks pretty spot on to me.
>>
>> I'm going to commit the fix.
>>
>> On 11/29/2017 12:33 PM, Jens Axboe wrote:
>>> We can add empty zones, that's fine, if that already doesn't work.
>>>
>>> For your case, can you try and kill this line:
>>>
>>>         qsort(o->zone_split[ddir], o->zone_split_nr[ddir], sizeof(struct zone_split), zone_cmp);
>>>
>>> in options.c:zone_split_ddir(), I have a feeling that's screwing us over.
>>>
>>>
>>> On 11/27/2017 04:18 PM, Phillip Chen wrote:
>>>> I agree that changing the zoned random distribution parameter to allow
>>>> absolute sizes would be an elegant way to address my use case. I'm not
>>>> sure if it would be easier to add that functionality to the zoned
>>>> distribution or create a new distribution named something like
>>>> zoned_sectors. I'd also like the ability to have zones that no I/O
>>>> will fall into. I think the easiest way to do that would be to allow 0
>>>> as a valid distribution percentage (I.E. something like
>>>> random_distribution=zoned:10/10:0/50:30/20:8/30:2/40). Currently it
>>>> seems I can specify 0 as a distribution percentage, but it doesn't
>>>> create a zero I/O zone like I would like it to. Although that might be
>>>> addressed just by making the zoned distribution randomize I/O as
>>>> expected.
>>>> Thanks for looking into this,
>>>> Phillip Chen
>>>>
>>>> On Mon, Nov 20, 2017 at 6:57 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>>>> On 11/20/2017 10:17 AM, Phillip Chen wrote:
>>>>>> Hello,
>>>>>> I'm a test engineer at Seagate and we're using FIO to gather some
>>>>>> performance data. This email has two parts: a bug report and a feature
>>>>>> request.
>>>>>> The bug that I'm seeing is that when using the
>>>>>> random_distribution=zoned argument, the zone order is not honored. So
>>>>>> using zoned:18/90:7/5:75/5 will not weight IO towards the end of the
>>>>>> disk but rather towards the beginning. Using zoned:75/5:7/5:18/90
>>>>>> apparently gives the same distribution, but also not the correct
>>>>>> distribution. I've attached a python3.6 script that shows this
>>>>>> behaviour. Here is the histogram information from running the two
>>>>>> zoned arguments as described above:
>>>>>>
>>>>>> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
>>>>>> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
>>>>>> --iodepth=1 --random_distribution=zoned:18/90:7/5:75/5 --norandommap
>>>>>> --output-format=terse
>>>>>> histogram bins = [2302, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
>>>>>> 32, 27, 49, 34, 24, 36, 26, 184]
>>>>>> histogram percents = [75.99867943215582, 0.8253549026081215,
>>>>>> 0.8253549026081215, 0.9904258831297458, 1.0894684714427203,
>>>>>> 1.188511059755695, 0.8583690987124464, 1.0564542753383954,
>>>>>> 0.9574116870254209, 1.2215252558600198, 0.6932981181908221,
>>>>>> 0.6932981181908221, 1.0564542753383954, 0.8913832948167713,
>>>>>> 1.6176956091119181, 1.1224826675470452, 0.7923407065037966,
>>>>>> 1.188511059755695, 0.8583690987124464, 6.074612083195774]
>>>>>>
>>>>>> Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall
>>>>>> --thread --filename=/dev/sdc --runtime=30 --readwrite=randread
>>>>>> --iodepth=1 --random_distribution=zoned:75/5:7/5:18/90
>>>>>> --norandommap --output-format=terse
>>>>>> histogram bins = [2306, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21,
>>>>>> 32, 27, 49, 34, 24, 36, 26, 184]
>>>>>> histogram percents = [76.03033300362677, 0.8242664029014177,
>>>>>> 0.8242664029014177, 0.9891196834817013, 1.0880316518298714,
>>>>>> 1.1869436201780414, 0.8572370590174745, 1.0550609957138146,
>>>>>> 0.9561490273656446, 1.2199142762940982, 0.6923837784371909,
>>>>>> 0.6923837784371909, 1.0550609957138146, 0.8902077151335311,
>>>>>> 1.6155621496867787, 1.1210023079459281, 0.7912957467853611,
>>>>>> 1.1869436201780414, 0.8572370590174745, 6.0666007253544345]
>>>>>>
>>>>>> To run the script, use the -h flag to see usage, but at a minimum
>>>>>> you'll need to give the device handle to run on as the first argument
>>>>>> (the workload only does reads). The random_distribution argument is
>>>>>> set at the top of the file.
>>>>>
>>>>> I'll take a look at this tomorrow, that does seem very fishy.
>>>>>
>>>>>> Here is my environment information:
>>>>>> # cat /etc/centos-release
>>>>>> CentOS Linux release 7.3.1611 (Core)
>>>>>> # uname -r
>>>>>> 3.10.0-514.21.1.el7.x86_64
>>>>>> I used fio-3.2-13-g40e5f which was the newest version I could see as of today.
>>>>>>
>>>>>> As for the feature request:
>>>>>> I am trying to adapt our current FIO job files for FLEX testing which
>>>>>> is a new protocol we announced recently
>>>>>> (http://blog.seagate.com/intelligent/new-flex-dynamic-recording-method-redefines-data-center-hard-drive/)
>>>>>> that has some requirements on where writes/reads are allowed. I would
>>>>>> like to have better control where random reads and writes are going
>>>>>> using the zoned random_distribution setting using sector numbers
>>>>>> rather than capacity percentages. Would that be a possible feature to
>>>>>> add? Or is there an existing way to randomly read/write to
>>>>>> non-contiguous zones on the disk with varying sizes?
>>>>>
>>>>> The best way to request something like that is to come up with a logical
>>>>> way to describe it. That's usually the hardest part, implementing it is
>>>>> usually not that hard. This is especially important since it has to be
>>>>> intuitively easy to use for the user, not requiring them to pore too
>>>>> much over man pages.
>>>>>
>>>>> For yours, the zoned setup already supports split ranges for
>>>>> reads/writes/trims. The change seems to be that you want to give the
>>>>> zones in absolute sizes instead. It'd be the easiest to extend the
>>>>> zoning to allow sizes instead.
>>>>>
>>>>> --
>>>>> Jens Axboe
>>>>>
>>>
>>>
>>
>>
>> --
>> Jens Axboe
>>


-- 
Jens Axboe




* Re: Random distribution: zoned argument
  2017-11-29 20:58           ` Jens Axboe
@ 2017-11-30  1:37             ` Jens Axboe
  2017-11-30  2:16               ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-30  1:37 UTC (permalink / raw)
  To: Phillip Chen; +Cc: fio

On 11/29/2017 01:58 PM, Jens Axboe wrote:
> I'll take a look at adding absolute zoning, should be pretty
> trivial.

Quick and dirty here, can you see if this works with your magic
script?

Basically you just do:

random_distribution=zoned_abs:60/100m:10/200m:30/700m

like you would for 'zoned' - the first is a percentage, the
other part is a size. So the above would be:

- 60% of accesses to the first 100M
- 10% of accesses to the next 200M
- 30% of accesses to the next 700M

No checking for whether or not we exceed device/file size or anything
like that in this version, but everything else should work with the
existing code pretty nicely. Not tested...
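
To make the intended mapping concrete, here is a minimal Python sketch of how
a zoned_abs string could translate into byte ranges and a 100-slot lookup
table, loosely mirroring the idea behind zone_state_index in the patch below.
It is an illustration under assumptions, not fio's code: parse_size(),
parse_zoned_abs(), build_index() and next_offset() are hypothetical helpers,
suffixes are treated as binary multiples, access percentages are required to
sum to exactly 100, and the 4096-byte block size is arbitrary.

```python
import random

# Assumption: binary multiples only. fio's own size parser handles more cases.
_SUFFIX = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a size such as '100m' or '20G' to bytes (simplified)."""
    text = text.strip().lower()
    if text and text[-1] in _SUFFIX:
        return int(text[:-1]) * _SUFFIX[text[-1]]
    return int(text)


def parse_zoned_abs(spec):
    """Return a list of (access_percent, start_byte, end_byte) tuples."""
    prefix = "zoned_abs:"
    if not spec.startswith(prefix):
        raise ValueError("expected a string starting with 'zoned_abs:'")
    zones = []
    start = 0
    for entry in spec[len(prefix):].split(":"):
        perc, size = entry.split("/")
        end = start + parse_size(size)  # each zone starts where the last ended
        zones.append((int(perc), start, end))
        start = end
    return zones


def build_index(zones):
    """Build a 100-slot table so that slot v-1 gives the zone for draw v."""
    index = []
    for perc, start, end in zones:
        index.extend([(start, end)] * perc)
    if len(index) != 100:
        raise ValueError("access percentages must sum to 100 in this sketch")
    return index


def next_offset(index, rng, block_size=4096):
    """Pick a zone by access weight, then a uniform aligned offset inside it."""
    v = rng.randint(1, 100)  # 1..100, both inclusive
    start, end = index[v - 1]
    nblocks = (end - start) // block_size
    return start + rng.randrange(nblocks) * block_size


if __name__ == "__main__":
    zones = parse_zoned_abs("zoned_abs:60/100m:10/200m:30/700m")
    index = build_index(zones)
    rng = random.Random(0)
    print(zones)
    print([next_offset(index, rng) for _ in range(5)])
```

A zone with an access percentage of 0 simply gets no slots in the table, so
it is never selected, which matches the empty-zone behaviour discussed
earlier in the thread.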

diff --git a/fio.h b/fio.h
index 8ca934d14a4c..a44f1aae4721 100644
--- a/fio.h
+++ b/fio.h
@@ -158,6 +158,8 @@ void sk_out_drop(void);
 struct zone_split_index {
 	uint8_t size_perc;
 	uint8_t size_perc_prev;
+	uint64_t size;
+	uint64_t size_prev;
 };
 
 /*
@@ -813,6 +815,7 @@ enum {
 	FIO_RAND_DIST_PARETO,
 	FIO_RAND_DIST_GAUSS,
 	FIO_RAND_DIST_ZONED,
+	FIO_RAND_DIST_ZONED_ABS,
 };
 
 #define FIO_DEF_ZIPF		1.1
diff --git a/io_u.c b/io_u.c
index 81ee724b7357..0a4ba435fb65 100644
--- a/io_u.c
+++ b/io_u.c
@@ -157,6 +157,72 @@ static int __get_next_rand_offset_gauss(struct thread_data *td,
 	return 0;
 }
 
+static int __get_next_rand_offset_zoned_abs(struct thread_data *td,
+					    struct fio_file *f,
+					    enum fio_ddir ddir, uint64_t *b)
+{
+	struct zone_split_index *zsi;
+	uint64_t offset, lastb;
+	uint64_t send, stotal;
+	static int warned;
+	unsigned int v;
+
+	lastb = last_block(td, f, ddir);
+	if (!lastb)
+		return 1;
+
+	if (!td->o.zone_split_nr[ddir]) {
+bail:
+		return __get_next_rand_offset(td, f, ddir, b, lastb);
+	}
+
+	/*
+	 * Generate a value, v, between 1 and 100, both inclusive
+	 */
+	v = rand32_between(&td->zone_state, 1, 100);
+
+	zsi = &td->zone_state_index[ddir][v - 1];
+	stotal = zsi->size_prev / td->o.ba[ddir];
+	send = zsi->size / td->o.ba[ddir];
+
+	/*
+	 * Should never happen
+	 */
+	if (send == -1U) {
+		if (!warned) {
+			log_err("fio: bug in zoned generation\n");
+			warned = 1;
+		}
+		goto bail;
+	}
+
+	/*
+	 * 'send' marks the end of the current IO range, in
+	 * units of the block alignment. 'stotal' marks the
+	 * start, in the same units.
+	 */
+	if (stotal)
+		offset = stotal;
+	else
+		offset = 0;
+
+	lastb = send - stotal;
+
+	/*
+	 * Generate index from 0..send-of-lastb
+	 */
+	if (__get_next_rand_offset(td, f, ddir, b, lastb) == 1)
+		return 1;
+
+	/*
+	 * Add our start offset, if any
+	 */
+	if (offset)
+		*b += offset;
+
+	return 0;
+}
+
 static int __get_next_rand_offset_zoned(struct thread_data *td,
 					struct fio_file *f, enum fio_ddir ddir,
 					uint64_t *b)
@@ -249,6 +315,8 @@ static int get_off_from_method(struct thread_data *td, struct fio_file *f,
 		return __get_next_rand_offset_gauss(td, f, ddir, b);
 	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED)
 		return __get_next_rand_offset_zoned(td, f, ddir, b);
+	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED_ABS)
+		return __get_next_rand_offset_zoned_abs(td, f, ddir, b);
 
 	log_err("fio: unknown random distribution: %d\n", td->o.random_distribution);
 	return 1;
diff --git a/options.c b/options.c
index 4bea8f781304..b49674217e2b 100644
--- a/options.c
+++ b/options.c
@@ -54,16 +54,19 @@ static int bs_cmp(const void *p1, const void *p2)
 	return (int) bsp1->perc - (int) bsp2->perc;
 }
 
+#define SPLIT_MAX_ENTRY	100
+
 struct split {
 	unsigned int nr;
-	unsigned int val1[100];
-	unsigned int val2[100];
+	unsigned int val1[SPLIT_MAX_ENTRY];
+	unsigned long long val2[SPLIT_MAX_ENTRY];
 };
 
 static int split_parse_ddir(struct thread_options *o, struct split *split,
-			    enum fio_ddir ddir, char *str)
+			    enum fio_ddir ddir, char *str, bool absolute)
 {
-	unsigned int i, perc;
+	unsigned long long perc;
+	unsigned int i;
 	long long val;
 	char *fname;
 
@@ -80,23 +83,35 @@ static int split_parse_ddir(struct thread_options *o, struct split *split,
 		if (perc_str) {
 			*perc_str = '\0';
 			perc_str++;
-			perc = atoi(perc_str);
-			if (perc > 100)
-				perc = 100;
-			else if (!perc)
+			if (absolute) {
+				if (str_to_decimal(perc_str, &val, 1, o, 0, 0)) {
+					log_err("fio: split conversion failed\n");
+					return 1;
+				}
+				perc = val;
+			} else {
+				perc = atoi(perc_str);
+				if (perc > 100)
+					perc = 100;
+				else if (!perc)
+					perc = -1U;
+			}
+		} else {
+			if (absolute)
+				perc = 0;
+			else
 				perc = -1U;
-		} else
-			perc = -1U;
+		}
 
 		if (str_to_decimal(fname, &val, 1, o, 0, 0)) {
-			log_err("fio: bssplit conversion failed\n");
+			log_err("fio: split conversion failed\n");
 			return 1;
 		}
 
 		split->val1[i] = val;
 		split->val2[i] = perc;
 		i++;
-		if (i == 100)
+		if (i == SPLIT_MAX_ENTRY)
 			break;
 	}
 
@@ -104,7 +119,8 @@ static int split_parse_ddir(struct thread_options *o, struct split *split,
 	return 0;
 }
 
-static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
+static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str,
+			bool data)
 {
 	unsigned int i, perc, perc_missing;
 	unsigned int max_bs, min_bs;
@@ -112,7 +128,7 @@ static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
 
 	memset(&split, 0, sizeof(split));
 
-	if (split_parse_ddir(o, &split, ddir, str))
+	if (split_parse_ddir(o, &split, ddir, str, data))
 		return 1;
 	if (!split.nr)
 		return 0;
@@ -176,9 +192,10 @@ static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
 	return 0;
 }
 
-typedef int (split_parse_fn)(struct thread_options *, enum fio_ddir, char *);
+typedef int (split_parse_fn)(struct thread_options *, enum fio_ddir, char *, bool);
 
-static int str_split_parse(struct thread_data *td, char *str, split_parse_fn *fn)
+static int str_split_parse(struct thread_data *td, char *str,
+			   split_parse_fn *fn, bool data)
 {
 	char *odir, *ddir;
 	int ret = 0;
@@ -187,37 +204,37 @@ static int str_split_parse(struct thread_data *td, char *str, split_parse_fn *fn
 	if (odir) {
 		ddir = strchr(odir + 1, ',');
 		if (ddir) {
-			ret = fn(&td->o, DDIR_TRIM, ddir + 1);
+			ret = fn(&td->o, DDIR_TRIM, ddir + 1, data);
 			if (!ret)
 				*ddir = '\0';
 		} else {
 			char *op;
 
 			op = strdup(odir + 1);
-			ret = fn(&td->o, DDIR_TRIM, op);
+			ret = fn(&td->o, DDIR_TRIM, op, data);
 
 			free(op);
 		}
 		if (!ret)
-			ret = fn(&td->o, DDIR_WRITE, odir + 1);
+			ret = fn(&td->o, DDIR_WRITE, odir + 1, data);
 		if (!ret) {
 			*odir = '\0';
-			ret = fn(&td->o, DDIR_READ, str);
+			ret = fn(&td->o, DDIR_READ, str, data);
 		}
 	} else {
 		char *op;
 
 		op = strdup(str);
-		ret = fn(&td->o, DDIR_WRITE, op);
+		ret = fn(&td->o, DDIR_WRITE, op, data);
 		free(op);
 
 		if (!ret) {
 			op = strdup(str);
-			ret = fn(&td->o, DDIR_TRIM, op);
+			ret = fn(&td->o, DDIR_TRIM, op, data);
 			free(op);
 		}
 		if (!ret)
-			ret = fn(&td->o, DDIR_READ, str);
+			ret = fn(&td->o, DDIR_READ, str, data);
 	}
 
 	return ret;
@@ -234,7 +251,7 @@ static int str_bssplit_cb(void *data, const char *input)
 	strip_blank_front(&str);
 	strip_blank_end(str);
 
-	ret = str_split_parse(td, str, bssplit_ddir);
+	ret = str_split_parse(td, str, bssplit_ddir, false);
 
 	if (parse_dryrun()) {
 		int i;
@@ -824,14 +841,14 @@ static int str_sfr_cb(void *data, const char *str)
 #endif
 
 static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
-			   char *str)
+			   char *str, bool absolute)
 {
 	unsigned int i, perc, perc_missing, sperc, sperc_missing;
 	struct split split;
 
 	memset(&split, 0, sizeof(split));
 
-	if (split_parse_ddir(o, &split, ddir, str))
+	if (split_parse_ddir(o, &split, ddir, str, absolute))
 		return 1;
 	if (!split.nr)
 		return 0;
@@ -840,7 +857,10 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 	o->zone_split_nr[ddir] = split.nr;
 	for (i = 0; i < split.nr; i++) {
 		o->zone_split[ddir][i].access_perc = split.val1[i];
-		o->zone_split[ddir][i].size_perc = split.val2[i];
+		if (absolute)
+			o->zone_split[ddir][i].size = split.val2[i];
+		else
+			o->zone_split[ddir][i].size_perc = split.val2[i];
 	}
 
 	/*
@@ -856,11 +876,12 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 		else
 			perc += zsp->access_perc;
 
-		if (zsp->size_perc == (uint8_t) -1U)
-			sperc_missing++;
-		else
-			sperc += zsp->size_perc;
-
+		if (!absolute) {
+			if (zsp->size_perc == (uint8_t) -1U)
+				sperc_missing++;
+			else
+				sperc += zsp->size_perc;
+		}
 	}
 
 	if (perc > 100 || sperc > 100) {
@@ -908,10 +929,11 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
 {
 	unsigned int i, j, sprev, aprev;
+	uint64_t sprev_sz;
 
 	td->zone_state_index[ddir] = malloc(sizeof(struct zone_split_index) * 100);
 
-	sprev = aprev = 0;
+	sprev_sz = sprev = aprev = 0;
 	for (i = 0; i < td->o.zone_split_nr[ddir]; i++) {
 		struct zone_split *zsp = &td->o.zone_split[ddir][i];
 
@@ -920,10 +942,14 @@ static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
 
 			zsi->size_perc = sprev + zsp->size_perc;
 			zsi->size_perc_prev = sprev;
+
+			zsi->size = sprev_sz + zsp->size;
+			zsi->size_prev = sprev_sz;
 		}
 
 		aprev += zsp->access_perc;
 		sprev += zsp->size_perc;
+		sprev_sz += zsp->size;
 	}
 }
 
@@ -942,8 +968,10 @@ static void td_zone_gen_index(struct thread_data *td)
 		__td_zone_gen_index(td, i);
 }
 
-static int parse_zoned_distribution(struct thread_data *td, const char *input)
+static int parse_zoned_distribution(struct thread_data *td, const char *input,
+				    bool absolute)
 {
+	const char *pre = absolute ? "zoned_abs:" : "zoned:";
 	char *str, *p;
 	int i, ret = 0;
 
@@ -953,14 +981,14 @@ static int parse_zoned_distribution(struct thread_data *td, const char *input)
 	strip_blank_end(str);
 
 	/* We expect it to start like that, bail if not */
-	if (strncmp(str, "zoned:", 6)) {
+	if (strncmp(str, pre, strlen(pre))) {
 		log_err("fio: mismatch in zoned input <%s>\n", str);
 		free(p);
 		return 1;
 	}
-	str += strlen("zoned:");
+	str += strlen(pre);
 
-	ret = str_split_parse(td, str, zone_split_ddir);
+	ret = str_split_parse(td, str, zone_split_ddir, absolute);
 
 	free(p);
 
@@ -972,8 +1000,15 @@ static int parse_zoned_distribution(struct thread_data *td, const char *input)
 		for (j = 0; j < td->o.zone_split_nr[i]; j++) {
 			struct zone_split *zsp = &td->o.zone_split[i][j];
 
-			dprint(FD_PARSE, "\t%d: %u/%u\n", j, zsp->access_perc,
-								zsp->size_perc);
+			if (absolute) {
+				dprint(FD_PARSE, "\t%d: %u/%llu\n", j,
+						zsp->access_perc,
+						(unsigned long long) zsp->size);
+			} else {
+				dprint(FD_PARSE, "\t%d: %u/%u\n", j,
+						zsp->access_perc,
+						zsp->size_perc);
+			}
 		}
 	}
 
@@ -1012,7 +1047,9 @@ static int str_random_distribution_cb(void *data, const char *str)
 	else if (td->o.random_distribution == FIO_RAND_DIST_GAUSS)
 		val = 0.0;
 	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED)
-		return parse_zoned_distribution(td, str);
+		return parse_zoned_distribution(td, str, false);
+	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED_ABS)
+		return parse_zoned_distribution(td, str, true);
 	else
 		return 0;
 
@@ -2241,7 +2278,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			    .oval = FIO_RAND_DIST_ZONED,
 			    .help = "Zoned random distribution",
 			  },
-
+			  { .ival = "zoned_abs",
+			    .oval = FIO_RAND_DIST_ZONED_ABS,
+			    .help = "Zoned random absolute distribution",
+			  },
 		},
 		.category = FIO_OPT_C_IO,
 		.group	= FIO_OPT_G_RANDOM,
diff --git a/server.h b/server.h
index ba3abfeb3228..dbd5c277de6b 100644
--- a/server.h
+++ b/server.h
@@ -49,7 +49,7 @@ struct fio_net_cmd_reply {
 };
 
 enum {
-	FIO_SERVER_VER			= 66,
+	FIO_SERVER_VER			= 67,
 
 	FIO_SERVER_MAX_FRAGMENT_PDU	= 1024,
 	FIO_SERVER_MAX_CMD_MB		= 2048,
diff --git a/thread_options.h b/thread_options.h
index ca549b542703..050cd3822914 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -36,6 +36,8 @@ struct bssplit {
 struct zone_split {
 	uint8_t access_perc;
 	uint8_t size_perc;
+	uint8_t pad[6];
+	uint64_t size;
 };
 
 #define NR_OPTS_SZ	(FIO_MAX_OPTS / (8 * sizeof(uint64_t)))

-- 
Jens Axboe




* Re: Random distribution: zoned argument
  2017-11-30  1:37             ` Jens Axboe
@ 2017-11-30  2:16               ` Jens Axboe
  2017-11-30  2:30                 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2017-11-30  2:16 UTC (permalink / raw)
  To: Phillip Chen; +Cc: fio

On 11/29/2017 06:37 PM, Jens Axboe wrote:
> On 11/29/2017 01:58 PM, Jens Axboe wrote:
>> I'll take a look at adding absolute zoning, should be pretty
>> trivial.
> 
> Quick and dirty here, can you see if this works with your magic
> script?
> 
> Basically you just do:
> 
> random_distribution=zoned_abs:60/100m:10/200m:30/700m                           
> 
> like you would for 'zoned' - the first is a percentage, the
> other part is a size. So the above would be:
> 
> - 60% of accesses to the first 100M
> - 10% of accesses to the next 200M
> - 30% of accesses to the next 700M
> 
> No checking for whether or not we exceed device/file size or anything
> like that in this version, but everything else should work with the
> existing code pretty nicely. Not tested...

Added documentation, and modified your script to have:

dist_str = "zoned_abs:50/35770m:10/286166m:40/35770m"

for a 375G nvme drive I have, this yields:

histogram percents = [25.011592135525543, 25.014596077621864, 0.6177337336535484, 0.6206606515935528, 0.6198133858740779, 0.6175796853409167, 0.6187350476856551, 0.6266685357861933, 0.6370667968888403, 0.6393004974220015, 0.6279009222872477, 0.6302886711330408, 0.6230484004393458, 0.6274387773493523, 0.6131122842745942, 0.6258982942230342, 0.614036574150385, 0.6185809993730234, 19.971208370369116, 20.02474015900867]
histogram edges = [36629084.0, 73258168.0, 109887252.0, 146516336.0, 183145420.0, 219774504.0, 256403588.0, 293032672.0, 329661756.0, 366290840.0, 402919924.0, 439549008.0, 476178092.0, 512807176.0, 549436260.0, 586065344.0, 622694428.0, 659323512.0, 695952596.0, 732581680.0]
zoned_abs:50/35770m:10/286166m:40/35770m

which looks correct to me - 50% in the first two buckets, which is defined as
the first 10% of the drive. 40% in the last two buckets, which is the last 10%
of the drive. And the rest in the middle, which should add up to 10 (quick
eyeballing says it does).
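
The eyeballing can be replaced with a quick check. The Python sketch below
sums the histogram percentages quoted above per zone and compares each zone's
size against the total; the variable names are illustrative only and the
MiB interpretation of the 'm' suffix is an assumption.

```python
# Histogram percentages copied from the run above (20 equal-width buckets).
percents = [
    25.011592135525543, 25.014596077621864, 0.6177337336535484,
    0.6206606515935528, 0.6198133858740779, 0.6175796853409167,
    0.6187350476856551, 0.6266685357861933, 0.6370667968888403,
    0.6393004974220015, 0.6279009222872477, 0.6302886711330408,
    0.6230484004393458, 0.6274387773493523, 0.6131122842745942,
    0.6258982942230342, 0.614036574150385, 0.6185809993730234,
    19.971208370369116, 20.02474015900867,
]

# Zone sizes from zoned_abs:50/35770m:10/286166m:40/35770m, in MiB (assumed).
sizes_mib = [35770, 286166, 35770]
total_mib = sum(sizes_mib)
fractions = [s / total_mib for s in sizes_mib]

# Each bucket covers 5% of the drive, so the first and last zones (about 10%
# of the drive each) span two buckets, and the middle zone spans sixteen.
first = sum(percents[:2])
middle = sum(percents[2:18])
last = sum(percents[18:])

print("zone fractions of drive:", [round(f, 4) for f in fractions])
print("first zone: %.2f%% of I/O (target 50)" % first)
print("middle zone: %.2f%% of I/O (target 10)" % middle)
print("last zone: %.2f%% of I/O (target 40)" % last)
```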

Updated patch below. Only change is the added documentation, and a check
for exceeding device/file size.


diff --git a/HOWTO b/HOWTO
index 164ba2bbdea2..dc99e9989a47 100644
--- a/HOWTO
+++ b/HOWTO
@@ -1254,6 +1254,9 @@ I/O type
 		**zoned**
 				Zoned random distribution
 
+		**zoned_abs**
+				Zone absolute random distribution
+
 	When using a **zipf** or **pareto** distribution, an input value is also
 	needed to define the access pattern. For **zipf**, this is the `Zipf
 	theta`. For **pareto**, it's the `Pareto power`. Fio includes a test
@@ -1278,10 +1281,23 @@ I/O type
 
 		random_distribution=zoned:60/10:30/20:8/30:2/40
 
-	similarly to how :option:`bssplit` works for setting ranges and percentages
-	of block sizes. Like :option:`bssplit`, it's possible to specify separate
-	zones for reads, writes, and trims. If just one set is given, it'll apply to
-	all of them.
+	A **zoned_abs** distribution works exactly like the **zoned**, except
+	that it takes absolute sizes. For example, let's say you wanted to
+	define access according to the following criteria:
+
+		* 60% of accesses should be to the first 20G
+		* 30% of accesses should be to the next 100G
+		* 10% of accesses should be to the next 500G
+
+	we can define an absolute zoning distribution with:
+
+		random_distribution=zoned_abs:60/20G:30/100G:10/500G
+
+	Similarly to how :option:`bssplit` works for setting ranges and
+	percentages of block sizes. Like :option:`bssplit`, it's possible to
+	specify separate zones for reads, writes, and trims. If just one set
+	is given, it'll apply to all of them. This goes for both **zoned**
+	and **zoned_abs** distributions.
 
 .. option:: percentage_random=int[,int][,int]
 
diff --git a/fio.1 b/fio.1
index a4b0ea6af750..01b4db6f8c9e 100644
--- a/fio.1
+++ b/fio.1
@@ -1033,6 +1033,8 @@ Normal (Gaussian) distribution
 .TP
 .B zoned
 Zoned random distribution
+.B zoned_abs
+Zoned absolute random distribution
 .RE
 .P
 When using a \fBzipf\fR or \fBpareto\fR distribution, an input value is also
@@ -1068,7 +1070,27 @@ example, the user would do:
 random_distribution=zoned:60/10:30/20:8/30:2/40
 .RE
 .P
-similarly to how \fBbssplit\fR works for setting ranges and percentages
+A \fBzoned_abs\fR distribution works exactly like the \fBzoned\fR, except that
+it takes absolute sizes. For example, let's say you wanted to define access
+according to the following criteria:
+.RS
+.P
+.PD 0
+60% of accesses should be to the first 20G
+.P
+30% of accesses should be to the next 100G
+.P
+10% of accesses should be to the next 500G
+.PD
+.RE
+.P
+we can define an absolute zoning distribution with:
+.RS
+.P
+random_distribution=zoned_abs:60/20G:30/100G:10/500G
+.RE
+.P
+Similarly to how \fBbssplit\fR works for setting ranges and percentages
 of block sizes. Like \fBbssplit\fR, it's possible to specify separate
 zones for reads, writes, and trims. If just one set is given, it'll apply to
 all of them.
diff --git a/fio.h b/fio.h
index 8ca934d14a4c..a44f1aae4721 100644
--- a/fio.h
+++ b/fio.h
@@ -158,6 +158,8 @@ void sk_out_drop(void);
 struct zone_split_index {
 	uint8_t size_perc;
 	uint8_t size_perc_prev;
+	uint64_t size;
+	uint64_t size_prev;
 };
 
 /*
@@ -813,6 +815,7 @@ enum {
 	FIO_RAND_DIST_PARETO,
 	FIO_RAND_DIST_GAUSS,
 	FIO_RAND_DIST_ZONED,
+	FIO_RAND_DIST_ZONED_ABS,
 };
 
 #define FIO_DEF_ZIPF		1.1
diff --git a/io_u.c b/io_u.c
index 81ee724b7357..6ec04fa30607 100644
--- a/io_u.c
+++ b/io_u.c
@@ -157,6 +157,80 @@ static int __get_next_rand_offset_gauss(struct thread_data *td,
 	return 0;
 }
 
+static int __get_next_rand_offset_zoned_abs(struct thread_data *td,
+					    struct fio_file *f,
+					    enum fio_ddir ddir, uint64_t *b)
+{
+	struct zone_split_index *zsi;
+	uint64_t offset, lastb;
+	uint64_t send, stotal;
+	static int warned;
+	unsigned int v;
+
+	lastb = last_block(td, f, ddir);
+	if (!lastb)
+		return 1;
+
+	if (!td->o.zone_split_nr[ddir]) {
+bail:
+		return __get_next_rand_offset(td, f, ddir, b, lastb);
+	}
+
+	/*
+	 * Generate a value, v, between 1 and 100, both inclusive
+	 */
+	v = rand32_between(&td->zone_state, 1, 100);
+
+	zsi = &td->zone_state_index[ddir][v - 1];
+	stotal = zsi->size_prev / td->o.ba[ddir];
+	send = zsi->size / td->o.ba[ddir];
+
+	/*
+	 * Should never happen
+	 */
+	if (send == -1U) {
+		if (!warned) {
+			log_err("fio: bug in zoned generation\n");
+			warned = 1;
+		}
+		goto bail;
+	} else if (send > lastb) {
+		/*
+		 * This happens if the user specifies ranges that exceed
+		 * the file/device size. We can't handle that gracefully,
+		 * so error and exit.
+		 */
+		log_err("fio: zoned_abs sizes exceed file size\n");
+		return 1;
+	}
+
+	/*
+	 * 'send' marks the end of the current IO range, in
+	 * units of the block alignment. 'stotal' marks the
+	 * start, in the same units.
+	 */
+	if (stotal)
+		offset = stotal;
+	else
+		offset = 0;
+
+	lastb = send - stotal;
+
+	/*
+	 * Generate index from 0..send-of-lastb
+	 */
+	if (__get_next_rand_offset(td, f, ddir, b, lastb) == 1)
+		return 1;
+
+	/*
+	 * Add our start offset, if any
+	 */
+	if (offset)
+		*b += offset;
+
+	return 0;
+}
+
 static int __get_next_rand_offset_zoned(struct thread_data *td,
 					struct fio_file *f, enum fio_ddir ddir,
 					uint64_t *b)
@@ -249,6 +323,8 @@ static int get_off_from_method(struct thread_data *td, struct fio_file *f,
 		return __get_next_rand_offset_gauss(td, f, ddir, b);
 	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED)
 		return __get_next_rand_offset_zoned(td, f, ddir, b);
+	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED_ABS)
+		return __get_next_rand_offset_zoned_abs(td, f, ddir, b);
 
 	log_err("fio: unknown random distribution: %d\n", td->o.random_distribution);
 	return 1;
diff --git a/options.c b/options.c
index 4bea8f781304..d979f804d76c 100644
--- a/options.c
+++ b/options.c
@@ -54,16 +54,19 @@ static int bs_cmp(const void *p1, const void *p2)
 	return (int) bsp1->perc - (int) bsp2->perc;
 }
 
+#define SPLIT_MAX_ENTRY	100
+
 struct split {
 	unsigned int nr;
-	unsigned int val1[100];
-	unsigned int val2[100];
+	unsigned int val1[SPLIT_MAX_ENTRY];
+	unsigned long long val2[SPLIT_MAX_ENTRY];
 };
 
 static int split_parse_ddir(struct thread_options *o, struct split *split,
-			    enum fio_ddir ddir, char *str)
+			    enum fio_ddir ddir, char *str, bool absolute)
 {
-	unsigned int i, perc;
+	unsigned long long perc;
+	unsigned int i;
 	long long val;
 	char *fname;
 
@@ -80,23 +83,35 @@ static int split_parse_ddir(struct thread_options *o, struct split *split,
 		if (perc_str) {
 			*perc_str = '\0';
 			perc_str++;
-			perc = atoi(perc_str);
-			if (perc > 100)
-				perc = 100;
-			else if (!perc)
+			if (absolute) {
+				if (str_to_decimal(perc_str, &val, 1, o, 0, 0)) {
+					log_err("fio: split conversion failed\n");
+					return 1;
+				}
+				perc = val;
+			} else {
+				perc = atoi(perc_str);
+				if (perc > 100)
+					perc = 100;
+				else if (!perc)
+					perc = -1U;
+			}
+		} else {
+			if (absolute)
+				perc = 0;
+			else
 				perc = -1U;
-		} else
-			perc = -1U;
+		}
 
 		if (str_to_decimal(fname, &val, 1, o, 0, 0)) {
-			log_err("fio: bssplit conversion failed\n");
+			log_err("fio: split conversion failed\n");
 			return 1;
 		}
 
 		split->val1[i] = val;
 		split->val2[i] = perc;
 		i++;
-		if (i == 100)
+		if (i == SPLIT_MAX_ENTRY)
 			break;
 	}
 
@@ -104,7 +119,8 @@ static int split_parse_ddir(struct thread_options *o, struct split *split,
 	return 0;
 }
 
-static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
+static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str,
+			bool data)
 {
 	unsigned int i, perc, perc_missing;
 	unsigned int max_bs, min_bs;
@@ -112,7 +128,7 @@ static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
 
 	memset(&split, 0, sizeof(split));
 
-	if (split_parse_ddir(o, &split, ddir, str))
+	if (split_parse_ddir(o, &split, ddir, str, data))
 		return 1;
 	if (!split.nr)
 		return 0;
@@ -176,9 +192,10 @@ static int bssplit_ddir(struct thread_options *o, enum fio_ddir ddir, char *str)
 	return 0;
 }
 
-typedef int (split_parse_fn)(struct thread_options *, enum fio_ddir, char *);
+typedef int (split_parse_fn)(struct thread_options *, enum fio_ddir, char *, bool);
 
-static int str_split_parse(struct thread_data *td, char *str, split_parse_fn *fn)
+static int str_split_parse(struct thread_data *td, char *str,
+			   split_parse_fn *fn, bool data)
 {
 	char *odir, *ddir;
 	int ret = 0;
@@ -187,37 +204,37 @@ static int str_split_parse(struct thread_data *td, char *str, split_parse_fn *fn
 	if (odir) {
 		ddir = strchr(odir + 1, ',');
 		if (ddir) {
-			ret = fn(&td->o, DDIR_TRIM, ddir + 1);
+			ret = fn(&td->o, DDIR_TRIM, ddir + 1, data);
 			if (!ret)
 				*ddir = '\0';
 		} else {
 			char *op;
 
 			op = strdup(odir + 1);
-			ret = fn(&td->o, DDIR_TRIM, op);
+			ret = fn(&td->o, DDIR_TRIM, op, data);
 
 			free(op);
 		}
 		if (!ret)
-			ret = fn(&td->o, DDIR_WRITE, odir + 1);
+			ret = fn(&td->o, DDIR_WRITE, odir + 1, data);
 		if (!ret) {
 			*odir = '\0';
-			ret = fn(&td->o, DDIR_READ, str);
+			ret = fn(&td->o, DDIR_READ, str, data);
 		}
 	} else {
 		char *op;
 
 		op = strdup(str);
-		ret = fn(&td->o, DDIR_WRITE, op);
+		ret = fn(&td->o, DDIR_WRITE, op, data);
 		free(op);
 
 		if (!ret) {
 			op = strdup(str);
-			ret = fn(&td->o, DDIR_TRIM, op);
+			ret = fn(&td->o, DDIR_TRIM, op, data);
 			free(op);
 		}
 		if (!ret)
-			ret = fn(&td->o, DDIR_READ, str);
+			ret = fn(&td->o, DDIR_READ, str, data);
 	}
 
 	return ret;
@@ -234,7 +251,7 @@ static int str_bssplit_cb(void *data, const char *input)
 	strip_blank_front(&str);
 	strip_blank_end(str);
 
-	ret = str_split_parse(td, str, bssplit_ddir);
+	ret = str_split_parse(td, str, bssplit_ddir, false);
 
 	if (parse_dryrun()) {
 		int i;
@@ -824,14 +841,14 @@ static int str_sfr_cb(void *data, const char *str)
 #endif
 
 static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
-			   char *str)
+			   char *str, bool absolute)
 {
 	unsigned int i, perc, perc_missing, sperc, sperc_missing;
 	struct split split;
 
 	memset(&split, 0, sizeof(split));
 
-	if (split_parse_ddir(o, &split, ddir, str))
+	if (split_parse_ddir(o, &split, ddir, str, absolute))
 		return 1;
 	if (!split.nr)
 		return 0;
@@ -840,7 +857,10 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 	o->zone_split_nr[ddir] = split.nr;
 	for (i = 0; i < split.nr; i++) {
 		o->zone_split[ddir][i].access_perc = split.val1[i];
-		o->zone_split[ddir][i].size_perc = split.val2[i];
+		if (absolute)
+			o->zone_split[ddir][i].size = split.val2[i];
+		else
+			o->zone_split[ddir][i].size_perc = split.val2[i];
 	}
 
 	/*
@@ -856,11 +876,12 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 		else
 			perc += zsp->access_perc;
 
-		if (zsp->size_perc == (uint8_t) -1U)
-			sperc_missing++;
-		else
-			sperc += zsp->size_perc;
-
+		if (!absolute) {
+			if (zsp->size_perc == (uint8_t) -1U)
+				sperc_missing++;
+			else
+				sperc += zsp->size_perc;
+		}
 	}
 
 	if (perc > 100 || sperc > 100) {
@@ -908,10 +929,11 @@ static int zone_split_ddir(struct thread_options *o, enum fio_ddir ddir,
 static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
 {
 	unsigned int i, j, sprev, aprev;
+	uint64_t sprev_sz;
 
 	td->zone_state_index[ddir] = malloc(sizeof(struct zone_split_index) * 100);
 
-	sprev = aprev = 0;
+	sprev_sz = sprev = aprev = 0;
 	for (i = 0; i < td->o.zone_split_nr[ddir]; i++) {
 		struct zone_split *zsp = &td->o.zone_split[ddir][i];
 
@@ -920,10 +942,14 @@ static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
 
 			zsi->size_perc = sprev + zsp->size_perc;
 			zsi->size_perc_prev = sprev;
+
+			zsi->size = sprev_sz + zsp->size;
+			zsi->size_prev = sprev_sz;
 		}
 
 		aprev += zsp->access_perc;
 		sprev += zsp->size_perc;
+		sprev_sz += zsp->size;
 	}
 }
 
@@ -942,8 +968,10 @@ static void td_zone_gen_index(struct thread_data *td)
 		__td_zone_gen_index(td, i);
 }
 
-static int parse_zoned_distribution(struct thread_data *td, const char *input)
+static int parse_zoned_distribution(struct thread_data *td, const char *input,
+				    bool absolute)
 {
+	const char *pre = absolute ? "zoned_abs:" : "zoned:";
 	char *str, *p;
 	int i, ret = 0;
 
@@ -953,14 +981,14 @@ static int parse_zoned_distribution(struct thread_data *td, const char *input)
 	strip_blank_end(str);
 
 	/* We expect it to start like that, bail if not */
-	if (strncmp(str, "zoned:", 6)) {
+	if (strncmp(str, pre, strlen(pre))) {
 		log_err("fio: mismatch in zoned input <%s>\n", str);
 		free(p);
 		return 1;
 	}
-	str += strlen("zoned:");
+	str += strlen(pre);
 
-	ret = str_split_parse(td, str, zone_split_ddir);
+	ret = str_split_parse(td, str, zone_split_ddir, absolute);
 
 	free(p);
 
@@ -972,8 +1000,15 @@ static int parse_zoned_distribution(struct thread_data *td, const char *input)
 		for (j = 0; j < td->o.zone_split_nr[i]; j++) {
 			struct zone_split *zsp = &td->o.zone_split[i][j];
 
-			dprint(FD_PARSE, "\t%d: %u/%u\n", j, zsp->access_perc,
-								zsp->size_perc);
+			if (absolute) {
+				dprint(FD_PARSE, "\t%d: %u/%llu\n", j,
+						zsp->access_perc,
+						(unsigned long long) zsp->size);
+			} else {
+				dprint(FD_PARSE, "\t%d: %u/%u\n", j,
+						zsp->access_perc,
+						zsp->size_perc);
+			}
 		}
 	}
 
@@ -1012,7 +1047,9 @@ static int str_random_distribution_cb(void *data, const char *str)
 	else if (td->o.random_distribution == FIO_RAND_DIST_GAUSS)
 		val = 0.0;
 	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED)
-		return parse_zoned_distribution(td, str);
+		return parse_zoned_distribution(td, str, false);
+	else if (td->o.random_distribution == FIO_RAND_DIST_ZONED_ABS)
+		return parse_zoned_distribution(td, str, true);
 	else
 		return 0;
 
@@ -2241,7 +2278,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			    .oval = FIO_RAND_DIST_ZONED,
 			    .help = "Zoned random distribution",
 			  },
-
+			  { .ival = "zoned_abs",
+			    .oval = FIO_RAND_DIST_ZONED_ABS,
+			    .help = "Zoned absolute random distribution",
+			  },
 		},
 		.category = FIO_OPT_C_IO,
 		.group	= FIO_OPT_G_RANDOM,
diff --git a/server.h b/server.h
index ba3abfeb3228..dbd5c277de6b 100644
--- a/server.h
+++ b/server.h
@@ -49,7 +49,7 @@ struct fio_net_cmd_reply {
 };
 
 enum {
-	FIO_SERVER_VER			= 66,
+	FIO_SERVER_VER			= 67,
 
 	FIO_SERVER_MAX_FRAGMENT_PDU	= 1024,
 	FIO_SERVER_MAX_CMD_MB		= 2048,
diff --git a/thread_options.h b/thread_options.h
index ca549b542703..050cd3822914 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -36,6 +36,8 @@ struct bssplit {
 struct zone_split {
 	uint8_t access_perc;
 	uint8_t size_perc;
+	uint8_t pad[6];
+	uint64_t size;
 };
 
 #define NR_OPTS_SZ	(FIO_MAX_OPTS / (8 * sizeof(uint64_t)))

-- 
Jens Axboe
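
For readers who would rather not trace the C, the lookup that the patch
adds can be approximated in a few lines of Python. This is only a sketch
of the idea, not fio's code: build_index and next_block are hypothetical
names, sizes are plain byte counts, and the uniform pick inside a zone
stands in for __get_next_rand_offset().

```python
import random

MIB = 1024 * 1024


def build_index(zones):
    """Expand (access_percent, size_bytes) zones into a 100-entry table.

    Entry v - 1 holds the (start, end) byte range of the zone that a roll
    of v in 1..100 selects, similar in spirit to __td_zone_gen_index().
    """
    index = []
    start = 0
    for access_perc, size in zones:
        end = start + size
        # A zone with N% of the accesses owns N consecutive table slots.
        index.extend([(start, end)] * access_perc)
        start = end
    if len(index) != 100:
        raise ValueError("access percentages must add up to 100")
    return index


def next_block(index, block_size, rng):
    """Pick a block number: roll 1..100, then go uniform inside the zone."""
    v = rng.randint(1, 100)
    start, end = index[v - 1]
    first = start // block_size
    last = end // block_size
    if last <= first:
        raise ValueError("zone is smaller than one block")
    return first + rng.randrange(last - first)


# Hypothetical example: 60% of accesses in the first 100 MiB, and so on.
zones = [(60, 100 * MIB), (10, 200 * MIB), (30, 700 * MIB)]
index = build_index(zones)
rng = random.Random(0)
samples = [next_block(index, 4096, rng) for _ in range(1000)]
```

Zones with a 0% access share simply get no slots in the table, so they
act as gaps that are never selected.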




* Re: Random distribution: zoned argument
From: Jens Axboe @ 2017-11-30  2:30 UTC
  To: Phillip Chen; +Cc: fio

On 11/29/2017 07:16 PM, Jens Axboe wrote:
> On 11/29/2017 06:37 PM, Jens Axboe wrote:
>> On 11/29/2017 01:58 PM, Jens Axboe wrote:
>>> I'll take a look at adding absolute zoning, should be pretty
>>> trivial.
>>
>> Quick and dirty here, can you see if this works with your magic
>> script?
>>
>> Basically you just do:
>>
>> random_distribution=zoned_abs:60/100m:10/200m:30/700m
>>
>> like you would for 'zoned' - the first is a percentage, the
>> other part is a size. So the above would be:
>>
>> - 60% of accesses to the first 100M
>> - 10% of accesses to the next 200M
>> - 30% of accesses to the next 700M
>>
>> No checking for whether or not we exceed device/file size or anything
>> like that in this version, but everything else should work with the
>> existing code pretty nicely. Not tested...
> 
> Added documentation, and modified your script to have:
>
> dist_str = "zoned_abs:50/35770m:10/286166m:40/35770m"
> 
> for a 375G nvme drive I have, this yields:
> 
> histogram percents = [25.011592135525543, 25.014596077621864, 0.6177337336535484, 0.6206606515935528, 0.6198133858740779, 0.6175796853409167, 0.6187350476856551, 0.6266685357861933, 0.6370667968888403, 0.6393004974220015, 0.6279009222872477, 0.6302886711330408, 0.6230484004393458, 0.6274387773493523, 0.6131122842745942, 0.6258982942230342, 0.614036574150385, 0.6185809993730234, 19.971208370369116, 20.02474015900867]
> histogram edges = [36629084.0, 73258168.0, 109887252.0, 146516336.0, 183145420.0, 219774504.0, 256403588.0, 293032672.0, 329661756.0, 366290840.0, 402919924.0, 439549008.0, 476178092.0, 512807176.0, 549436260.0, 586065344.0, 622694428.0, 659323512.0, 695952596.0, 732581680.0]
> zoned_abs:50/35770m:10/286166m:40/35770m
> 
> which looks correct to me: 50% in the first two buckets, which cover the
> first 10% of the drive, and 40% in the last two buckets, which cover the
> last 10%. The rest lands in the middle and should add up to 10% (quick
> eyeballing says it does).
> 
> Updated patch below. Only change is the added documentation, and a check
> for exceeding device/file size.

Did some basic testing, and it seems fine to me. I've committed it, so it
may be easier for you to just update your git tree and run with that.

Do let me know when you test it and what your findings are, so we can
ensure that it works the way it should for all cases.
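
The eyeballed check on the histogram can also be done as arithmetic. The
sketch below is not part of fio or of Phillip's script; parse_zoned_abs
and expected_histogram are hypothetical helpers. It assumes the 'm'
suffix means MiB and that the zones together span the whole range being
histogrammed, then spreads each zone's access share evenly over the
equal-width buckets it overlaps.

```python
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def parse_size(text):
    """Turn '100m' or '1879048192' into a byte count (assumed 1024-based)."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def parse_zoned_abs(spec):
    """Split 'zoned_abs:P/SIZE:P/SIZE...' into (percent, bytes) tuples."""
    prefix = "zoned_abs:"
    if not spec.startswith(prefix):
        raise ValueError("expected a zoned_abs: specification")
    zones = []
    for entry in spec[len(prefix):].split(":"):
        perc, size = entry.split("/")
        zones.append((int(perc), parse_size(size)))
    return zones


def expected_histogram(zones, buckets=20):
    """Expected access percentage per equal-width bucket."""
    total = sum(size for _, size in zones)
    width = total / buckets
    result = [0.0] * buckets
    start = 0
    for perc, size in zones:
        end = start + size
        if size:
            for i in range(buckets):
                lo, hi = i * width, (i + 1) * width
                overlap = max(0.0, min(end, hi) - max(start, lo))
                # Accesses are uniform inside a zone, so a bucket gets a
                # share proportional to how much of the zone it covers.
                result[i] += perc * overlap / size
        start = end
    return result


hist = expected_histogram(
    parse_zoned_abs("zoned_abs:50/35770m:10/286166m:40/35770m"))
```

With these inputs the first two buckets come out close to 25% each, the
sixteen middle buckets at about 0.625% each, and the last two close to
20% each, which is in line with the measured percentages quoted above.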
 

-- 
Jens Axboe




* Re: Random distribution: zoned argument
From: Phillip Chen @ 2017-11-30 21:19 UTC
  To: Jens Axboe; +Cc: fio

It looks like it's working great! I haven't found any problems in any
of the cases I've tried, and I'll let you know if I find any issues as
I work on more test cases.
Thanks again for the help and quick turn-around,
Phillip Chen




* Re: Random distribution: zoned argument
From: Jens Axboe @ 2017-11-30 21:22 UTC
  To: Phillip Chen; +Cc: fio

On 11/30/2017 02:19 PM, Phillip Chen wrote:
> It looks like it's working great! I haven't found any problems in any
> of the cases I've tried, and I'll let you know if I find any issues as
> I work on more test cases.

Great, thanks for testing!

-- 
Jens Axboe




* Re: Random distribution: zoned argument
From: Phillip Chen @ 2017-11-30 23:41 UTC
  To: Jens Axboe; +Cc: fio

It appears that there is a limit of 64 zones. I tried to
use 200 zones (100 empty and 100 with 1% each) and got the following
error: "fio: access percentage don't add up to 100 for zoned random
distribution (got=32)". Would it be possible to raise the limit to
256 zones? 200 is the most anyone could use, since 1% is the smallest
access share a zone can have.
Here's the string I was using:
fio --name=rand_reads --ioengine=libaio --direct=1 --exitall --thread
--filename=/dev/sde --runtime=30 --readwrite=randread --iodepth=1
--random_distribution=zoned_abs:0/1879048192:1/256m:0/44023414784:1/256m:0/2415919104:1/256m:0/4563402752:1/256m:0/2415919104:1/256m:0/116500987904:1/256m:0/18253611008:1/256m:0/107642617856:1/256m:0/82946555904:1/256m:0/34359738368:1/256m:0/53687091200:1/256m:0/98247376896:1/256m:0/74088185856:1/256m:0/28185722880:1/256m:0/28722593792:1/256m:0/2415919104:1/256m:0/27380416512:1/256m:0/116769423360:1/256m:0/27380416512:1/256m:0/24159191040:1/256m:0/3221225472:1/256m:0/33554432000:1/256m:0/63619203072:1/256m:0/13958643712:1/256m:0/37312528384:1/256m:0/8589934592:1/256m:0/53687091200:1/256m:0/36507222016:1/256m:0/48586817536:1/256m:0/3489660928:1/256m:0/86436216832:1/256m:0/70866960384:1/256m:0/163477192704:1/256m:0/96099893248:1/256m:0/17985175552:1/256m:0/22817013760:1/256m:0/30064771072:1/256m:0/15300820992:1/256m:0/61740154880:1/256m:0/16911433728:1/256m:0/64961380352:1/256m:0/21206401024:1/256m:0/30870077440:1/256m:0/49660559360:1/256m:0/47513075712:1/256m:0/6710886400:1/256m:0/5637144576:1/256m:0/1879048192:1/256m:0/71940702208:1/256m:0/34896609280:1/256m:0/25232932864:1/256m:0/42949672960:1/256m:0/12079595520:1/256m:0/58787364864:1/256m:0/11005853696:1/256m:0/31943819264:1/256m:0/15837691904:1/256m:0/76772540416:1/256m:0/24427626496:1/256m:0/16642998272:1/256m:0/4831838208:1/256m:0/17179869184:1/256m:0/34628173824:1/256m:0/70330089472:1/256m:0/20937965568:1/256m:0/21474836480:1/256m:0/22548578304:1/256m:0/8321499136:1/256m:0/87509958656:1/256m:0/33017561088:1/256m:0/2952790016:1/256m:0/2415919104:1/256m:0/42949672960:1/256m:0/79725330432:1/256m:0/48586817536:1/256m:0/4563402752:1/256m:0/5905580032:1/256m:0/20669530112:1/256m:0/17179869184:1/256m:0/4563402752:1/256m:0/121064390656:1/256m:0/41875931136:1/256m:0/63082332160:1/256m:0/13958643712:1/256m:0/17985175552:1/256m:0/46707769344:1/256m:0/1342177280:1/256m:0/23085449216:1/256m:0/38654705664:1/256m:0/47244640256:1/256m:0/5100273664:1/256m:0/77846282240:1/256m:0/17179869184:1/256m:0/18522046464:1/256m:0/40533753856:1/256m:0/83483426816:1/256m:0/1342177280:1/256m:0/61471719424:1/256m:0/61740154880:1/256m:0/100126425088:1/256m
--output-format=terse

Thanks,
Phillip Chen
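
The "got=32" in that error is consistent with only the first 64 entries
being counted. A small sketch of the arithmetic, assuming the parser
simply stops after the limit (the thread confirms the 64-zone limit but
not the exact mechanism); access_sum is a hypothetical helper and the
sizes are placeholders:

```python
def access_sum(entries, max_zones):
    """Sum the access percentages of the entries that fit under the limit."""
    return sum(perc for perc, _size in entries[:max_zones])


# 200 zones alternating between an empty gap (0%) and a 1% zone,
# mirroring the shape of the command line above.
entries = [(0, 1), (1, 1)] * 100

seen_with_64 = access_sum(entries, 64)
seen_with_256 = access_sum(entries, 256)
```

Half of the first 64 entries are 0% gaps, so only 32 of the 1% zones are
seen and the total falls short of 100. With a limit of 256 all 200
entries fit and the shares add up to 100.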




* Re: Random distribution: zoned argument
From: Jens Axboe @ 2017-11-30 23:47 UTC
  To: Phillip Chen; +Cc: fio

On 11/30/2017 04:41 PM, Phillip Chen wrote:
> It appears that there is a limitation of 64 maximum zones. I tried to
> use 200 zones (100 empty and 100 with 1%) and I got the following
> error: "fio: access percentage don't add up to 100 for zoned random
> distribution (got=32)". Would it be possible to extend the number of
> zones up to 256 (since 200 is the maximum that you'd be able to use
> since the percentage is the smallest chance usable)?

That's correct, there's an imposed limit of 64 zones. The only issue with
lifting that limit is that it severely bumps the size of the packed
variant of the thread options. Going from 64 to 256 would make that
about 9k larger. Just an on-wire thing, so not a huge concern.

I'll make the change.
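
The "about 9k" figure can be reproduced from the struct zone_split layout
in the patch. The sketch below assumes the packed options carry one
fixed-size array of entries per data direction (read, write, trim); that
layout is an assumption made for illustration, and extra_bytes is a
hypothetical helper.

```python
import struct

# Layout of struct zone_split after the patch: two uint8_t fields,
# six padding bytes, then one uint64_t.
ENTRY_SIZE = struct.calcsize("=BB6xQ")

# Assumption: one array of entries per data direction (read, write, trim).
DATA_DIRECTIONS = 3


def extra_bytes(old_max, new_max):
    """Bytes added when the per-direction zone limit grows."""
    return (new_max - old_max) * ENTRY_SIZE * DATA_DIRECTIONS


growth = extra_bytes(64, 256)
```

Each entry is 16 bytes, so 192 extra entries across three directions come
to 9216 bytes, which matches the estimate above.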

-- 
Jens Axboe



* Re: Random distribution: zoned argument
From: Jens Axboe @ 2017-11-30 23:48 UTC
  To: Phillip Chen; +Cc: fio

On 11/30/2017 04:47 PM, Jens Axboe wrote:
> On 11/30/2017 04:41 PM, Phillip Chen wrote:
>> It appears that there is a limitation of 64 maximum zones. I tried to
>> use 200 zones (100 empty and 100 with 1%) and I got the following
>> error: "fio: access percentage don't add up to 100 for zoned random
>> distribution (got=32)". Would it be possible to extend the number of
>> zones up to 256 (since 200 is the maximum that you'd be able to use
>> since the percentage is the smallest chance usable)?
> 
> That's correct, there's an imposed limit of 64 zones. The only issue with
> lifting that limit is that it severely bumps the size of the packed
> variant of the thread options. Going from 64 to 256 would make that
> about 9k larger. Just an on-wire thing, so not a huge concern.

If you git pull, it should support up to 256.

-- 
Jens Axboe



* Re: Random distribution: zoned argument
From: Phillip Chen @ 2017-12-01  0:04 UTC

Excellent, thank you for the change,
Phillip Chen


