public inbox for fio@vger.kernel.org
* Evenly distribute jobs and iodepth over a 1 TiB device so that every byte is written to in parallel
@ 2025-07-15  5:17 Thomas Glanzmann
  2025-07-15 20:44 ` Sitsofe Wheeler
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Glanzmann @ 2025-07-15  5:17 UTC (permalink / raw)
  To: fio

Hello,
I have a 1 TiB NVMe namespace from a NetApp, connected to a Linux system
over NVMe/TCP via two distinct direct links. I would like to generate read
and write I/O with multiple jobs and a deep iodepth so that every byte of
the device is written to in parallel, with the maximum number of available
inflight I/Os. The NetApp does deduplication and compression by default, so
I want to generate random data: my thinking is that without refill_buffers,
the NetApp would notice the same data being written over and over again and
dedupe it. I tried:

fio --ioengine=libaio --refill_buffers --filesize=25G --ramp_time=2s \
--runtime=1m --numjobs=40 --direct=1 --verify=0 --randrepeat=0 \
--group_reporting --filename=/dev/nvme0n1 --name=1mhqd --blocksize=1m \
--iodepth=1638 --readwrite=write
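On the dedup point, the fio documentation also describes knobs that shape
buffer contents explicitly; whether the NetApp still finds anything to
dedupe is an assumption on my part, but a job-file fragment like this pins
the behaviour down:

```ini
; job-file fragment; option names are from the fio docs, the effect on
; the NetApp's dedup/compression engine is an assumption on my part
refill_buffers=1             ; fresh random data for every submit
buffer_compress_percentage=0 ; keep buffers incompressible
dedupe_percentage=0          ; no deliberately duplicated blocks (default)
```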

The Linux system has 40 hyperthreads and a Mellanox 2x 25 Gbit/s card
hooked up to the NetApp:

(live) [~] ip -br a s
...
eth6             UP             192.168.0.100/24
eth7             UP             192.168.1.100/24

(live) [~] nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.e0a0273a60b711f09deed039ead647e8:subsystem.svm1_subsystem_553
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:20f011e6-9ab8-584f-abb0-a260d2d685c4
\
 +- nvme0 tcp traddr=192.168.0.2,trsvcid=4420,src_addr=192.168.0.100 live optimized
 +- nvme1 tcp traddr=192.168.1.2,trsvcid=4420,src_addr=192.168.1.100 live optimized

na2501::*> network interface show
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
...
svm1
            lif_svm1_2660 up/up   192.168.1.2/24     na2501-02     e4c     true
            lif_svm1_9354 up/up   192.168.0.2/24     na2501-01     e4c     true

When I run the above command, the NetApp only reports a few hundred GiB of
physically allocated space:

na2501::*> aggr show -fields physical-used
aggregate      physical-used
-------------- -------------
dataFA_4_p0_i1 169.5GB

So, I ran:

(live) [~] pv < /dev/urandom > /dev/nvme0n1
1.00TiB 0:59:14 [ 294MiB/s] [======================>] 100%

And afterwards more physical space was used:

na2501::*> aggr show -fields physical-used
aggregate      physical-used
-------------- -------------
dataFA_4_p0_i1 1.15TB

So, what is the best way to use fio to write random data to every byte of this
1 TiB device in parallel?

	- Is there a command line parameter?
	- Or should I create 40 partitions of 25.6 GiB each (1024/40) and pass
	  them to fio as a colon-separated list?
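For the partitioning question, one approach I can imagine (a sketch only,
untested here; the iodepth value is a placeholder) is to keep a single job
section and let fio carve the device into disjoint regions via size plus
offset_increment:

```ini
; Sketch of a job file, untested on this array. With numjobs=40, each
; clone gets offset = job_index * offset_increment, so the jobs cover
; 40 disjoint ~25.6 GiB regions. 26214m * 40 leaves the last ~16 MiB
; of the 1 TiB namespace unwritten due to rounding.
[global]
ioengine=libaio
direct=1
refill_buffers=1
randrepeat=0
group_reporting=1
filename=/dev/nvme0n1
blocksize=1m
iodepth=32
readwrite=write

[fill]
numjobs=40
size=26214m
offset_increment=26214m
```

Saved e.g. as fill.job and run with 'fio fill.job', this would avoid
creating real partitions, with each job staying inside its own region.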

I would also like to determine the number of queues and the queue depth.
Is there a command available for that? When I run:

fio --ioengine=libaio --refill_buffers --filesize=8G --ramp_time=2s \
--runtime=1m --numjobs=40 --direct=1 --verify=0 --randrepeat=0 \
--group_reporting --filename=/dev/nvme0n1 --name=4khqd --blocksize=4k \
--iodepth=1638 --readwrite=randwrite

and also watch 'iostat -xm 2', I can see that aqu-sz is 194.87 per path and
391.96 for the multipathed device nvme0n1. So I have a rough idea already,
but I would like a Linux command that shows me the available queues and
queue depths directly.
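Not a single command that I know of, but a few places I would look (the
feature ID is from the NVMe spec, the sysfs paths from the nvme driver;
none of this is verified on this particular setup):

```shell
# Number of I/O queues the controller granted (NVMe feature 0x07)
nvme get-feature /dev/nvme0 --feature-id=0x07 --human-readable

# Per-queue request depth as seen by the block layer
cat /sys/block/nvme0n1/queue/nr_requests

# Submission queue size of each controller, if this kernel exposes it
cat /sys/class/nvme/nvme0/sqsize
```

For NVMe/TCP the I/O queue count can also be capped at connect time with
'nvme connect --nr-io-queues'.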

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.67    0.00    6.48   85.13    0.00    6.72

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
nvme0c0n1        0.00      0.00     0.00   0.00    0.00     0.00 60855.00    237.71     0.00   0.00    3.20     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00  194.87 100.00
nvme0c1n1        0.00      0.00     0.00   0.00    0.00     0.00 58719.00    229.37     0.00   0.00    3.32     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00  194.94 100.00
nvme0n1          0.00      0.00     0.00   0.00    0.00     0.00 119570.50    467.07     0.00   0.00    3.28     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00  391.96 100.00
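The aqu-sz values above already line up with Little's law (requests in
flight ~ IOPS x mean latency), which makes a handy cross-check even without
a dedicated tool; taking w/s and w_await from the nvme0n1 line:

```shell
# Little's law with the iostat figures above: 119570.5 writes/s at a
# mean latency of 3.28 ms gives ~392 requests in flight, matching the
# reported aqu-sz of 391.96.
awk 'BEGIN { printf "%.1f\n", 119570.5 * 3.28 / 1000 }'
# prints 392.2
```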

Cheers,
        Thomas


end of thread, other threads:[~2025-07-17  7:52 UTC | newest]

2025-07-15  5:17 Evenly distribute jobs and iodepth over a 1 TiB device so that every byte is written to in parallel Thomas Glanzmann
2025-07-15 20:44 ` Sitsofe Wheeler
2025-07-17  7:52   ` Thomas Glanzmann
