* bad IOPS when running multiple btest/fio in parallel
@ 2018-10-12  4:44 Yao Lin
  2018-10-12 14:39 ` Keith Busch
  2018-10-12 15:49 ` Bart Van Assche
  0 siblings, 2 replies; 8+ messages in thread
From: Yao Lin @ 2018-10-12  4:44 UTC (permalink / raw)


Today I changed to a much simpler setup and the same issue persists.

I directly connected 2 PCs (identical hardware) with a pair of 100G rNICs, created a null block device on the target PC, and configured it as the NVMeOF target. So there is no switch or SSD in this setup. And this is a single fio instance, not the 4 fio instances in parallel I mentioned earlier.

When I run a fio test against that null block device from the host, the best IOPS is 1550K. That's the best result after trying many different QD, number of jobs, and CPU affinity settings. Running the same fio test on the target itself, I get 2250K IOPS (it jumps to 3650K when I increase the number of threads).

So it seems to me that the Linux NVMe stack is quite good and can support 100Gb/s+ throughput. But the same cannot be said of the NVMeOF stack. Is any tuning possible?
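
The "100Gb/s+" reading can be checked with a quick conversion from IOPS to payload throughput. This is only a sketch: the message does not state the block size of this fio run, so 4 KiB (the bs=4K used in the earlier btest runs) is an assumption, and protocol overhead on the wire is ignored. The function name and labels are illustrative.

```python
def iops_to_gbps(iops: float, block_bytes: int = 4096) -> float:
    """Payload throughput in Gb/s (10^9 bits/s) for a given IOPS figure.

    block_bytes defaults to 4096 (4 KiB), which is an ASSUMPTION here:
    the block size of this particular fio run is not stated.
    """
    return iops * block_bytes * 8 / 1e9


# IOPS figures quoted in the message above.
results = {
    "host, over NVMeOF": 1_550_000,
    "target, local": 2_250_000,
    "target, local, more threads": 3_650_000,
}

for label, iops in results.items():
    print(f"{label}: {iops_to_gbps(iops):.1f} Gb/s")
```

Under the 4 KiB assumption this gives roughly 50.8, 73.7, and 119.6 Gb/s, so only the local run with more threads exceeds the 100G line rate, while the fabric run sits at about half of it.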

* bad IOPS when running multiple btest/fio in parallel
@ 2018-10-10 21:52 Yao Lin
  2018-10-15  7:55 ` Sagi Grimberg
  0 siblings, 1 reply; 8+ messages in thread
From: Yao Lin @ 2018-10-10 21:52 UTC (permalink / raw)


Host: Ubuntu 18.04 (4.15 kernel), i9-7940X (14C/28T) with 32G DRAM and a single-port 100G rNIC. No OFED driver is installed.

1. When I insert 4 Intel Optane 905P drives into that host and run 4 btest instances in parallel (one per Optane, random read, bs=4K, 6 threads, qd=32), I get an aggregate IOPS of 2380K.
2. Then I move those 4 Optane drives into 4 NVMeOF targets (RoCEv2). Each target has a 25G rNIC. All four 25G rNICs and the 100G rNIC are connected to a switch.
3. Starting iperf from all 4 targets toward the host, the aggregate throughput is 92Gbps. So the data path between the host and the targets is clean.
4. From the host, I use "nvme connect" to link up with all 4 targets.
5. Running btest against each target one at a time (non-overlapping), IOPS is around 595K each. So this is good.
6. Running 4 btest instances in parallel (one per target) is basically the same as #1, except it's now over the fabric. But the aggregate IOPS is only 1500K. Assigning CPU affinity so that each btest gets an exclusive 3C/6T doesn't help. Replacing btest with fio doesn't help either.
7. Replacing that 100G rNIC with a model from a different vendor and repeating test #6, the aggregate IOPS is better, but it's still nowhere close to the expected 2380K IOPS.
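
The gap described in steps 5 and 6 can be put into numbers with a short calculation. This is a rough sketch using only the figures quoted above; the constant and function names are made up for illustration, bs=4K is taken as 4096 bytes, and only payload bytes are counted (no RoCE or NVMeOF protocol overhead).

```python
PER_TARGET_IOPS = 595_000   # step 5: one target at a time
TARGETS = 4
OBSERVED_IOPS = 1_500_000   # step 6: all four targets in parallel
BLOCK_BYTES = 4096          # bs=4K, taken as 4 KiB
IPERF_GBPS = 92.0           # step 3: aggregate iperf throughput


def scaling_efficiency(observed: float, per_unit: float, units: int) -> float:
    """Observed aggregate as a fraction of perfect linear scaling."""
    return observed / (per_unit * units)


def payload_gbps(iops: float, block_bytes: int) -> float:
    """Payload throughput in Gb/s (10^9 bits/s), ignoring protocol overhead."""
    return iops * block_bytes * 8 / 1e9


expected_iops = PER_TARGET_IOPS * TARGETS
efficiency = scaling_efficiency(OBSERVED_IOPS, PER_TARGET_IOPS, TARGETS)

print(f"expected aggregate: {expected_iops} IOPS")
print(f"scaling efficiency: {efficiency:.1%}")
print(f"observed payload:   {payload_gbps(OBSERVED_IOPS, BLOCK_BYTES):.1f} Gb/s")
print(f"expected payload:   {payload_gbps(expected_iops, BLOCK_BYTES):.1f} Gb/s")
print(f"iperf aggregate:    {IPERF_GBPS:.1f} Gb/s")
```

With these inputs the expected aggregate is 2380K IOPS (matching the local result in step 1), the parallel run reaches about 63% of it, and even the expected payload of about 78 Gb/s stays below the 92Gbps measured with iperf, which is consistent with the link itself not being the limit.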

So I am wondering if there is any known limitation in the Linux inbox NVMeOF driver regarding support for multiple sessions in parallel. Is there any tuning I can do?

Thanks,
Yao


Thread overview: 8+ messages
2018-10-12  4:44 bad IOPS when running multiple btest/fio in parallel Yao Lin
2018-10-12 14:39 ` Keith Busch
2018-10-12 15:37   ` [EXT] " Yao Lin
2018-10-12 15:49 ` Bart Van Assche
2018-10-12 16:02   ` [EXT] " Yao Lin
2018-10-15  7:50     ` Sagi Grimberg
  -- strict thread matches above, loose matches on Subject: below --
2018-10-10 21:52 Yao Lin
2018-10-15  7:55 ` Sagi Grimberg
