From: Ali Gholami Rudi <aligrudi@gmail.com>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: Xiao Ni <xni@redhat.com>,
linux-raid@vger.kernel.org, song@kernel.org,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: Unacceptably Poor RAID1 Performance with Many CPU Cores
Date: Mon, 19 Jun 2023 00:00:51 +0330
Message-ID: <20231906000051@laper.mirepesht>
In-Reply-To: <cbc45f91-c341-2207-b3ec-81701a8651b5@huaweicloud.com>
Hi,
I tested raid10 with NVMe disks on Debian 12 (Linux 6.1.0), whose
kernel includes only your first patch.
The throughput is very poor:
READ: IOPS=360K BW=1412MiB/s
WRITE: IOPS=154K BW= 606MiB/s
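(The fio job file is not reproduced here; a 4 KiB block size is my
assumption, but it is the value the reported numbers are consistent
with, since bandwidth = IOPS * block size:)

```shell
# Sanity-check the fio results: bandwidth should equal IOPS times the
# block size.  bs=4096 is an assumption; the job file is not shown.
read_iops=360000
write_iops=154000
bs=4096   # bytes

awk -v iops="$read_iops" -v bs="$bs" \
    'BEGIN { printf "READ:  %.0f MiB/s\n", iops * bs / 1048576 }'
awk -v iops="$write_iops" -v bs="$bs" \
    'BEGIN { printf "WRITE: %.0f MiB/s\n", iops * bs / 1048576 }'
```

This gives roughly 1406 MiB/s read and 602 MiB/s write, close to the
1412/606 MiB/s fio reported.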
Perf's output:
+ 98.90% 0.00% fio [unknown] [.] 0xffffffffffffffff
+ 98.71% 0.00% fio fio [.] 0x0000563ae0f62117
+ 97.69% 0.02% fio [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe
+ 97.66% 0.02% fio [kernel.kallsyms] [k] do_syscall_64
+ 97.29% 0.00% fio fio [.] 0x0000563ae0f5fceb
+ 97.29% 0.05% fio fio [.] td_io_queue
+ 97.20% 0.01% fio fio [.] td_io_commit
+ 97.20% 0.02% fio libc.so.6 [.] syscall
+ 96.94% 0.05% fio libaio.so.1.0.2 [.] io_submit
+ 96.94% 0.00% fio fio [.] 0x0000563ae0f84e5e
+ 96.50% 0.02% fio [kernel.kallsyms] [k] __x64_sys_io_submit
- 96.44% 0.03% fio [kernel.kallsyms] [k] io_submit_one
- 96.41% io_submit_one
- 65.16% aio_read
- 65.07% xfs_file_read_iter
- 65.06% xfs_file_dio_read
- 60.21% iomap_dio_rw
- 60.21% __iomap_dio_rw
- 49.84% iomap_dio_bio_iter
- 49.39% submit_bio_noacct_nocheck
- 49.08% __submit_bio
- 48.80% md_handle_request
- 48.40% raid10_make_request
- 48.14% raid10_read_request
- 47.63% regular_request_wait
- 47.62% wait_barrier
- 44.17% _raw_spin_lock_irq
44.14% native_queued_spin_lock_slowpath
- 2.39% schedule
- 2.38% __schedule
+ 1.99% pick_next_task_fair
- 9.78% iomap_iter
- 9.77% xfs_read_iomap_begin
- 9.30% xfs_ilock_for_iomap
- 9.29% down_read
- 9.18% rwsem_down_read_slowpath
- 4.67% schedule_preempt_disabled
- 4.67% schedule
- 4.67% __schedule
- 4.08% pick_next_task_fair
- 4.08% newidle_balance
- 3.94% load_balance
- 3.60% find_busiest_group
3.59% update_sd_lb_stats.constprop.0
- 4.12% _raw_spin_lock_irq
4.11% native_queued_spin_lock_slowpath
+ 4.56% touch_atime
- 31.12% aio_write
- 31.06% xfs_file_write_iter
- 31.00% xfs_file_dio_write_aligned
- 27.41% iomap_dio_rw
- 27.40% __iomap_dio_rw
- 23.29% iomap_dio_bio_iter
- 23.14% submit_bio_noacct_nocheck
- 23.11% __submit_bio
- 23.02% md_handle_request
- 22.85% raid10_make_request
- 20.45% regular_request_wait
- 20.44% wait_barrier
- 18.97% _raw_spin_lock_irq
18.96% native_queued_spin_lock_slowpath
- 1.02% schedule
- 1.02% __schedule
- 0.85% pick_next_task_fair
+ 0.84% newidle_balance
+ 1.85% md_bitmap_startwrite
- 3.20% iomap_iter
- 3.19% xfs_direct_write_iomap_begin
- 3.00% xfs_ilock_for_iomap
- 2.99% down_read
- 2.95% rwsem_down_read_slowpath
+ 1.70% schedule_preempt_disabled
+ 1.13% _raw_spin_lock_irq
+ 0.81% blk_finish_plug
+ 3.47% xfs_file_write_checks
+ 87.62% 0.01% fio [kernel.kallsyms] [k] iomap_dio_rw
+ 87.61% 0.14% fio [kernel.kallsyms] [k] __iomap_dio_rw
+ 74.85% 74.85% fio [kernel.kallsyms] [k] native_queued_spin_lock_slowpath
+ 73.13% 0.10% fio [kernel.kallsyms] [k] iomap_dio_bio_iter
+ 72.99% 0.11% fio [kernel.kallsyms] [k] _raw_spin_lock_irq
+ 72.76% 0.02% fio [kernel.kallsyms] [k] submit_bio_noacct_nocheck
+ 72.20% 0.01% fio [kernel.kallsyms] [k] __submit_bio
+ 71.82% 0.43% fio [kernel.kallsyms] [k] md_handle_request
+ 71.25% 0.15% fio [kernel.kallsyms] [k] raid10_make_request
+ 68.08% 0.02% fio [kernel.kallsyms] [k] regular_request_wait
+ 68.06% 0.57% fio [kernel.kallsyms] [k] wait_barrier
+ 65.16% 0.01% fio [kernel.kallsyms] [k] aio_read
+ 65.07% 0.01% fio [kernel.kallsyms] [k] xfs_file_read_iter
+ 65.06% 0.01% fio [kernel.kallsyms] [k] xfs_file_dio_read
+ 48.14% 0.12% fio [kernel.kallsyms] [k] raid10_read_request
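Reading the call tree: wait_barrier costs 47.62% under aio_read and
20.44% under aio_write, which together account for the 68.06% shown
for wait_barrier in the flat profile, and nearly all of it is spent
spinning in _raw_spin_lock_irq / native_queued_spin_lock_slowpath.
A quick check of that arithmetic:

```shell
# The per-path wait_barrier percentages from the call tree should sum
# to the flat-profile total for wait_barrier (68.06% above).
awk 'BEGIN { printf "wait_barrier total: %.2f%%\n", 47.62 + 20.44 }'
```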
Note that in the earlier ramdisk tests I gave whole ramdisks or RAID
devices to fio; here I used files on an XFS filesystem.
Thanks,
Ali
# cat /proc/mdstat:
Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md127 : active raid10 ram1[1] ram0[0]
1046528 blocks super 1.2 2 near-copies [2/2] [UU]
md3 : active raid10 nvme0n1p5[0] nvme1n1p5[1] nvme3n1p5[3] nvme4n1p5[4] nvme6n1p5[6] nvme5n1p5[5] nvme7n1p5[7] nvme2n1p5[2]
14887084032 blocks super 1.2 512K chunks 2 near-copies [8/8] [UUUUUUUU]
[=======>.............] resync = 37.2% (5549960960/14887084032) finish=754.4min speed=206272K/sec
bitmap: 70/111 pages [280KB], 65536KB chunk
md1 : active raid10 nvme1n1p3[1] nvme3n1p3[3] nvme0n1p3[0] nvme4n1p3[4] nvme5n1p3[5] nvme6n1p3[6] nvme7n1p3[7] nvme2n1p3[2]
41906176 blocks super 1.2 512K chunks 2 near-copies [8/8] [UUUUUUUU]
md0 : active raid10 nvme1n1p2[1] nvme3n1p2[3] nvme0n1p2[0] nvme6n1p2[6] nvme4n1p2[4] nvme5n1p2[5] nvme7n1p2[7] nvme2n1p2[2]
2084864 blocks super 1.2 512K chunks 2 near-copies [8/8] [UUUUUUUU]
md2 : active (auto-read-only) raid10 nvme4n1p4[4] nvme1n1p4[1] nvme3n1p4[3] nvme0n1p4[0] nvme6n1p4[6] nvme7n1p4[7] nvme5n1p4[5] nvme2n1p4[2]
67067904 blocks super 1.2 512K chunks 2 near-copies [8/8] [UUUUUUUU]
resync=PENDING
unused devices: <none>
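As an aside, the finish estimate on the md3 line can be reproduced
from the other numbers on that line (assuming, as I understand it,
that mdstat reports resync position and size in 1 KiB blocks and
speed in K/sec):

```shell
# Reproduce mdstat's finish estimate for md3:
# remaining blocks / speed = seconds left, converted to minutes.
done_blocks=5549960960
total_blocks=14887084032
speed_kps=206272    # K/sec, from the mdstat line

awk -v d="$done_blocks" -v t="$total_blocks" -v s="$speed_kps" \
    'BEGIN { printf "finish=%.1fmin\n", (t - d) / s / 60 }'
```

This prints finish=754.4min, matching the mdstat output, so the
resync was indeed still running during the test.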
# lspci | grep NVM
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
61:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
62:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
83:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
84:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
#
Thread overview: 30+ messages
2023-06-15 7:54 Unacceptably Poor RAID1 Performance with Many CPU Cores Ali Gholami Rudi
2023-06-15 9:16 ` Xiao Ni
2023-06-15 17:08 ` Ali Gholami Rudi
2023-06-15 17:36 ` Ali Gholami Rudi
2023-06-16 1:53 ` Xiao Ni
2023-06-16 5:20 ` Ali Gholami Rudi
2023-06-15 14:02 ` Yu Kuai
2023-06-16 2:14 ` Xiao Ni
2023-06-16 2:34 ` Yu Kuai
2023-06-16 5:52 ` Ali Gholami Rudi
[not found] ` <20231606091224@laper.mirepesht>
2023-06-16 7:31 ` Ali Gholami Rudi
2023-06-16 7:42 ` Yu Kuai
2023-06-16 8:21 ` Ali Gholami Rudi
2023-06-16 8:34 ` Yu Kuai
2023-06-16 8:52 ` Ali Gholami Rudi
2023-06-16 9:17 ` Yu Kuai
2023-06-16 11:51 ` Ali Gholami Rudi
2023-06-16 12:27 ` Yu Kuai
2023-06-18 20:30 ` Ali Gholami Rudi [this message]
2023-06-19 1:22 ` Yu Kuai
2023-06-19 5:19 ` Ali Gholami Rudi
2023-06-19 6:53 ` Yu Kuai
2023-06-21 8:05 ` Xiao Ni
2023-06-21 8:26 ` Yu Kuai
2023-06-21 8:55 ` Xiao Ni
2023-07-01 11:17 ` Ali Gholami Rudi
2023-07-03 12:39 ` Yu Kuai
2023-07-05 7:59 ` Ali Gholami Rudi
2023-06-21 19:34 ` Wols Lists
2023-06-23 0:52 ` Xiao Ni