From: Jens Axboe <axboe@kernel.dk>
To: linux-fsdevel@vger.kernel.org, linux-aio@kvack.org, linux-block@vger.kernel.org, linux-arch@vger.kernel.org
Cc: hch@lst.de, jmoyer@redhat.com, avi@scylladb.com
Subject: [PATCHSET v5] io_uring IO interface
Date: Wed, 16 Jan 2019 10:49:48 -0700
Message-ID: <20190116175003.17880-1-axboe@kernel.dk>

Here's v5 of the io_uring interface. It's mostly a matter of putting
finishing touches on top of v4, though we do have a few user interface
tweaks as a result. Arnd was kind enough to review the code with an eye
towards 32-bit compatibility, and that resulted in a few changes. See
the changelog below.

I also cleaned up the internal ring handling, enabling us to batch
writes to the SQ ring head and CQ ring tail. This reduces the number of
write ordering barriers we need (a userspace-side sketch of the same
batching idea follows the benchmark numbers below).

I also dropped the io_submit_state intermediate poll list handling.
This removes a patch, and also cleans up the block flush handling since
we no longer have to tie into the deep internals of the plug callbacks.
The win just wasn't enough to warrant the complexity.

LWN did a great write-up of the API and internals, see it here:

https://lwn.net/Articles/776703/

In terms of benchmarks, I ran some numbers comparing io_uring to libaio
and spdk. The tl;dr is that io_uring is pretty close to spdk, and in
some cases faster. Latencies are generally better than spdk's. The
areas where we are still missing a bit of performance all lie in the
block layer, and I'll be working on closing that gap some more.

Latency tests, 3D XPoint, 4k random read:

Interface     QD    Polled    Latency     IOPS
--------------------------------------------------------
io_uring       1       0      9.5usec      77K
io_uring       2       0      8.2usec     183K
io_uring       4       0      8.4usec     383K
io_uring       8       0     13.3usec     449K
libaio         1       0      9.7usec      74K
libaio         2       0      8.5usec     181K
libaio         4       0      8.5usec     373K
libaio         8       0     15.4usec     402K
io_uring       1       1      6.1usec     139K
io_uring       2       1      6.1usec     272K
io_uring       4       1      6.3usec     519K
io_uring       8       1     11.5usec     592K
spdk           1       1      6.1usec     151K
spdk           2       1      6.2usec     293K
spdk           4       1      6.7usec     536K
spdk           8       1     12.6usec     586K

io_uring vs libaio, non-polled: io_uring has a slight lead. Polled,
spdk is slightly faster than io_uring, especially at lower queue
depths. At QD=8, io_uring is faster.

Peak IOPS, 512b random read:

Interface     QD    Polled    Latency     IOPS
--------------------------------------------------------
io_uring       4       1      6.8usec     513K
io_uring       8       1      8.7usec     829K
io_uring      16       1     13.1usec    1019K
io_uring      32       1     20.6usec    1161K
io_uring      64       1     32.4usec    1244K
spdk           4       1      6.8usec     549K
spdk           8       1      8.6usec     865K
spdk          16       1     14.0usec    1105K
spdk          32       1     25.0usec    1227K
spdk          64       1     47.3usec    1251K

io_uring lags spdk by about 7% at lower queue depths, getting to within
1% of spdk at higher queue depths.

Peak per-core, multiple devices, 4k random read:

Interface     QD    Polled    IOPS
------------------------------------------
io_uring     128       1     1620K
libaio       128       0      608K
spdk         128       1     1739K

This run uses multiple devices, all driven from the same core, and is
meant to test how much performance we can eke out of a single CPU core.
spdk has a slight edge over io_uring, while libaio cannot compete at
all.
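To make the ring-batching change concrete, here is a minimal
userspace-side sketch of the same idea: reap a batch of CQEs and
publish the new CQ ring head once, with a single release store, rather
than paying one ordering barrier per completion. This is illustrative
only; the ring mmap() setup is omitted, the struct below is a
hypothetical stand-in for the mapped ring pointers, and only the field
names follow the io_uring UAPI.

	#include <stdatomic.h>
	#include <linux/io_uring.h>

	struct cq_ring {
		unsigned *head;		/* written by us, the consumer */
		unsigned *tail;		/* written by the kernel */
		unsigned ring_mask;
		struct io_uring_cqe *cqes;
	};

	static unsigned reap_cqe_batch(struct cq_ring *cq,
				       struct io_uring_cqe *out,
				       unsigned max)
	{
		unsigned head = *cq->head;
		/* acquire pairs with the kernel's release store of the tail */
		unsigned tail = atomic_load_explicit(
				(_Atomic unsigned *)cq->tail,
				memory_order_acquire);
		unsigned n = 0;

		while (head != tail && n < max)
			out[n++] = cq->cqes[head++ & cq->ring_mask];

		if (n) {
			/*
			 * One release store covers the whole batch; the
			 * kernel may reuse these CQE slots once it sees
			 * the new head.
			 */
			atomic_store_explicit((_Atomic unsigned *)cq->head,
					      head, memory_order_release);
		}
		return n;
	}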
As usual, patches are against 5.0-rc2, and can also be found in my
io_uring branch here:

git://git.kernel.dk/linux-block io_uring

Since v4:
- Update some commit messages
- Update some stale comments
- Tweak polling efficiency
- Avoid multiple SQ/CQ ring inc+barriers for batches of IO
- Cache SQ head and CQ tail in the kernel
- Fix buffered rw/work union issue for punted IO
- Drop submit state request issue cache
- Rework io_uring_register() for buffers and files to be more 32-bit friendly
- Make sqe->addr an __u64 instead of playing padding tricks
- Add compat conditional syscall entry for io_uring_setup()

 Documentation/filesystems/vfs.txt      |    3 +
 arch/x86/entry/syscalls/syscall_32.tbl |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    3 +
 block/bio.c                            |   59 +-
 fs/Makefile                            |    1 +
 fs/block_dev.c                         |   19 +-
 fs/file.c                              |   15 +-
 fs/file_table.c                        |    9 +-
 fs/gfs2/file.c                         |    2 +
 fs/io_uring.c                          | 2017 ++++++++++++++++++++++++
 fs/iomap.c                             |   48 +-
 fs/xfs/xfs_file.c                      |    1 +
 include/linux/bio.h                    |   14 +
 include/linux/blk_types.h              |    1 +
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |    6 +-
 include/linux/iomap.h                  |    1 +
 include/linux/sched/user.h             |    2 +-
 include/linux/syscalls.h               |    7 +
 include/uapi/linux/io_uring.h          |  136 ++
 init/Kconfig                           |    9 +
 kernel/sys_ni.c                        |    4 +
 22 files changed, 2322 insertions(+), 40 deletions(-)

-- 
Jens Axboe
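One of the v5 changes above is making sqe->addr a plain __u64 instead
of playing padding tricks. A short sketch of why that helps 32-bit
userspace: pointers are zero-extended into a fixed-width field, so
32-bit and 64-bit callers hand the kernel an identically laid out SQE,
with no compat translation needed. The prep helper below is
hypothetical (roughly what a library helper might look like) and
assumes the io_uring UAPI header.

	#include <stdint.h>
	#include <string.h>
	#include <sys/uio.h>
	#include <linux/io_uring.h>

	static void sqe_prep_readv(struct io_uring_sqe *sqe, int fd,
				   const struct iovec *iov,
				   unsigned nr_vecs, uint64_t offset)
	{
		memset(sqe, 0, sizeof(*sqe));
		sqe->opcode = IORING_OP_READV;
		sqe->fd = fd;
		sqe->off = offset;
		/* uintptr_t -> __u64 is well-defined on 32- and 64-bit ABIs */
		sqe->addr = (uint64_t)(uintptr_t)iov;
		sqe->len = nr_vecs;
	}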