From: Jens Axboe <axboe@kernel.dk>
To: linux-fsdevel@vger.kernel.org, linux-aio@kvack.org, linux-block@vger.kernel.org, linux-arch@vger.kernel.org
Cc: hch@lst.de, jmoyer@redhat.com, avi@scylladb.com
Subject: [PATCHSET v1] io_uring IO interface
Date: Tue, 8 Jan 2019 09:56:29 -0700
Message-ID: <20190108165645.19311-1-axboe@kernel.dk>

After some arm twisting from Christoph, I finally caved and divorced the
aio-poll patches from aio/libaio itself. The io_uring interface itself
is useful and efficient, and after rebasing all the new goodies on top
of that, there was little reason to retain the aio connection.

Hence io_uring was born. This is what I previously called scqring for
aio, but now as a standalone entity. Patch #5 adds the core of this
interface, but in short, it has two main data structures:

struct io_uring_iocb {
        __u8    opcode;
        __u8    flags;
        __u16   ioprio;
        __s32   fd;
        __u64   off;
        union {
                void    *addr;
                __u64   __pad;
        };
        __u32   len;
        union {
                __kernel_rwf_t  rw_flags;
                __u32           __resv;
        };
};

struct io_uring_event {
        __u64   index;          /* what iocb this event came from */
        __s32   res;            /* result code for this event */
        __u32   flags;
};

The SQ ring is an array of indexes into an array of io_uring_iocbs,
which describe the IO to be done. The CQ ring is an array of
io_uring_events, which describe a completion event. Both of these rings
are mapped into the application through mmap(2), at special magic
offsets. The application manipulates the ring directly, and then
communicates with the kernel through these two system calls:

io_uring_setup(entries, iovecs, params)
        Sets up a context for doing async IO. On success, returns a file
        descriptor that the application can mmap to gain access to the
        SQ ring, CQ ring, and io_uring_iocbs.

io_uring_enter(fd, to_submit, min_complete, flags)
        Initiates IO against the rings mapped to this fd, or waits for
        them to complete, or both. The behavior is controlled by the
        parameters passed in. If 'to_submit' is non-zero, then we'll
        try to submit new IO. If IORING_ENTER_GETEVENTS is set, the
        kernel will wait for 'min_complete' events, if they aren't
        already available.

I've started a liburing git repo for this, which contains some helpers
for doing IO without having to muck with the ring directly, setting up
an io_uring context, etc. Clone that here:

git://git.kernel.dk/liburing

In terms of usage, there's also a small test app here:

http://git.kernel.dk/cgit/fio/plain/t/io_uring.c

and if you're into fio, there's an io_uring engine included with that as
well for test purposes.

In terms of features, this has everything that the prior aio-poll
postings did. Later patches add support for polled IO, fixed buffers,
kernel side submission and polling, buffered aio, etc. Also a number of
bug fixes in here from previous postings.

Series is against 5.0-rc1, and can also be found in my io_uring branch.
For now just x86-64 has the system calls wired up, and liburing also
only supports x86-64. The latter just needs system call numbers and
reasonable read/write barrier defines to work, however.

 Documentation/filesystems/vfs.txt      |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    2 +
 block/bio.c                            |   59 +-
 fs/Makefile                            |    2 +-
 fs/block_dev.c                         |   19 +-
 fs/file.c                              |   15 +-
 fs/file_table.c                        |    9 +-
 fs/gfs2/file.c                         |    2 +
 fs/io_uring.c                          | 1907 ++++++++++++++++++++++++
 fs/iomap.c                             |   48 +-
 fs/xfs/xfs_file.c                      |    1 +
 include/linux/bio.h                    |   14 +
 include/linux/blk_types.h              |    1 +
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |    6 +-
 include/linux/iomap.h                  |    1 +
 include/linux/syscalls.h               |    5 +
 include/uapi/linux/io_uring.h          |  115 ++
 kernel/sys_ni.c                        |    2 +
 19 files changed, 2173 insertions(+), 40 deletions(-)

-- 
Jens Axboe
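As a rough illustration of the ring layout the cover letter describes, here is a small user-space toy model in C. It is a sketch under stated assumptions, not the kernel ABI and not liburing: every name prefixed with toy_ or TOY_ is hypothetical, the structs are simplified stand-ins for io_uring_iocb and io_uring_event, the rings live in ordinary memory rather than being mmap(2)ed, and the "kernel" side completes each request immediately instead of doing real IO. It only shows how an SQ ring of indexes, an iocb array, and a CQ ring of events fit together, and how 'to_submit' and 'min_complete' might steer a call like io_uring_enter().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ENTRIES 4u                  /* power of two, so masking wraps the ring */
#define TOY_MASK    (TOY_ENTRIES - 1u)
#define TOY_ENTER_GETEVENTS 1u          /* hypothetical stand-in for IORING_ENTER_GETEVENTS */

/* Simplified stand-ins for the structs in the cover letter (unions dropped). */
struct toy_iocb  { uint8_t opcode; int32_t fd; uint64_t off; void *addr; uint32_t len; };
struct toy_event { uint64_t index; int32_t res; uint32_t flags; };

struct toy_ring {
    struct toy_iocb  iocbs[TOY_ENTRIES];    /* descriptions of the IO to be done */
    uint32_t         sq[TOY_ENTRIES];       /* SQ ring: indexes into iocbs[] */
    struct toy_event cq[TOY_ENTRIES];       /* CQ ring: completion events */
    uint32_t sq_head, sq_tail;              /* "kernel" consumes head, app produces tail */
    uint32_t cq_head, cq_tail;              /* app consumes head, "kernel" produces tail */
};

/* Application side: fill iocb slot idx and publish its index in the SQ ring. */
int toy_queue(struct toy_ring *r, uint32_t idx, void *buf, uint32_t len)
{
    if (idx >= TOY_ENTRIES || r->sq_tail - r->sq_head == TOY_ENTRIES)
        return -1;                          /* bad slot, or SQ ring is full */
    r->iocbs[idx].addr = buf;
    r->iocbs[idx].len  = len;
    r->sq[r->sq_tail & TOY_MASK] = idx;
    r->sq_tail++;
    return 0;
}

/*
 * Toy "kernel" side, loosely modelled on the described io_uring_enter():
 * consume up to to_submit SQ entries and post one completion for each.
 * Assumption: every request "completes" at once with res == len.
 * Returns the number of entries submitted, or -1 if GETEVENTS was asked
 * for but fewer than min_complete events are available (a real kernel
 * would wait here; the toy has nothing in flight to wait for).
 */
int toy_enter(struct toy_ring *r, uint32_t to_submit,
              uint32_t min_complete, uint32_t flags)
{
    uint32_t submitted = 0;

    while (submitted < to_submit && r->sq_head != r->sq_tail &&
           r->cq_tail - r->cq_head < TOY_ENTRIES) {
        uint32_t idx = r->sq[r->sq_head & TOY_MASK];
        assert(idx < TOY_ENTRIES);          /* toy_queue() only stores valid indexes */
        r->sq_head++;

        struct toy_event *ev = &r->cq[r->cq_tail & TOY_MASK];
        ev->index = idx;                    /* which iocb this event came from */
        ev->res   = (int32_t)r->iocbs[idx].len;
        ev->flags = 0;
        r->cq_tail++;
        submitted++;
    }

    if ((flags & TOY_ENTER_GETEVENTS) && r->cq_tail - r->cq_head < min_complete)
        return -1;
    return (int)submitted;
}

/* Application side: pop one completion event, if any. Returns 1 or 0. */
int toy_reap(struct toy_ring *r, struct toy_event *out)
{
    if (r->cq_head == r->cq_tail)
        return 0;
    *out = r->cq[r->cq_head & TOY_MASK];
    r->cq_head++;
    return 1;
}
```

To try it, zero-initialize a struct toy_ring, call toy_queue() for a couple of slots, call toy_enter() with to_submit set to the number queued, and drain with toy_reap(). Note the indirection: the SQ ring carries only indexes, so iocb slots can be filled in any order and submitted in whatever order their indexes are published. A real implementation would also need the read/write barriers the cover letter mentions, which this single-threaded toy omits.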
Thread overview: 68+ messages

2019-01-08 16:56 Jens Axboe [this message]
2019-01-08 16:56 ` [PATCH 01/16] fs: add an iopoll method to struct file_operations Jens Axboe
2019-01-08 16:56 ` [PATCH 02/16] block: wire up block device iopoll method Jens Axboe
2019-01-08 16:56 ` [PATCH 03/16] block: add bio_set_polled() helper Jens Axboe
  2019-01-10  9:43 ` Ming Lei
  2019-01-10 16:05 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 04/16] iomap: wire up the iopoll method Jens Axboe
2019-01-08 16:56 ` [PATCH 05/16] Add io_uring IO interface Jens Axboe
  2019-01-09 12:10 ` Christoph Hellwig
  2019-01-09 15:53 ` Jens Axboe
  2019-01-09 18:30 ` Christoph Hellwig
  2019-01-09 20:07 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 06/16] io_uring: support for IO polling Jens Axboe
  2019-01-09 12:11 ` Christoph Hellwig
  2019-01-09 15:53 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 07/16] io_uring: add submission side request cache Jens Axboe
2019-01-08 16:56 ` [PATCH 08/16] fs: add fget_many() and fput_many() Jens Axboe
2019-01-08 16:56 ` [PATCH 09/16] io_uring: use fget/fput_many() for file references Jens Axboe
2019-01-08 16:56 ` [PATCH 10/16] io_uring: split kiocb init from allocation Jens Axboe
  2019-01-09 12:12 ` Christoph Hellwig
  2019-01-09 16:56 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 11/16] io_uring: batch io_kiocb allocation Jens Axboe
  2019-01-09 12:13 ` Christoph Hellwig
  2019-01-09 16:57 ` Jens Axboe
  2019-01-09 19:03 ` Christoph Hellwig
  2019-01-09 20:08 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 12/16] block: implement bio helper to add iter bvec pages to bio Jens Axboe
2019-01-08 16:56 ` [PATCH 13/16] io_uring: add support for pre-mapped user IO buffers Jens Axboe
  2019-01-09 12:16 ` Christoph Hellwig
  2019-01-09 17:06 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 14/16] io_uring: support kernel side submission Jens Axboe
  2019-01-09 19:06 ` Christoph Hellwig
  2019-01-09 20:49 ` Jens Axboe
2019-01-08 16:56 ` [PATCH 15/16] io_uring: add submission polling Jens Axboe
2019-01-08 16:56 ` [PATCH 16/16] io_uring: add io_uring_event cache hit information Jens Axboe
2019-01-09 16:00 ` [PATCHSET v1] io_uring IO interface Matthew Wilcox
2019-01-09 16:27 ` Chris Mason