From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org,
viro@zeniv.linux.org.uk
Subject: Re: [PATCHSET v6b 0/11] Turn single segment imports into ITER_UBUF
Date: Thu, 30 Mar 2023 11:33:16 -0600 [thread overview]
Message-ID: <de35d11d-bce7-e976-7372-1f2caf417103@kernel.dk> (raw)
In-Reply-To: <CAHk-=wgmGBCO9QnBhheQDOHu+6k+OGHGCjHyHm4J=snowkSupQ@mail.gmail.com>
On 3/30/23 11:11?AM, Linus Torvalds wrote:
> On Thu, Mar 30, 2023 at 9:47?AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Sadly, in absolute numbers, comparing read(2) and readv(2),
>> the latter takes 2.11x as long in the stock kernel, and 2.01x as long
>> with the patches. So while single segment is better now than before,
>> it's still waaaay slower than having to copy in a single iovec. Testing
>> was run with all security mitigations off.
>
> What does the profile say? Iis it all in import_iovec() or what?
>
> I do note that we have some completely horrid "helper" functions in
> the iter paths: things like "do_iter_readv_writev()" supposedly being
> a common function , but then it ends up doing some small setup and
> just doing a conditional on the "type" after all, so when it isn't
> inlined, you have those things that don't predict well at all.
>
> And the iter interfaces don't have just that iterator, they have the
> whole kiocb overhead too. All in the name of being generic. Most file
> descriptors don't even support the simpler ".read" interface, because
> they want the whole thing with IOCB_DIRECT flags etc.
>
> So to some degree it's unfair to compare read-vs-read_iter. The latter
> has all that disgusting support for O_DIRECT and friends, and testing
> with /dev/null just doesn't show that part.
Oh I agree, and particularly for the "read from /dev/zero" case it's
not very interesting, as it does too different things there as well.
It was just more of a "gah it's potentially this bad" outburst than
anything else, the numbers I did care about was readv before and
after patches, not read vs readv.
That said, there might be things to improve here. But that's a task
for another time. perf diff of a read vs readv run below.
# Event 'cycles'
#
# Baseline Delta Abs Shared Object Symbol
# ........ ......... .................... .....................................
#
+40.56% [kernel.vmlinux] [k] iov_iter_zero
+12.59% [kernel.vmlinux] [k] copy_user_enhanced_fast_string
21.56% -11.10% [kernel.vmlinux] [k] entry_SYSCALL_64
+7.67% [kernel.vmlinux] [k] _copy_from_user
+7.40% libc.so.6 [.] __GI___readv
3.76% -2.22% [kernel.vmlinux] [k] __fsnotify_parent
+2.13% [kernel.vmlinux] [k] do_iter_read
+2.02% [kernel.vmlinux] [k] do_iter_readv_writev
+1.89% [kernel.vmlinux] [k] __import_iovec
+1.59% [kernel.vmlinux] [k] do_readv
3.15% -1.43% [kernel.vmlinux] [k] __fget_light
+1.42% [kernel.vmlinux] [k] vfs_readv
+1.32% [kernel.vmlinux] [k] read_iter_zero
2.39% -1.30% [kernel.vmlinux] [k] syscall_exit_to_user_mode
1.89% -1.17% [kernel.vmlinux] [k] exit_to_user_mode_prepare
2.01% -1.10% [kernel.vmlinux] [k] do_syscall_64
2.04% -1.06% [kernel.vmlinux] [k] __fdget_pos
1.93% -0.99% [kernel.vmlinux] [k] syscall_enter_from_user_mode
+0.81% [kernel.vmlinux] [k] __get_task_ioprio
1.03% -0.56% [kernel.vmlinux] [k] fpregs_assert_state_consistent
0.85% -0.49% [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
+0.45% [kernel.vmlinux] [k] import_iovec
+0.20% [kernel.vmlinux] [k] kfree
+0.18% [kernel.vmlinux] [k] __x64_sys_readv
+0.06% read-zero [.] readv@plt
+0.01% [kernel.vmlinux] [k] filemap_map_pages
+0.01% ld-linux-x86-64.so.2 [.] check_match
0.00% +0.00% [kernel.vmlinux] [k] memset_erms
0.00% -0.00% [kernel.vmlinux] [k] perf_iterate_ctx
+0.00% [kernel.vmlinux] [k] xfs_iunlock
0.49% -0.00% read-zero [.] main
+0.00% [kernel.vmlinux] [k] arch_get_unmapped_area_topdown
+0.00% [kernel.vmlinux] [k] pcpu_alloc
+0.00% [kernel.vmlinux] [k] perf_event_exec
+0.00% taskset [.] 0x0000000000002ebd
0.00% +0.00% [kernel.vmlinux] [k] perf_ibs_handle_irq
0.00% -0.00% [kernel.vmlinux] [k] perf_ibs_start
32.88% [kernel.vmlinux] [k] read_zero
15.22% libc.so.6 [.] read
6.27% [kernel.vmlinux] [k] vfs_read
2.60% [kernel.vmlinux] [k] ksys_read
1.02% [kernel.vmlinux] [k] __cond_resched
0.41% [kernel.vmlinux] [k] rcu_all_qs
0.35% [kernel.vmlinux] [k] __x64_sys_read
0.12% read-zero [.] read@plt
0.01% ld-linux-x86-64.so.2 [.] _dl_load_cache_lookup
0.01% ld-linux-x86-64.so.2 [.] _dl_check_map_versions
0.00% [kernel.vmlinux] [k] _find_next_or_bit
0.00% [kernel.vmlinux] [k] perf_event_update_userpage
0.00% [kernel.vmlinux] [k] native_sched_clock
--
Jens Axboe
next prev parent reply other threads:[~2023-03-30 17:33 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-30 16:46 [PATCHSET v6b 0/11] Turn single segment imports into ITER_UBUF Jens Axboe
2023-03-30 16:46 ` [PATCH 01/11] block: ensure bio_alloc_map_data() deals with ITER_UBUF correctly Jens Axboe
2023-03-30 16:46 ` [PATCH 02/11] iov_iter: add iter_iovec() helper Jens Axboe
2023-03-30 16:46 ` [PATCH 03/11] IB/hfi1: check for user backed iterator, not specific iterator type Jens Axboe
2023-03-30 16:46 ` [PATCH 04/11] IB/qib: " Jens Axboe
2023-03-30 16:46 ` [PATCH 05/11] ALSA: pcm: " Jens Axboe
2023-03-30 16:46 ` [PATCH 06/11] iov_iter: add iter_iov_addr() and iter_iov_len() helpers Jens Axboe
2023-03-30 16:46 ` [PATCH 07/11] iov_iter: remove iov_iter_iovec() Jens Axboe
2023-03-30 16:46 ` [PATCH 08/11] iov_iter: set nr_segs = 1 for ITER_UBUF Jens Axboe
2023-03-30 16:47 ` [PATCH 09/11] iov_iter: overlay struct iovec and ubuf/len Jens Axboe
2023-03-30 16:47 ` [PATCH 10/11] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
2023-03-30 16:47 ` [PATCH 11/11] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
2023-03-30 17:11 ` [PATCHSET v6b 0/11] Turn single segment imports into ITER_UBUF Linus Torvalds
2023-03-30 17:33 ` Jens Axboe [this message]
2023-03-30 21:53 ` Linus Torvalds
2023-03-30 22:18 ` Jens Axboe
2023-04-02 22:22 ` Jens Axboe
2023-04-02 23:37 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=de35d11d-bce7-e976-7372-1f2caf417103@kernel.dk \
--to=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).