linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF
@ 2023-03-27 18:04 Jens Axboe
  2023-03-27 18:04 ` [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF Jens Axboe
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 18:04 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro

Hi,

We've been doing a few conversions of ITER_IOVEC to ITER_UBUF in select
spots, as the latter is cheaper to iterate and hence saves some cycles.
I recently experimented [1] with io_uring converting single segment READV
and WRITEV into non-vectored variants, as we can save some cycles through
that as well.

But there's really no reason why we can't just do this further down,
enabling it for everyone. It's quite common to use vectored reads or
writes even with a single segment, unfortunately, even for cases where
there's no specific reason to do so. From a bit of non-scientific
testing on a vm on my laptop, I see about 60% of the import_iovec()
calls being for a single segment.

Ran some unscientific performanc tests [2] as well.

Changes since v1:

- Add two prep patches converting ITER_IOVEC assumptions
- Add missing import_ubuf() return value check in __import_iovec_ubuf()

[1] https://lore.kernel.org/io-uring/43cb1fb7-b30b-8df1-bba6-e50797d680c6@kernel.dk/
[2] https://lore.kernel.org/all/45c46ee9-8479-4114-6ce9-ae3082335cb8@kernel.dk/



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 18:04 [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF Jens Axboe
@ 2023-03-27 18:04 ` Jens Axboe
  2023-03-27 19:03   ` Linus Torvalds
  2023-03-27 18:04 ` [PATCH 2/4] mm: make process_madvise() " Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 18:04 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

We caller this helper if we don't have a read_iter/write_iter available,
and it will go over the iov_iter manually copying data in/out as needed.
But it's currently assuming it's being passed an ITER_IOVEC. Enable
use of ITER_UBUF with this helper as well, in preparation for importing
single segment iovecs as ITER_UBUF.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/read_write.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 7a2ff6157eda..de9ce948d11a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -748,10 +748,21 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
 	if (flags & ~RWF_HIPRI)
 		return -EOPNOTSUPP;
 
+	if (WARN_ON_ONCE(iter->iter_type != ITER_IOVEC &&
+			 iter->iter_type != ITER_UBUF))
+		return -EINVAL;
+
 	while (iov_iter_count(iter)) {
-		struct iovec iovec = iov_iter_iovec(iter);
+		struct iovec iovec;
 		ssize_t nr;
 
+		if (iter_is_ubuf(iter)) {
+			iovec.iov_base = iter->ubuf + iter->iov_offset;
+			iovec.iov_len = iov_iter_count(iter);
+		} else {
+			iovec = iov_iter_iovec(iter);
+		}
+
 		if (type == READ) {
 			nr = filp->f_op->read(filp, iovec.iov_base,
 					      iovec.iov_len, ppos);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] mm: make process_madvise() deal with ITER_UBUF
  2023-03-27 18:04 [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-27 18:04 ` [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF Jens Axboe
@ 2023-03-27 18:04 ` Jens Axboe
  2023-03-27 18:04 ` [PATCH 3/4] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
  2023-03-27 18:04 ` [PATCH 4/4] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
  3 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 18:04 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

In preparation for having single segment iovec imports be transformed
into ITER_UBUF, ensure that process_madvise() deals with those correctly
when iterating the iov_iter.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/madvise.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 340125d08c03..2bd0f6c067bb 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1503,7 +1503,12 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 	total_len = iov_iter_count(&iter);
 
 	while (iov_iter_count(&iter)) {
-		iovec = iov_iter_iovec(&iter);
+		if (iter_is_ubuf(&iter)) {
+			iovec.iov_base = iter.ubuf + iter.iov_offset;
+			iovec.iov_len = iov_iter_count(&iter);
+		} else {
+			iovec = iov_iter_iovec(&iter);
+		}
 		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
 					iovec.iov_len, behavior);
 		if (ret < 0)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] iov_iter: convert import_single_range() to ITER_UBUF
  2023-03-27 18:04 [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-27 18:04 ` [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF Jens Axboe
  2023-03-27 18:04 ` [PATCH 2/4] mm: make process_madvise() " Jens Axboe
@ 2023-03-27 18:04 ` Jens Axboe
  2023-03-27 18:04 ` [PATCH 4/4] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
  3 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 18:04 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Since we're just importing a single vector, we don't have to turn it
into an ITER_IOVEC. Instead turn it into an ITER_UBUF, which is cheaper
to iterate.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 274014e4eafe..fc82cc42ffe6 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1866,9 +1866,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
 	if (unlikely(!access_ok(buf, len)))
 		return -EFAULT;
 
-	iov->iov_base = buf;
-	iov->iov_len = len;
-	iov_iter_init(i, rw, iov, 1, len);
+	iov_iter_ubuf(i, rw, buf, len);
 	return 0;
 }
 EXPORT_SYMBOL(import_single_range);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] iov_iter: import single vector iovecs as ITER_UBUF
  2023-03-27 18:04 [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (2 preceding siblings ...)
  2023-03-27 18:04 ` [PATCH 3/4] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
@ 2023-03-27 18:04 ` Jens Axboe
  3 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 18:04 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Add a special case to __import_iovec(), which imports a single segment
iovec as an ITER_UBUF rather than an ITER_IOVEC. ITER_UBUF is cheaper
to iterate than ITER_IOVEC, and for a single segment iovec, there's no
point in using a segmented iterator.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index fc82cc42ffe6..63cf9997bd50 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1780,6 +1780,30 @@ struct iovec *iovec_from_user(const struct iovec __user *uvec,
 	return iov;
 }
 
+/*
+ * Single segment iovec supplied by the user, import it as ITER_UBUF.
+ */
+static ssize_t __import_iovec_ubuf(int type, const struct iovec __user *uvec,
+				   struct iovec **iovp, struct iov_iter *i,
+				   bool compat)
+{
+	struct iovec *iov = *iovp;
+	ssize_t ret;
+
+	if (compat)
+		ret = copy_compat_iovec_from_user(iov, uvec, 1);
+	else
+		ret = copy_iovec_from_user(iov, uvec, 1);
+	if (unlikely(ret))
+		return ret;
+
+	ret = import_ubuf(type, iov->iov_base, iov->iov_len, i);
+	if (unlikely(ret))
+		return ret;
+	*iovp = NULL;
+	return i->count;
+}
+
 ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 		 unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
 		 struct iov_iter *i, bool compat)
@@ -1788,6 +1812,9 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 	unsigned long seg;
 	struct iovec *iov;
 
+	if (nr_segs == 1)
+		return __import_iovec_ubuf(type, uvec, iovp, i, compat);
+
 	iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
 	if (IS_ERR(iov)) {
 		*iovp = NULL;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 18:04 ` [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF Jens Axboe
@ 2023-03-27 19:03   ` Linus Torvalds
  2023-03-27 19:06     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2023-03-27 19:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, viro

On Mon, Mar 27, 2023 at 11:04 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -748,10 +748,21 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
>         if (flags & ~RWF_HIPRI)
>                 return -EOPNOTSUPP;
>
> +       if (WARN_ON_ONCE(iter->iter_type != ITER_IOVEC &&
> +                        iter->iter_type != ITER_UBUF))
> +               return -EINVAL;

Hmm. I think it might actually be nicer for the "iter_is_ubuf(iter)"
case to be outside the loop entirely, and be done before this
WARN_ON_ONCE().

If it's a single ITER_UBUF, that code really shouldn't loop at all -
it's literally just the old case of "call ->read/write with a single
buffer".

So i think I'd prefer this patch to be something along the lines of
this instead:

  --- a/fs/read_write.c
  +++ b/fs/read_write.c
  @@ -748,6 +748,19 @@ static ssize_t do_loop_readv_writev(struct file
*filp, struct iov_iter *iter,
        if (flags & ~RWF_HIPRI)
                return -EOPNOTSUPP;

  +     /* Single user buffer? */
  +     if (iter_is_ubuf(iter)) {
  +             void __user *addr = iter->ubuf + iter->iov_offset;
  +             size_t len = iov_iter_count(iter);
  +
  +             if (type == READ)
  +                     return filp->f_op->read(filp, addr, len, ppos);
  +             return filp->f_op->write(filp, addr, len, ppos);
  +     }
  +
  +     if (WARN_ON_ONCE(!iter_is_iovec(iter)))
  +             return -EINVAL;
  +
        while (iov_iter_count(iter)) {
                struct iovec iovec = iov_iter_iovec(iter);
                ssize_t nr;

which keeps the existing iovec loop, and just adds that "is it a
simple buffer" case at the top (and the WARN_ON_ONCE() for any other
iter type).

Hmm?

                         Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 19:03   ` Linus Torvalds
@ 2023-03-27 19:06     ` Jens Axboe
  2023-03-27 19:11       ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 19:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, brauner, viro

On 3/27/23 1:03?PM, Linus Torvalds wrote:
> On Mon, Mar 27, 2023 at 11:04?AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> --- a/fs/read_write.c
>> +++ b/fs/read_write.c
>> @@ -748,10 +748,21 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
>>         if (flags & ~RWF_HIPRI)
>>                 return -EOPNOTSUPP;
>>
>> +       if (WARN_ON_ONCE(iter->iter_type != ITER_IOVEC &&
>> +                        iter->iter_type != ITER_UBUF))
>> +               return -EINVAL;
> 
> Hmm. I think it might actually be nicer for the "iter_is_ubuf(iter)"
> case to be outside the loop entirely, and be done before this
> WARN_ON_ONCE().
> 
> If it's a single ITER_UBUF, that code really shouldn't loop at all -
> it's literally just the old case of "call ->read/write with a single
> buffer".
> 
> So i think I'd prefer this patch to be something along the lines of
> this instead:

See Al's suggestion in the other thread, I like that a lot better as it
avoids any potential wack-a-mole with potentially similar cases. There
are not a lot of hits for iov_iter_iovec() in the three and I spotted
two, which lead to these two prep patches. But if we do:

https://git.kernel.dk/cgit/linux-block/commit/?h=iter-ubuf&id=8a825a6f52e8fab74e936f15eea6c34ac67272f6

then we don't really have to worry about those.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 19:06     ` Jens Axboe
@ 2023-03-27 19:11       ` Linus Torvalds
  2023-03-27 19:13         ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2023-03-27 19:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, viro

On Mon, Mar 27, 2023 at 12:06 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> See Al's suggestion in the other thread, I like that a lot better as it
> avoids any potential wack-a-mole with potentially similar cases.

Ok, this is all the legacy "no iter_read/write" case, so that looks
fine to me too.

               Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 19:11       ` Linus Torvalds
@ 2023-03-27 19:13         ` Linus Torvalds
  2023-03-27 19:14           ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2023-03-27 19:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, viro

On Mon, Mar 27, 2023 at 12:11 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ok, this is all the legacy "no iter_read/write" case, so that looks
> fine to me too.

.. and I guess that makes PATCH 2/4 go away too, which is a good sign.

            Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF
  2023-03-27 19:13         ` Linus Torvalds
@ 2023-03-27 19:14           ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-03-27 19:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, brauner, viro

On 3/27/23 1:13 PM, Linus Torvalds wrote:
> On Mon, Mar 27, 2023 at 12:11 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Ok, this is all the legacy "no iter_read/write" case, so that looks
>> fine to me too.
> 
> .. and I guess that makes PATCH 2/4 go away too, which is a good sign.

Exactly, both prep fixups can be dropped, leaving just that one prep
patch.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-27 19:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-27 18:04 [PATCHSET v2 0/4] Turn single segment imports into ITER_UBUF Jens Axboe
2023-03-27 18:04 ` [PATCH 1/4] fs: make do_loop_readv_writev() deal with ITER_UBUF Jens Axboe
2023-03-27 19:03   ` Linus Torvalds
2023-03-27 19:06     ` Jens Axboe
2023-03-27 19:11       ` Linus Torvalds
2023-03-27 19:13         ` Linus Torvalds
2023-03-27 19:14           ` Jens Axboe
2023-03-27 18:04 ` [PATCH 2/4] mm: make process_madvise() " Jens Axboe
2023-03-27 18:04 ` [PATCH 3/4] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
2023-03-27 18:04 ` [PATCH 4/4] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).