Linux kernel -stable discussions
 help / color / mirror / Atom feed
* + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
@ 2025-08-12  1:02 Andrew Morton
  2025-08-12  9:38 ` David Howells
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2025-08-12  1:02 UTC (permalink / raw)
  To: mm-commits, viro, stable, ryan, maximilian, dhowells, ct, brauner,
	arnout, asmadeus, akpm


The patch titled
     Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Dominique Martinet <asmadeus@codewreck.org>
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
Date: Mon, 11 Aug 2025 16:39:05 +0900

So we've had this regression in 9p for..  almost a year, which is way too
long, but there was no "easy" reproducer until yesterday (thank you
again!!)

It turned out to be a bug with iov_iter on folios,
iov_iter_get_pages_alloc2() would advance the iov_iter correctly up to the
end edge of a folio and the later copy_to_iter() fails on the
iterate_folioq() bug.

Happy to consider alternative ways of fixing this, now there's a
reproducer it's all much clearer; for the bug to be visible we basically
need to make and IO with non-contiguous folios in the iov_iter which is
not obvious to test with synthetic VMs, with size that triggers a
zero-copy read followed by a non-zero-copy read.

It's apparently possible to get an iov forwarded all the way up to the end
of the current page we're looking at, e.g.

(gdb) p *iter
$24 = {iter_type = 4 '\004', nofault = false, data_source = false, iov_offset = 4096, {__ubuf_iovec = {
      iov_base = 0xffff88800f5bc000, iov_len = 655}, {{__iov = 0xffff88800f5bc000, kvec = 0xffff88800f5bc000,
        bvec = 0xffff88800f5bc000, folioq = 0xffff88800f5bc000, xarray = 0xffff88800f5bc000,
        ubuf = 0xffff88800f5bc000}, count = 655}}, {nr_segs = 2, folioq_slot = 2 '\002', xarray_start = 2}}

Where iov_offset is 4k with 4k-sized folios

This should have been because we're only in the 2nd slot and there's
another one after this, but iterate_folioq should not try to map a folio
that skips the whole size, and more importantly part here does not end up
zero (because 'PAGE_SIZE - skip % PAGE_SIZE' ends up PAGE_SIZE and not
zero..), so skip forward to the "advance to next folio" code.

Link: https://lkml.kernel.org/r/20250811-iot_iter_folio-v1-0-d9c223adf93c@codewreck.org
Link: https://lkml.kernel.org/r/20250811-iot_iter_folio-v1-1-d9c223adf93c@codewreck.org
Link: https://lkml.kernel.org/r/D4LHHUNLG79Y.12PI0X6BEHRHW@mbosch.me/
Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reported-by: Maximilian Bosch <maximilian@mbosch.me>
Reported-by: Ryan Lahfa <ryan@lahfa.xyz>
Reported-by: Christian Theune <ct@flyingcircus.io>
Reported-by: Arnout Engelen <arnout@bzzt.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Ryan Lahfa <ryan@lahfa.xyz>
Cc: <stable@vger.kernel.org>	[v6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/iov_iter.h |    3 +++
 1 file changed, 3 insertions(+)

--- a/include/linux/iov_iter.h~iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size
+++ a/include/linux/iov_iter.h
@@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
 			break;
 
 		fsize = folioq_folio_size(folioq, slot);
+		if (skip >= fsize)
+			goto next;
 		base = kmap_local_folio(folio, skip);
 		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
 		remain = step(base, progress, part, priv, priv2);
@@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
 		progress += consumed;
 		skip += consumed;
 		if (skip >= fsize) {
+next:
 			skip = 0;
 			slot++;
 			if (slot == folioq_nr_slots(folioq) && folioq->next) {
_

Patches currently in -mm which might be from asmadeus@codewreck.org are

iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
iov_iter-iov_folioq_get_pages-dont-leave-empty-slot-behind.patch


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
  2025-08-12  1:02 Andrew Morton
@ 2025-08-12  9:38 ` David Howells
  2025-08-12 21:20   ` Dominique Martinet
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2025-08-12  9:38 UTC (permalink / raw)
  To: Andrew Morton, asmadeus
  Cc: dhowells, mm-commits, viro, stable, ryan, maximilian, ct, brauner,
	arnout

Thinking about it again, I think this:

> @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
>  			break;
>  
>  		fsize = folioq_folio_size(folioq, slot);
> +		if (skip >= fsize)
> +			goto next;
>  		base = kmap_local_folio(folio, skip);
>  		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
>  		remain = step(base, progress, part, priv, priv2);
> @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
>  		progress += consumed;
>  		skip += consumed;
>  		if (skip >= fsize) {
> +next:
>  			skip = 0;
>  			slot++;
>  			if (slot == folioq_nr_slots(folioq) && folioq->next) {

Would be much better done as:

> @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
>  			break;
>  
>  		fsize = folioq_folio_size(folioq, slot);
> +		if (skip < fsize) {
>  		base = kmap_local_folio(folio, skip);
>  		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
>  		remain = step(base, progress, part, priv, priv2);
> @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
>  		progress += consumed;
>  		skip += consumed;
> +		}
>  		if (skip >= fsize) {
>  			skip = 0;
>  			slot++;
>  			if (slot == folioq_nr_slots(folioq) && folioq->next) {

With the stuff inside the braces suitably indented.  The compiler should be
able to optimise away the extra comparison.

David


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
  2025-08-12  9:38 ` David Howells
@ 2025-08-12 21:20   ` Dominique Martinet
  2025-08-12 21:38     ` David Howells
  0 siblings, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2025-08-12 21:20 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, mm-commits, viro, stable, ryan, maximilian, ct,
	brauner, arnout

David Howells wrote on Tue, Aug 12, 2025 at 10:38:07AM +0100:
> > @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
> >  			break;
> >  
> >  		fsize = folioq_folio_size(folioq, slot);
> > +		if (skip < fsize) {
> >  		base = kmap_local_folio(folio, skip);
> >  		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
> >  		remain = step(base, progress, part, priv, priv2);
> > @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
> >  		progress += consumed;
> >  		skip += consumed;
> > +		}
> >  		if (skip >= fsize) {
> >  			skip = 0;
> >  			slot++;
> >  			if (slot == folioq_nr_slots(folioq) && folioq->next) {
> 
> With the stuff inside the braces suitably indented.  The compiler should be
> able to optimise away the extra comparison.

skip is modified in the first if so I don't see how the compiler could
optimize it.
I just checked and at least iov_iter.o is slightly bigger with a second if:

(a.o = goto, b.o = if)
06:17:52 asmadeus@thor 0 ~/code/linux/bb$ size a.o b.o
   text	  data	   bss	   dec	   hex	filename
  26923	  1104	     0	 28027	  6d7b	a.o
  27019	  1104	     0	 28123	  6ddb	b.o

but honestly I'm happy to focus on readability here -- if you think two
if are easier to read I'll be happy to send a v3

-- 
Dominique Martinet | Asmadeus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
  2025-08-12 21:20   ` Dominique Martinet
@ 2025-08-12 21:38     ` David Howells
  2025-08-13  6:00       ` Dominique Martinet
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2025-08-12 21:38 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: dhowells, Andrew Morton, mm-commits, viro, stable, ryan,
	maximilian, ct, brauner, arnout

Dominique Martinet <asmadeus@codewreck.org> wrote:

> skip is modified in the first if so I don't see how the compiler could
> optimize it.

I mean that if you have:

		if (skip < fsize) {
			...
		}
  		if (skip >= fsize) {
			...
		}

then the compiler should be able to take the second check as true in the case
where the first fails.

>   26923	  1104	     0	 28027	  6d7b	a.o
>   27019	  1104	     0	 28123	  6ddb	b.o

That's a surprisingly large change.

> but honestly I'm happy to focus on readability here -- if you think two
> if are easier to read I'll be happy to send a v3

I must admit I dislike goto jumping *in* to a braced section.  It's not even
allowed in C++ and Java, IIRC.

David


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
  2025-08-12 21:38     ` David Howells
@ 2025-08-13  6:00       ` Dominique Martinet
  0 siblings, 0 replies; 6+ messages in thread
From: Dominique Martinet @ 2025-08-13  6:00 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, mm-commits, viro, stable, ryan, maximilian, ct,
	brauner, arnout

David Howells wrote on Tue, Aug 12, 2025 at 10:38:31PM +0100:
> >   26923	  1104	     0	 28027	  6d7b	a.o
> >   27019	  1104	     0	 28123	  6ddb	b.o
> 
> That's a surprisingly large change.

Right, because the function is inlined multiple times in iov_iter.o it
turned out rather big... Here's what it looks like from busybox's
bloat-o-meter, which gives a better picture:

function                                             old     new   delta
iov_iter_zero                                       1706    1738     +32
copy_page_to_iter_nofault                           2491    2512     +21
_copy_mc_to_iter                                    1604    1624     +20
copy_folio_from_iter_atomic                         2620    2633     +13
_copy_from_iter_nocache                             1706    1719     +13
_copy_from_iter_flushcache                          1409    1422     +13
_copy_to_iter                                       1885    1891      +6
_copy_from_iter                                     1874    1880      +6
iov_iter_extract_pages                              2182    2166     -16
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 8/1 up/down: 124/-16)           Total: 108 bytes

So the compiler obviously doesn't optimize the if (at least with
whatever default flags I'm using on gcc 15.2.1), but the impact is only
that big because it's copied so many times.

Anyway, as said before I'm happy to prioritize readability here, so
sending a v3 with just this changed.
-- 
Dominique Martinet | Asmadeus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
@ 2025-08-13 23:15 Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-08-13 23:15 UTC (permalink / raw)
  To: mm-commits, willy, viro, stable, ryan, maximilian, dhowells, ct,
	brauner, arnout, asmadeus, akpm


The patch titled
     Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Dominique Martinet <asmadeus@codewreck.org>
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
Date: Wed, 13 Aug 2025 15:04:55 +0900

It's apparently possible to get an iov advanced all the way up to the end
of the current page we're looking at, e.g.

(gdb) p *iter
$24 = {iter_type = 4 '\004', nofault = false, data_source = false, iov_offset = 4096, {__ubuf_iovec = {
      iov_base = 0xffff88800f5bc000, iov_len = 655}, {{__iov = 0xffff88800f5bc000, kvec = 0xffff88800f5bc000,
        bvec = 0xffff88800f5bc000, folioq = 0xffff88800f5bc000, xarray = 0xffff88800f5bc000,
        ubuf = 0xffff88800f5bc000}, count = 655}}, {nr_segs = 2, folioq_slot = 2 '\002', xarray_start = 2}}

Where iov_offset is 4k with 4k-sized folios

This should have been fine because we're only in the 2nd slot and there's
another one after this, but iterate_folioq should not try to map a folio
that skips the whole size, and more importantly part here does not end up
zero (because 'PAGE_SIZE - skip % PAGE_SIZE' ends up PAGE_SIZE and not
zero..), so skip forward to the "advance to next folio" code

Link: https://lkml.kernel.org/r/20250813-iot_iter_folio-v3-0-a0ffad2b665a@codewreck.org
Link: https://lkml.kernel.org/r/20250813-iot_iter_folio-v3-1-a0ffad2b665a@codewreck.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Reported-by: Maximilian Bosch <maximilian@mbosch.me>
Reported-by: Ryan Lahfa <ryan@lahfa.xyz>
Reported-by: Christian Theune <ct@flyingcircus.io>
Reported-by: Arnout Engelen <arnout@bzzt.net>
Link: https://lkml.kernel.org/r/D4LHHUNLG79Y.12PI0X6BEHRHW@mbosch.me/
Acked-by: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/iov_iter.h |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/include/linux/iov_iter.h~iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size
+++ a/include/linux/iov_iter.h
@@ -160,7 +160,7 @@ size_t iterate_folioq(struct iov_iter *i
 
 	do {
 		struct folio *folio = folioq_folio(folioq, slot);
-		size_t part, remain, consumed;
+		size_t part, remain = 0, consumed;
 		size_t fsize;
 		void *base;
 
@@ -168,14 +168,16 @@ size_t iterate_folioq(struct iov_iter *i
 			break;
 
 		fsize = folioq_folio_size(folioq, slot);
-		base = kmap_local_folio(folio, skip);
-		part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
-		remain = step(base, progress, part, priv, priv2);
-		kunmap_local(base);
-		consumed = part - remain;
-		len -= consumed;
-		progress += consumed;
-		skip += consumed;
+		if (skip < fsize) {
+			base = kmap_local_folio(folio, skip);
+			part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
+			remain = step(base, progress, part, priv, priv2);
+			kunmap_local(base);
+			consumed = part - remain;
+			len -= consumed;
+			progress += consumed;
+			skip += consumed;
+		}
 		if (skip >= fsize) {
 			skip = 0;
 			slot++;
_

Patches currently in -mm which might be from asmadeus@codewreck.org are

iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
iov_iter-iov_folioq_get_pages-dont-leave-empty-slot-behind.patch


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-08-13 23:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-13 23:15 + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2025-08-12  1:02 Andrew Morton
2025-08-12  9:38 ` David Howells
2025-08-12 21:20   ` Dominique Martinet
2025-08-12 21:38     ` David Howells
2025-08-13  6:00       ` Dominique Martinet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox