* + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
@ 2025-08-12 1:02 Andrew Morton
2025-08-12 9:38 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2025-08-12 1:02 UTC (permalink / raw)
To: mm-commits, viro, stable, ryan, maximilian, dhowells, ct, brauner,
arnout, asmadeus, akpm
The patch titled
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Dominique Martinet <asmadeus@codewreck.org>
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
Date: Mon, 11 Aug 2025 16:39:05 +0900
So we've had this regression in 9p for.. almost a year, which is way too
long, but there was no "easy" reproducer until yesterday (thank you
again!!)
It turned out to be a bug with iov_iter on folios,
iov_iter_get_pages_alloc2() would advance the iov_iter correctly up to the
end edge of a folio and the later copy_to_iter() fails on the
iterate_folioq() bug.
Happy to consider alternative ways of fixing this, now there's a
reproducer it's all much clearer; for the bug to be visible we basically
need to make and IO with non-contiguous folios in the iov_iter which is
not obvious to test with synthetic VMs, with size that triggers a
zero-copy read followed by a non-zero-copy read.
It's apparently possible to get an iov forwarded all the way up to the end
of the current page we're looking at, e.g.
(gdb) p *iter
$24 = {iter_type = 4 '\004', nofault = false, data_source = false, iov_offset = 4096, {__ubuf_iovec = {
iov_base = 0xffff88800f5bc000, iov_len = 655}, {{__iov = 0xffff88800f5bc000, kvec = 0xffff88800f5bc000,
bvec = 0xffff88800f5bc000, folioq = 0xffff88800f5bc000, xarray = 0xffff88800f5bc000,
ubuf = 0xffff88800f5bc000}, count = 655}}, {nr_segs = 2, folioq_slot = 2 '\002', xarray_start = 2}}
Where iov_offset is 4k with 4k-sized folios
This should have been because we're only in the 2nd slot and there's
another one after this, but iterate_folioq should not try to map a folio
that skips the whole size, and more importantly part here does not end up
zero (because 'PAGE_SIZE - skip % PAGE_SIZE' ends up PAGE_SIZE and not
zero..), so skip forward to the "advance to next folio" code.
Link: https://lkml.kernel.org/r/20250811-iot_iter_folio-v1-0-d9c223adf93c@codewreck.org
Link: https://lkml.kernel.org/r/20250811-iot_iter_folio-v1-1-d9c223adf93c@codewreck.org
Link: https://lkml.kernel.org/r/D4LHHUNLG79Y.12PI0X6BEHRHW@mbosch.me/
Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reported-by: Maximilian Bosch <maximilian@mbosch.me>
Reported-by: Ryan Lahfa <ryan@lahfa.xyz>
Reported-by: Christian Theune <ct@flyingcircus.io>
Reported-by: Arnout Engelen <arnout@bzzt.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Ryan Lahfa <ryan@lahfa.xyz>
Cc: <stable@vger.kernel.org> [v6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/iov_iter.h | 3 +++
1 file changed, 3 insertions(+)
--- a/include/linux/iov_iter.h~iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size
+++ a/include/linux/iov_iter.h
@@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
break;
fsize = folioq_folio_size(folioq, slot);
+ if (skip >= fsize)
+ goto next;
base = kmap_local_folio(folio, skip);
part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
remain = step(base, progress, part, priv, priv2);
@@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
progress += consumed;
skip += consumed;
if (skip >= fsize) {
+next:
skip = 0;
slot++;
if (slot == folioq_nr_slots(folioq) && folioq->next) {
_
Patches currently in -mm which might be from asmadeus@codewreck.org are
iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
iov_iter-iov_folioq_get_pages-dont-leave-empty-slot-behind.patch
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
2025-08-12 1:02 Andrew Morton
@ 2025-08-12 9:38 ` David Howells
2025-08-12 21:20 ` Dominique Martinet
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2025-08-12 9:38 UTC (permalink / raw)
To: Andrew Morton, asmadeus
Cc: dhowells, mm-commits, viro, stable, ryan, maximilian, ct, brauner,
arnout
Thinking about it again, I think this:
> @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
> break;
>
> fsize = folioq_folio_size(folioq, slot);
> + if (skip >= fsize)
> + goto next;
> base = kmap_local_folio(folio, skip);
> part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
> remain = step(base, progress, part, priv, priv2);
> @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
> progress += consumed;
> skip += consumed;
> if (skip >= fsize) {
> +next:
> skip = 0;
> slot++;
> if (slot == folioq_nr_slots(folioq) && folioq->next) {
Would be much better done as:
> @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
> break;
>
> fsize = folioq_folio_size(folioq, slot);
> + if (skip < fsize) {
> base = kmap_local_folio(folio, skip);
> part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
> remain = step(base, progress, part, priv, priv2);
> @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
> progress += consumed;
> skip += consumed;
> + }
> if (skip >= fsize) {
> skip = 0;
> slot++;
> if (slot == folioq_nr_slots(folioq) && folioq->next) {
With the stuff inside the braces suitably indented. The compiler should be
able to optimise away the extra comparison.
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
2025-08-12 9:38 ` David Howells
@ 2025-08-12 21:20 ` Dominique Martinet
2025-08-12 21:38 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2025-08-12 21:20 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, mm-commits, viro, stable, ryan, maximilian, ct,
brauner, arnout
David Howells wrote on Tue, Aug 12, 2025 at 10:38:07AM +0100:
> > @@ -168,6 +168,8 @@ size_t iterate_folioq(struct iov_iter *i
> > break;
> >
> > fsize = folioq_folio_size(folioq, slot);
> > + if (skip < fsize) {
> > base = kmap_local_folio(folio, skip);
> > part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
> > remain = step(base, progress, part, priv, priv2);
> > @@ -177,6 +179,7 @@ size_t iterate_folioq(struct iov_iter *i
> > progress += consumed;
> > skip += consumed;
> > + }
> > if (skip >= fsize) {
> > skip = 0;
> > slot++;
> > if (slot == folioq_nr_slots(folioq) && folioq->next) {
>
> With the stuff inside the braces suitably indented. The compiler should be
> able to optimise away the extra comparison.
skip is modified in the first if so I don't see how the compiler could
optimize it.
I just checked and at least iov_iter.o is slightly bigger with a second if:
(a.o = goto, b.o = if)
06:17:52 asmadeus@thor 0 ~/code/linux/bb$ size a.o b.o
text data bss dec hex filename
26923 1104 0 28027 6d7b a.o
27019 1104 0 28123 6ddb b.o
but honestly I'm happy to focus on readability here -- if you think two
if are easier to read I'll be happy to send a v3
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
2025-08-12 21:20 ` Dominique Martinet
@ 2025-08-12 21:38 ` David Howells
2025-08-13 6:00 ` Dominique Martinet
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2025-08-12 21:38 UTC (permalink / raw)
To: Dominique Martinet
Cc: dhowells, Andrew Morton, mm-commits, viro, stable, ryan,
maximilian, ct, brauner, arnout
Dominique Martinet <asmadeus@codewreck.org> wrote:
> skip is modified in the first if so I don't see how the compiler could
> optimize it.
I mean that if you have:
if (skip < fsize) {
...
}
if (skip >= fsize) {
...
}
then the compiler should be able to take the second check as true in the case
where the first fails.
> 26923 1104 0 28027 6d7b a.o
> 27019 1104 0 28123 6ddb b.o
That's a surprisingly large change.
> but honestly I'm happy to focus on readability here -- if you think two
> if are easier to read I'll be happy to send a v3
I must admit I dislike goto jumping *in* to a braced section. It's not even
allowed in C++ and Java, IIRC.
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
2025-08-12 21:38 ` David Howells
@ 2025-08-13 6:00 ` Dominique Martinet
0 siblings, 0 replies; 6+ messages in thread
From: Dominique Martinet @ 2025-08-13 6:00 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, mm-commits, viro, stable, ryan, maximilian, ct,
brauner, arnout
David Howells wrote on Tue, Aug 12, 2025 at 10:38:31PM +0100:
> > 26923 1104 0 28027 6d7b a.o
> > 27019 1104 0 28123 6ddb b.o
>
> That's a surprisingly large change.
Right, because the function is inlined multiple times in iov_iter.o it
turned out rather big... Here's what it looks like from busybox's
bloat-o-meter, which gives a better picture:
function old new delta
iov_iter_zero 1706 1738 +32
copy_page_to_iter_nofault 2491 2512 +21
_copy_mc_to_iter 1604 1624 +20
copy_folio_from_iter_atomic 2620 2633 +13
_copy_from_iter_nocache 1706 1719 +13
_copy_from_iter_flushcache 1409 1422 +13
_copy_to_iter 1885 1891 +6
_copy_from_iter 1874 1880 +6
iov_iter_extract_pages 2182 2166 -16
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 8/1 up/down: 124/-16) Total: 108 bytes
So the compiler obviously doesn't optimize the if (at least with
whatever default flags I'm using on gcc 15.2.1), but the impact is only
that big because it's copied so many times.
Anyway, as said before I'm happy to prioritize readability here, so
sending a v3 with just this changed.
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 6+ messages in thread
* + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch
@ 2025-08-13 23:15 Andrew Morton
0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-08-13 23:15 UTC (permalink / raw)
To: mm-commits, willy, viro, stable, ryan, maximilian, dhowells, ct,
brauner, arnout, asmadeus, akpm
The patch titled
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Dominique Martinet <asmadeus@codewreck.org>
Subject: iov_iter: iterate_folioq: fix handling of offset >= folio size
Date: Wed, 13 Aug 2025 15:04:55 +0900
It's apparently possible to get an iov advanced all the way up to the end
of the current page we're looking at, e.g.
(gdb) p *iter
$24 = {iter_type = 4 '\004', nofault = false, data_source = false, iov_offset = 4096, {__ubuf_iovec = {
iov_base = 0xffff88800f5bc000, iov_len = 655}, {{__iov = 0xffff88800f5bc000, kvec = 0xffff88800f5bc000,
bvec = 0xffff88800f5bc000, folioq = 0xffff88800f5bc000, xarray = 0xffff88800f5bc000,
ubuf = 0xffff88800f5bc000}, count = 655}}, {nr_segs = 2, folioq_slot = 2 '\002', xarray_start = 2}}
Where iov_offset is 4k with 4k-sized folios
This should have been fine because we're only in the 2nd slot and there's
another one after this, but iterate_folioq should not try to map a folio
that skips the whole size, and more importantly part here does not end up
zero (because 'PAGE_SIZE - skip % PAGE_SIZE' ends up PAGE_SIZE and not
zero..), so skip forward to the "advance to next folio" code
Link: https://lkml.kernel.org/r/20250813-iot_iter_folio-v3-0-a0ffad2b665a@codewreck.org
Link: https://lkml.kernel.org/r/20250813-iot_iter_folio-v3-1-a0ffad2b665a@codewreck.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Reported-by: Maximilian Bosch <maximilian@mbosch.me>
Reported-by: Ryan Lahfa <ryan@lahfa.xyz>
Reported-by: Christian Theune <ct@flyingcircus.io>
Reported-by: Arnout Engelen <arnout@bzzt.net>
Link: https://lkml.kernel.org/r/D4LHHUNLG79Y.12PI0X6BEHRHW@mbosch.me/
Acked-by: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org> [6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/iov_iter.h | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
--- a/include/linux/iov_iter.h~iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size
+++ a/include/linux/iov_iter.h
@@ -160,7 +160,7 @@ size_t iterate_folioq(struct iov_iter *i
do {
struct folio *folio = folioq_folio(folioq, slot);
- size_t part, remain, consumed;
+ size_t part, remain = 0, consumed;
size_t fsize;
void *base;
@@ -168,14 +168,16 @@ size_t iterate_folioq(struct iov_iter *i
break;
fsize = folioq_folio_size(folioq, slot);
- base = kmap_local_folio(folio, skip);
- part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
- remain = step(base, progress, part, priv, priv2);
- kunmap_local(base);
- consumed = part - remain;
- len -= consumed;
- progress += consumed;
- skip += consumed;
+ if (skip < fsize) {
+ base = kmap_local_folio(folio, skip);
+ part = umin(len, PAGE_SIZE - skip % PAGE_SIZE);
+ remain = step(base, progress, part, priv, priv2);
+ kunmap_local(base);
+ consumed = part - remain;
+ len -= consumed;
+ progress += consumed;
+ skip += consumed;
+ }
if (skip >= fsize) {
skip = 0;
slot++;
_
Patches currently in -mm which might be from asmadeus@codewreck.org are
iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch
iov_iter-iov_folioq_get_pages-dont-leave-empty-slot-behind.patch
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-08-13 23:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-13 23:15 + iov_iter-iterate_folioq-fix-handling-of-offset-=-folio-size.patch added to mm-hotfixes-unstable branch Andrew Morton
-- strict thread matches above, loose matches on Subject: below --
2025-08-12 1:02 Andrew Morton
2025-08-12 9:38 ` David Howells
2025-08-12 21:20 ` Dominique Martinet
2025-08-12 21:38 ` David Howells
2025-08-13 6:00 ` Dominique Martinet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox