All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Malte Schröder" <malte.schroeder@tnxip.de>,
	"Kent Overstreet" <kent.overstreet@linux.dev>,
	"Miklos Szeredi" <mszeredi@redhat.com>,
	"Joanne Koong" <joannelkoong@gmail.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: silent data corruption in fuse in rc1
Date: Mon, 9 Dec 2024 10:48:50 -0500	[thread overview]
Message-ID: <20241209154850.GA2843669@perftesting> (raw)
In-Reply-To: <Z1cMjlWfehN6ssRb@casper.infradead.org>

On Mon, Dec 09, 2024 at 03:28:14PM +0000, Matthew Wilcox wrote:
> On Mon, Dec 09, 2024 at 09:49:48AM -0500, Josef Bacik wrote:
> > > Ha! This time I bisected from f03b296e8b51 to d1dfb5f52ffc. I ended up
> > > with 3b97c3652d91 as the culprit.
> > 
> > Willy, I've looked at this code and it does indeed look like a 1:1 conversion,
> > EXCEPT I'm fuzzy about how how this works with large folios.  Previously, if we
> > got a hugepage in, we'd get each individual struct page back for the whole range
> > of the hugepage, so if for example we had a 2M hugepage, we'd fill in the
> > ->offset for each "middle" struct page as 0, since obviously we're consuming
> > PAGE_SIZE chunks at a time.
> > 
> > But now we're doing this
> > 
> > 	for (i = 0; i < nfolios; i++)
> > 		ap->folios[i + ap->num_folios] = page_folio(pages[i]);
> > 
> > So if userspace handed us a 2M hugepage, page_folio() on each of the
> > intermediary struct page's would return the same folio, correct?  So we'd end up
> > with the wrong offsets for our fuse request, because they should be based from
> > the start of the folio, correct?
> 
> I think you're 100% right.  We could put in some nice asserts to check
> this is what's happening, but it does seem like a rather incautious
> conversion.  Yes, all folios _in the page cache_ for fuse are small, but
> that's not guaranteed to be the case for folios found in userspace for
> directio.  At least the comment is wrong, and I'd suggest the code is too.

Ok cool, Malte can you try the attached only compile tested patch and see if the
problem goes away?  Thanks,

Josef

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 88d0946b5bc9..c4b93ead99a5 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1562,9 +1562,19 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii,
 		nfolios = DIV_ROUND_UP(ret, PAGE_SIZE);
 
 		ap->descs[ap->num_folios].offset = start;
-		fuse_folio_descs_length_init(ap->descs, ap->num_folios, nfolios);
-		for (i = 0; i < nfolios; i++)
-			ap->folios[i + ap->num_folios] = page_folio(pages[i]);
+		for (i = 0; i < nfolios; i++) {
+			struct folio *folio = page_folio(pages[i]);
+			unsigned int offset = start +
+				(folio_page_idx(folio, pages[i]) << PAGE_SHIFT);
+			unsigned int len = min_t(unsigned int, ret, folio_size(folio) - offset);
+
+			len = min_t(unsigned int, len, PAGE_SIZE);
+
+			ap->descs[ap->num_folios + i].offset = offset;
+			ap->descs[ap->num_folios + i].length = len;
+			ap->folios[i + ap->num_folios] = folio;
+			start = 0;
+		}
 
 		ap->num_folios += nfolios;
 		ap->descs[ap->num_folios - 1].length -=

  reply	other threads:[~2024-12-09 15:48 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-07 21:58 silent data corruption in fuse in rc1 Kent Overstreet
2024-12-07 23:01 ` Malte Schröder
2024-12-07 23:39   ` Kent Overstreet
2024-12-08  0:11     ` Malte Schröder
2024-12-08  1:23   ` Matthew Wilcox
2024-12-08 20:02     ` Malte Schröder
2024-12-08 22:32       ` Malte Schröder
2024-12-09  1:57         ` Jingbo Xu
2024-12-09  6:42           ` Malte Schröder
2024-12-09  8:06             ` Bernd Schubert
2024-12-09  9:07               ` Malte Schröder
2024-12-09 13:06                 ` Bernd Schubert
2024-12-09 13:45                   ` Malte Schröder
2024-12-09 14:49         ` Josef Bacik
2024-12-09 15:28           ` Matthew Wilcox
2024-12-09 15:48             ` Josef Bacik [this message]
2024-12-09 17:07               ` Malte Schröder
2024-12-09 18:47                 ` Joanne Koong
2024-12-09 19:52                   ` Joanne Koong
2024-12-10  5:14                     ` Joanne Koong
2024-12-10 16:54                       ` Kent Overstreet
2024-12-10 17:08                         ` Matthew Wilcox
2024-12-10 17:43                           ` Kent Overstreet
2024-12-10 17:31                         ` Josef Bacik
2024-12-10 17:48                           ` Kent Overstreet
2024-12-10 17:53                       ` Malte Schröder
2024-12-11  2:57                         ` Joanne Koong
2024-12-11  6:43                           ` Kent Overstreet
2024-12-11 20:32                             ` Joanne Koong
2024-12-11 14:01                           ` Malte Schröder
2024-12-11 19:43                             ` Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241209154850.GA2843669@perftesting \
    --to=josef@toxicpanda.com \
    --cc=joannelkoong@gmail.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=malte.schroeder@tnxip.de \
    --cc=mszeredi@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.