From: Christoph Hellwig <hch@lst.de>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>,
cem@kernel.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 11/11] xfs: add media verification ioctl
Date: Wed, 14 Jan 2026 07:02:14 +0100
Message-ID: <20260114060214.GA10372@lst.de>
In-Reply-To: <20260113232113.GD15551@frogsfrogsfrogs>

On Tue, Jan 13, 2026 at 03:21:13PM -0800, Darrick J. Wong wrote:
> > > +#define XFS_VERIFY_TO_EOD (~0ULL) /* end of disk */
> >
> > Is there much of a point in this flag? scrub/healer really should
> > know the device size, shouldn't they?
>
> Yes, scrub and healer both know the size they want to verify. I put
> that in for the sake of xfs_io so that it wouldn't have to figure out
> the device size, but as the ioctl always decreases @end_daddr to the
> actual EOD, I think it'd be ok if xfs_io blindly wrote in ~0ULL.
That's the best of both worlds.
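
A minimal userspace sketch of that clamping, assuming addresses are counted in 512-byte sectors. The helper name clamp_end_daddr() and the dev_sectors parameter are hypothetical and only illustrate the behaviour described above; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel from the quoted patch: "verify up to the end of the disk". */
#define XFS_VERIFY_TO_EOD	(~0ULL)

/* The sentinel is simply the largest 64-bit value. */
static_assert(XFS_VERIFY_TO_EOD == UINT64_MAX, "sentinel must be all ones");

/*
 * Hypothetical helper (not from the patch): lower a caller-supplied end
 * address to the device size.  Because the sentinel is the largest
 * possible value, it needs no special case and always clamps to EOD.
 */
static uint64_t clamp_end_daddr(uint64_t end_daddr, uint64_t dev_sectors)
{
	return end_daddr > dev_sectors ? dev_sectors : end_daddr;
}
```

With this shape a caller such as xfs_io can pass XFS_VERIFY_TO_EOD without knowing the device size, while scrub and healer can pass an exact end address.
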
> > > + const unsigned int iosize = BIO_MAX_VECS << PAGE_SHIFT;
> > > + unsigned int bufsize = iosize;
> >
> > That's a pretty gigantic buffer size. In general a low number of
> > MB should max out most current devices, and for a background scrub
> > you generally do not want to actually max out the device.
>
> 256 * 4k (= 1MB) is too large a buffer?
No, my reading comprehension just sucks :) And of course the way
it's written isn't very helpful either.
> I guess that /is/ 16M on a 64k-page system.
Yeah, just stick to SZ_1M.
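
The arithmetic behind that, as a small userspace sketch. BIO_MAX_VECS and SZ_1M are redefined locally so the snippet stands alone; the two function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BIO_MAX_VECS	256U		/* the 256 used in the arithmetic above */
#define SZ_1M		(1UL << 20)

/* As posted: the buffer size scales with the page size. */
static size_t iosize_from_page_shift(unsigned int page_shift)
{
	assert(page_shift < 24);	/* keep the shift well inside size_t */
	return (size_t)BIO_MAX_VECS << page_shift;
}

/* Suggested alternative: a fixed 1 MiB, independent of the page size. */
static size_t iosize_fixed(void)
{
	return SZ_1M;
}
```

A page shift of 12 (4k pages) gives 1 MiB, and a page shift of 16 (64k pages) gives 16 MiB, which is why a fixed SZ_1M is the more predictable choice.
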
> > > + min(nr_sects, bufsize >> SECTOR_SHIFT);
> > > +
> > > + bio_add_folio_nofail(bio, folio,
> > > + vec_sects << SECTOR_SHIFT, 0);
> > > +
> > > + bio_daddr += vec_sects;
> > > + bio_bbcount -= vec_sects;
> > > + bio_submitted += vec_sects;
> > > + }
> >
> > A single folio is always just a single vector in the bio. No need
> > for any of the looping here.
>
> If we have to fall back to a single base page, shouldn't we still try to
> create a larger bio?
How do you create a larger bio if you only have a single page available?
> A subtle assumption here is that it's ok to have
> all the bvecs pointing to the same memory, and that the device won't
> screw up if someone asks it to DMA to the same page simultaneously.
Ooooh. Yes, that will screw up badly when using PI.
> > > + /* Don't let too many IOs accumulate */
> > > + if (bio_submitted > SZ_256M >> SECTOR_SHIFT) {
> > > + blk_finish_plug(&plug);
> > > + error = submit_bio_wait(bio);
> >
> > Also the building up and chaining here seems harmful. If you're
> > on SSDs you want to fire things off ASAP if you have large I/O.
> > On a HDD we'll take care of it below, but the bios will usually
> > actually be split, not merged anyway as they are beyond the
> > supported I/O size of the HBAs.
>
> Hrm, maybe I should query the block device for max_sectors_kb then?
No. max_sectors_kb is kind of stupid. I think a sensible default and
a tunable is a better choice here at least for now.
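
For scale, a userspace sketch of the throttle check from the quoted hunk with the constant turned into a parameter. The should_wait() helper and its max_inflight_bytes argument are hypothetical stand-ins for such a tunable, not an interface proposed anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT	9
#define SZ_256M		(256ULL << 20)

/* The limit in the quoted hunk works out to 524288 sectors. */
static_assert((SZ_256M >> SECTOR_SHIFT) == 524288, "256 MiB in sectors");

/*
 * Hypothetical helper: decide whether to stop and wait for outstanding
 * I/O.  As in the quoted condition, >> binds tighter than >, so the
 * byte limit is converted to sectors before the comparison.
 */
static bool should_wait(uint64_t submitted_sectors,
			uint64_t max_inflight_bytes)
{
	return submitted_sectors > max_inflight_bytes >> SECTOR_SHIFT;
}
```

Passing SZ_256M reproduces the behaviour of the posted check; a smaller value makes the loop wait more often.
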
> However, in the case where memory is fragmented and we can only get
> (say) a single base page, it'll still try to load up the bio with as
> many vecs as it can to try to keep the io size large, because issuing
> 256x 4k IOs is a lot slower than issuing 1x 1M IO with the same page
> added 256 times.
Yeah. But seriously, the MM is pretty good and is getting better
at finding large allocations. We need to start relying on that.
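
To put numbers on that trade-off, a small userspace sketch that counts the I/Os needed to cover a range when each bio carries the folio exactly once. nr_ios() is a hypothetical helper for illustration, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of I/Os needed to cover total_bytes when
 * each I/O carries a single folio of folio_bytes (rounded up).
 */
static size_t nr_ios(size_t total_bytes, size_t folio_bytes)
{
	assert(folio_bytes > 0);
	return (total_bytes + folio_bytes - 1) / folio_bytes;
}
```

Covering 1 MiB takes 256 I/Os with a 4k folio, 16 with a 64k folio and a single I/O with a 1 MiB folio, so the cost of the single-page fallback depends directly on how often the large allocation fails.
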
> I wonder if nr_vecs ought to be capped by queue_max_segments?
No, leave all that splitting to the block layer. max_segments is
an implementation detail.