From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 11 Sep 2025 04:44:26 -0700
From: Christoph Hellwig
To: Joanne Koong
Cc: brauner@kernel.org, miklos@szeredi.hu, hch@infradead.org, djwong@kernel.org,
	hsiangkao@linux.alibaba.com, linux-block@vger.kernel.org, gfs2@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, kernel-team@meta.com, linux-xfs@vger.kernel.org,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v2 13/16] iomap: move read/readahead logic out of CONFIG_BLOCK guard
Message-ID:
References: <20250908185122.3199171-1-joannelkoong@gmail.com>
 <20250908185122.3199171-14-joannelkoong@gmail.com>
In-Reply-To: <20250908185122.3199171-14-joannelkoong@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Mon, Sep 08, 2025 at 11:51:19AM -0700, Joanne Koong wrote:
> There is no longer a dependency on CONFIG_BLOCK in the iomap read and
> readahead logic. Move this logic out of the CONFIG_BLOCK guard. This
> allows non-block-based filesystems to use iomap for reads/readahead.

Please move the bio code into a new file.  Example patch attached below
that does just that without addressing any of the previous comments:

diff --git a/fs/iomap/Makefile b/fs/iomap/Makefile
index f7e1c8534c46..a572b8808524 100644
--- a/fs/iomap/Makefile
+++ b/fs/iomap/Makefile
@@ -14,5 +14,6 @@ iomap-y				+= trace.o \
 iomap-$(CONFIG_BLOCK)		+= direct-io.o \
					ioend.o \
					fiemap.o \
-					seek.o
+					seek.o \
+					bio.o
 iomap-$(CONFIG_SWAP)		+= swapfile.o
diff --git a/fs/iomap/bio.c b/fs/iomap/bio.c
new file mode 100644
index 000000000000..bcb87441be9f
--- /dev/null
+++ b/fs/iomap/bio.c
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2010 Red Hat, Inc.
+ * Copyright (C) 2016-2023 Christoph Hellwig.
+ */
+#include
+#include
+#include "internal.h"
+#include "trace.h"
+
+static void iomap_read_end_io(struct bio *bio)
+{
+	int error = blk_status_to_errno(bio->bi_status);
+	struct folio_iter fi;
+
+	bio_for_each_folio_all(fi, bio)
+		iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error);
+	bio_put(bio);
+}
+
+static int iomap_submit_read_bio(struct iomap_read_folio_ctx *ctx)
+{
+	struct bio *bio = ctx->private;
+
+	if (bio)
+		submit_bio(bio);
+
+	return 0;
+}
+
+/**
+ * Read in a folio range asynchronously through bios.
+ *
+ * This should only be used for read/readahead, not for buffered writes.
+ * Buffered writes must read in the folio synchronously.
+ */
+static int iomap_read_folio_range_bio_async(const struct iomap_iter *iter,
+		struct iomap_read_folio_ctx *ctx, loff_t pos, size_t plen)
+{
+	struct folio *folio = ctx->cur_folio;
+	const struct iomap *iomap = &iter->iomap;
+	size_t poff = offset_in_folio(folio, pos);
+	loff_t length = iomap_length(iter);
+	sector_t sector;
+	struct bio *bio = ctx->private;
+
+	iomap_start_folio_read(folio, plen);
+
+	sector = iomap_sector(iomap, pos);
+	if (!bio || bio_end_sector(bio) != sector ||
+	    !bio_add_folio(bio, folio, plen, poff)) {
+		gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
+		gfp_t orig_gfp = gfp;
+		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
+
+		if (bio)
+			submit_bio(bio);
+
+		if (ctx->rac) /* same as readahead_gfp_mask */
+			gfp |= __GFP_NORETRY | __GFP_NOWARN;
+		bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs),
+				REQ_OP_READ, gfp);
+		/*
+		 * If the bio_alloc fails, try it again for a single page to
+		 * avoid having to deal with partial page reads.  This emulates
+		 * what do_mpage_read_folio does.
+		 */
+		if (!bio)
+			bio = bio_alloc(iomap->bdev, 1, REQ_OP_READ, orig_gfp);
+		if (ctx->rac)
+			bio->bi_opf |= REQ_RAHEAD;
+		bio->bi_iter.bi_sector = sector;
+		bio->bi_end_io = iomap_read_end_io;
+		bio_add_folio_nofail(bio, folio, plen, poff);
+		ctx->private = bio;
+	}
+	return 0;
+}
+
+const struct iomap_read_ops iomap_read_bios_ops = {
+	.read_folio_range	= iomap_read_folio_range_bio_async,
+	.read_submit		= iomap_submit_read_bio,
+};
+EXPORT_SYMBOL_GPL(iomap_read_bios_ops);
+
+int iomap_read_folio_range(const struct iomap_iter *iter,
+		struct folio *folio, loff_t pos, size_t len)
+{
+	const struct iomap *srcmap = iomap_iter_srcmap(iter);
+	struct bio_vec bvec;
+	struct bio bio;
+
+	bio_init(&bio, srcmap->bdev, &bvec, 1, REQ_OP_READ);
+	bio.bi_iter.bi_sector = iomap_sector(srcmap, pos);
+	bio_add_folio_nofail(&bio, folio, len, offset_in_folio(folio, pos));
+	return submit_bio_wait(&bio);
+}
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c424e8c157dd..48626c11f3d8 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include "internal.h"
 #include "trace.h"
 
 #include "../internal.h"
@@ -526,103 +527,6 @@ void iomap_readahead(const struct iomap_ops *ops,
 }
 EXPORT_SYMBOL_GPL(iomap_readahead);
 
-#ifdef CONFIG_BLOCK
-static void iomap_read_end_io(struct bio *bio)
-{
-	int error = blk_status_to_errno(bio->bi_status);
-	struct folio_iter fi;
-
-	bio_for_each_folio_all(fi, bio)
-		iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error);
-	bio_put(bio);
-}
-
-static int iomap_submit_read_bio(struct iomap_read_folio_ctx *ctx)
-{
-	struct bio *bio = ctx->private;
-
-	if (bio)
-		submit_bio(bio);
-
-	return 0;
-}
-
-/**
- * Read in a folio range asynchronously through bios.
- *
- * This should only be used for read/readahead, not for buffered writes.
- * Buffered writes must read in the folio synchronously.
- */
-static int iomap_read_folio_range_bio_async(const struct iomap_iter *iter,
-		struct iomap_read_folio_ctx *ctx, loff_t pos, size_t plen)
-{
-	struct folio *folio = ctx->cur_folio;
-	const struct iomap *iomap = &iter->iomap;
-	size_t poff = offset_in_folio(folio, pos);
-	loff_t length = iomap_length(iter);
-	sector_t sector;
-	struct bio *bio = ctx->private;
-
-	iomap_start_folio_read(folio, plen);
-
-	sector = iomap_sector(iomap, pos);
-	if (!bio || bio_end_sector(bio) != sector ||
-	    !bio_add_folio(bio, folio, plen, poff)) {
-		gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
-		gfp_t orig_gfp = gfp;
-		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
-
-		if (bio)
-			submit_bio(bio);
-
-		if (ctx->rac) /* same as readahead_gfp_mask */
-			gfp |= __GFP_NORETRY | __GFP_NOWARN;
-		bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs),
-				REQ_OP_READ, gfp);
-		/*
-		 * If the bio_alloc fails, try it again for a single page to
-		 * avoid having to deal with partial page reads.  This emulates
-		 * what do_mpage_read_folio does.
-		 */
-		if (!bio)
-			bio = bio_alloc(iomap->bdev, 1, REQ_OP_READ, orig_gfp);
-		if (ctx->rac)
-			bio->bi_opf |= REQ_RAHEAD;
-		bio->bi_iter.bi_sector = sector;
-		bio->bi_end_io = iomap_read_end_io;
-		bio_add_folio_nofail(bio, folio, plen, poff);
-		ctx->private = bio;
-	}
-	return 0;
-}
-
-const struct iomap_read_ops iomap_read_bios_ops = {
-	.read_folio_range	= iomap_read_folio_range_bio_async,
-	.read_submit		= iomap_submit_read_bio,
-};
-EXPORT_SYMBOL_GPL(iomap_read_bios_ops);
-
-static int iomap_read_folio_range(const struct iomap_iter *iter,
-		struct folio *folio, loff_t pos, size_t len)
-{
-	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	struct bio_vec bvec;
-	struct bio bio;
-
-	bio_init(&bio, srcmap->bdev, &bvec, 1, REQ_OP_READ);
-	bio.bi_iter.bi_sector = iomap_sector(srcmap, pos);
-	bio_add_folio_nofail(&bio, folio, len, offset_in_folio(folio, pos));
-	return submit_bio_wait(&bio);
-}
-#else
-static int iomap_read_folio_range(const struct iomap_iter *iter,
-		struct folio *folio, loff_t pos, size_t len)
-{
-	WARN_ON_ONCE(1);
-	return -EIO;
-}
-#endif /* CONFIG_BLOCK */
-
 /*
  * iomap_is_partially_uptodate checks whether blocks within a folio are
  * uptodate or not.
diff --git a/fs/iomap/internal.h b/fs/iomap/internal.h
index d05cb3aed96e..dc6e95c93f13 100644
--- a/fs/iomap/internal.h
+++ b/fs/iomap/internal.h
@@ -6,4 +6,17 @@
 u32 iomap_finish_ioend_direct(struct iomap_ioend *ioend);
 
+#ifdef CONFIG_BLOCK
+int iomap_read_folio_range(const struct iomap_iter *iter,
+		struct folio *folio, loff_t pos, size_t len);
+#else
+static int iomap_read_folio_range(const struct iomap_iter *iter,
+		struct folio *folio, loff_t pos, size_t len)
+{
+	WARN_ON_ONCE(1);
+	return -EIO;
+}
+#endif /* CONFIG_BLOCK */
+
+
 #endif /* _IOMAP_INTERNAL_H */
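
For context on the quoted commit message's point, here is a rough,
hypothetical sketch of how a non-block filesystem could plug into this
read path once it is no longer hidden behind CONFIG_BLOCK.  It assumes
the iomap_read_ops / iomap_read_folio_ctx definitions and the
iomap_start_folio_read()/iomap_finish_folio_read() helpers from this
series; the myfs_* names are made-up placeholders for a filesystem's own
transport, not anything in the patch above:

#include <linux/iomap.h>
#include <linux/pagemap.h>

/* Hypothetical transport helpers the filesystem would provide elsewhere. */
int myfs_send_read(struct inode *inode, struct folio *folio, loff_t pos,
		size_t len, struct iomap_read_folio_ctx *ctx);
int myfs_flush_reads(void *batch);

/*
 * Queue an asynchronous read for folio bytes [pos, pos + plen).  The
 * completion path is expected to call iomap_finish_folio_read() for the
 * same range with the final error, mirroring iomap_read_end_io() above.
 */
static int myfs_read_folio_range_async(const struct iomap_iter *iter,
		struct iomap_read_folio_ctx *ctx, loff_t pos, size_t plen)
{
	struct folio *folio = ctx->cur_folio;

	/* Account the range as under I/O, like the bio-based helper does. */
	iomap_start_folio_read(folio, plen);
	return myfs_send_read(iter->inode, folio, pos, plen, ctx);
}

/* Kick off whatever the filesystem batched in ctx->private, if anything. */
static int myfs_read_submit(struct iomap_read_folio_ctx *ctx)
{
	return myfs_flush_reads(ctx->private);
}

static const struct iomap_read_ops myfs_iomap_read_ops = {
	.read_folio_range	= myfs_read_folio_range_async,
	.read_submit		= myfs_read_submit,
};

Such an ops table would then be handed to the iomap read/readahead entry
points from the filesystem's ->read_folio and ->readahead methods, with
the exact call signatures coming from the rest of the series.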