From: Jan Kara <jack@suse.cz>
To: Zhang Yi <yi.zhang@huawei.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, cluster-devel@redhat.com,
ntfs3@lists.linux.dev, ocfs2-devel@oss.oracle.com,
reiserfs-devel@vger.kernel.org, jack@suse.cz, tytso@mit.edu,
akpm@linux-foundation.org, axboe@kernel.dk,
viro@zeniv.linux.org.uk, rpeterso@redhat.com,
agruenba@redhat.com, almaz.alexandrovich@paragon-software.com,
mark@fasheh.com, dushistov@mail.ru, hch@infradead.org,
chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [PATCH v2 02/14] fs/buffer: add some new buffer read helpers
Date: Thu, 1 Sep 2022 17:44:09 +0200 [thread overview]
Message-ID: <20220901154409.dcyvtknzzdjhuzas@quack3> (raw)
In-Reply-To: <20220901133505.2510834-3-yi.zhang@huawei.com>
On Thu 01-09-22 21:34:53, Zhang Yi wrote:
> Current ll_rw_block() helper is fragile because it assumes that locked
> buffer means it's under IO which is submitted by some other who holds
> the lock, it skip buffer if it failed to get the lock, so it's only
> safe on the readahead path. Unfortunately, now that most filesystems
> still use this helper mistakenly on the sync metadata read path. There
> is no guarantee that the one who holds the buffer lock always submit IO
> (e.g. buffer_migrate_folio_norefs() after commit 88dbcbb3a484 ("blkdev:
> avoid migration stalls for blkdev pages"), it could lead to false
> positive -EIO when submitting reading IO.
>
> This patch add some friendly buffer read helpers to prepare replacing
> ll_rw_block() and similar calls. We can only call bh_readahead_[]
> helpers for the readahead paths.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/buffer.c | 65 +++++++++++++++++++++++++++++++++++++
> include/linux/buffer_head.h | 38 ++++++++++++++++++++++
> 2 files changed, 103 insertions(+)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index a0b70b3239f3..a6bc769e665d 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -3017,6 +3017,71 @@ int bh_uptodate_or_lock(struct buffer_head *bh)
> }
> EXPORT_SYMBOL(bh_uptodate_or_lock);
>
> +/**
> + * __bh_read - Submit read for a locked buffer
> + * @bh: struct buffer_head
> + * @op_flags: appending REQ_OP_* flags besides REQ_OP_READ
> + * @wait: wait until reading finish
> + *
> + * Returns zero on success or don't wait, and -EIO on error.
> + */
> +int __bh_read(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
> +{
> + int ret = 0;
> +
> + BUG_ON(!buffer_locked(bh));
> +
> + get_bh(bh);
> + bh->b_end_io = end_buffer_read_sync;
> + submit_bh(REQ_OP_READ | op_flags, bh);
> + if (wait) {
> + wait_on_buffer(bh);
> + if (!buffer_uptodate(bh))
> + ret = -EIO;
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__bh_read);
> +
> +/**
> + * __bh_read_batch - Submit read for a batch of unlocked buffers
> + * @nr: entry number of the buffer batch
> + * @bhs: a batch of struct buffer_head
> + * @op_flags: appending REQ_OP_* flags besides REQ_OP_READ
> + * @force_lock: force to get a lock on the buffer if set, otherwise drops any
> + * buffer that cannot lock.
> + *
> + * Returns zero on success or don't wait, and -EIO on error.
> + */
> +void __bh_read_batch(int nr, struct buffer_head *bhs[],
> + blk_opf_t op_flags, bool force_lock)
> +{
> + int i;
> +
> + for (i = 0; i < nr; i++) {
> + struct buffer_head *bh = bhs[i];
> +
> + if (buffer_uptodate(bh))
> + continue;
> +
> + if (force_lock)
> + lock_buffer(bh);
> + else
> + if (!trylock_buffer(bh))
> + continue;
> +
> + if (buffer_uptodate(bh)) {
> + unlock_buffer(bh);
> + continue;
> + }
> +
> + bh->b_end_io = end_buffer_read_sync;
> + get_bh(bh);
> + submit_bh(REQ_OP_READ | op_flags, bh);
> + }
> +}
> +EXPORT_SYMBOL(__bh_read_batch);
> +
> /**
> * bh_submit_read - Submit a locked buffer for reading
> * @bh: struct buffer_head
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index c3863c417b00..6d09785bed9f 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -232,6 +232,9 @@ void write_boundary_block(struct block_device *bdev,
> sector_t bblock, unsigned blocksize);
> int bh_uptodate_or_lock(struct buffer_head *bh);
> int bh_submit_read(struct buffer_head *bh);
> +int __bh_read(struct buffer_head *bh, blk_opf_t op_flags, bool wait);
> +void __bh_read_batch(int nr, struct buffer_head *bhs[],
> + blk_opf_t op_flags, bool force_lock);
>
> extern int buffer_heads_over_limit;
>
> @@ -399,6 +402,41 @@ static inline struct buffer_head *__getblk(struct block_device *bdev,
> return __getblk_gfp(bdev, block, size, __GFP_MOVABLE);
> }
>
> +static inline void bh_readahead(struct buffer_head *bh, blk_opf_t op_flags)
> +{
> + if (!buffer_uptodate(bh) && trylock_buffer(bh)) {
> + if (!buffer_uptodate(bh))
> + __bh_read(bh, op_flags, false);
> + else
> + unlock_buffer(bh);
> + }
> +}
> +
> +static inline void bh_read_nowait(struct buffer_head *bh, blk_opf_t op_flags)
> +{
> + if (!bh_uptodate_or_lock(bh))
> + __bh_read(bh, op_flags, false);
> +}
> +
> +/* Returns 1 if buffer uptodated, 0 on success, and -EIO on error. */
> +static inline int bh_read(struct buffer_head *bh, blk_opf_t op_flags)
> +{
> + if (bh_uptodate_or_lock(bh))
> + return 1;
> + return __bh_read(bh, op_flags, true);
> +}
> +
> +static inline void bh_read_batch(int nr, struct buffer_head *bhs[])
> +{
> + __bh_read_batch(nr, bhs, 0, true);
> +}
> +
> +static inline void bh_readahead_batch(int nr, struct buffer_head *bhs[],
> + blk_opf_t op_flags)
> +{
> + __bh_read_batch(nr, bhs, op_flags, false);
> +}
> +
> /**
> * __bread() - reads a specified block and returns the bh
> * @bdev: the block_device to read from
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2022-09-01 15:44 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-01 13:34 [PATCH v2 00/14] fs/buffer: remove ll_rw_block() Zhang Yi
2022-09-01 13:34 ` [PATCH v2 01/14] fs/buffer: remove __breadahead_gfp() Zhang Yi
2022-09-05 5:52 ` Christoph Hellwig
2022-09-01 13:34 ` [PATCH v2 02/14] fs/buffer: add some new buffer read helpers Zhang Yi
2022-09-01 15:44 ` Jan Kara [this message]
2022-09-05 5:52 ` Christoph Hellwig
2022-09-01 13:34 ` [PATCH v2 03/14] fs/buffer: replace ll_rw_block() Zhang Yi
2022-09-01 15:45 ` Jan Kara
2022-09-05 5:53 ` Christoph Hellwig
2022-09-01 13:34 ` [PATCH v2 04/14] gfs2: " Zhang Yi
2022-09-01 15:46 ` Jan Kara
2022-09-05 5:54 ` Christoph Hellwig
2022-09-05 14:10 ` Andreas Gruenbacher
2022-09-01 13:34 ` [PATCH v2 05/14] isofs: " Zhang Yi
2022-09-05 5:54 ` Christoph Hellwig
2022-09-01 13:34 ` [PATCH v2 06/14] jbd2: " Zhang Yi
2022-09-01 15:50 ` Jan Kara
2022-09-05 5:54 ` Christoph Hellwig
2022-09-05 8:23 ` Theodore Ts'o
2022-09-01 13:34 ` [PATCH v2 07/14] ntfs3: " Zhang Yi
2022-09-01 15:47 ` Jan Kara
2022-09-05 5:54 ` Christoph Hellwig
2022-09-01 13:34 ` [PATCH v2 08/14] ocfs2: " Zhang Yi
2022-09-01 15:51 ` Jan Kara
2022-09-05 5:55 ` Christoph Hellwig
2022-09-06 1:06 ` [Ocfs2-devel] " Joseph Qi
2022-09-01 13:35 ` [PATCH v2 09/14] reiserfs: " Zhang Yi
2022-09-01 15:52 ` Jan Kara
2022-09-05 5:55 ` Christoph Hellwig
2022-09-01 13:35 ` [PATCH v2 10/14] udf: " Zhang Yi
2022-09-01 15:58 ` Jan Kara
2022-09-05 5:55 ` Christoph Hellwig
2022-09-01 13:35 ` [PATCH v2 11/14] ufs: " Zhang Yi
2022-09-01 15:59 ` Jan Kara
2022-09-05 5:56 ` Christoph Hellwig
2022-09-01 13:35 ` [PATCH v2 12/14] fs/buffer: remove ll_rw_block() helper Zhang Yi
2022-09-05 5:56 ` Christoph Hellwig
2022-09-01 13:35 ` [PATCH v2 13/14] ext2: replace bh_submit_read() helper with bh_read_locked() Zhang Yi
2022-09-01 15:59 ` Jan Kara
2022-09-02 0:30 ` Al Viro
2022-09-02 1:32 ` Zhang Yi
2022-09-02 1:51 ` Al Viro
2022-09-02 1:58 ` Zhang Yi
2022-09-05 5:56 ` Christoph Hellwig
2022-09-01 13:35 ` [PATCH v2 14/14] fs/buffer: remove bh_submit_read() helper Zhang Yi
2022-09-05 5:57 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220901154409.dcyvtknzzdjhuzas@quack3 \
--to=jack@suse.cz \
--cc=agruenba@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=almaz.alexandrovich@paragon-software.com \
--cc=axboe@kernel.dk \
--cc=chengzhihao1@huawei.com \
--cc=cluster-devel@redhat.com \
--cc=dushistov@mail.ru \
--cc=hch@infradead.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark@fasheh.com \
--cc=ntfs3@lists.linux.dev \
--cc=ocfs2-devel@oss.oracle.com \
--cc=reiserfs-devel@vger.kernel.org \
--cc=rpeterso@redhat.com \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=yi.zhang@huawei.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox