* [PATCH 1/3] bcachefs: Introduce bch2_splice_read
@ 2025-08-08 13:43 Alan Huang
2025-08-08 13:43 ` [PATCH 2/3] bcachefs: Use our own splice_read implementation Alan Huang
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Alan Huang @ 2025-08-08 13:43 UTC (permalink / raw)
To: kent.overstreet; +Cc: linux-bcachefs, Alan Huang
This provides our own splice read, which locks ei_pagecache_lock around
filemap_splice_read.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
fs/bcachefs/fs-io-buffered.c | 12 ++++++++++++
fs/bcachefs/fs-io-buffered.h | 4 ++++
2 files changed, 16 insertions(+)
diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
index fd8beb5167ee..c16c45a72c47 100644
--- a/fs/bcachefs/fs-io-buffered.c
+++ b/fs/bcachefs/fs-io-buffered.c
@@ -1100,6 +1100,18 @@ ssize_t bch2_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	return bch2_err_class(ret);
 }
 
+ssize_t bch2_splice_read(struct file *in, loff_t *ppos,
+			 struct pipe_inode_info *pipe,
+			 size_t len, unsigned int flags)
+{
+	ssize_t ret;
+	struct bch_inode_info *inode = to_bch_ei(in->f_mapping->host);
+	bch2_pagecache_add_get(inode);
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	bch2_pagecache_add_put(inode);
+	return ret;
+}
+
 void bch2_fs_fs_io_buffered_exit(struct bch_fs *c)
 {
 	bioset_exit(&c->writepage_bioset);
diff --git a/fs/bcachefs/fs-io-buffered.h b/fs/bcachefs/fs-io-buffered.h
index 3207ebbb4ab4..3bcbf62ad420 100644
--- a/fs/bcachefs/fs-io-buffered.h
+++ b/fs/bcachefs/fs-io-buffered.h
@@ -17,6 +17,10 @@ int bch2_write_end(struct file *, struct address_space *, loff_t,
ssize_t bch2_write_iter(struct kiocb *, struct iov_iter *);
+ssize_t bch2_splice_read(struct file *in, loff_t *ppos,
+			 struct pipe_inode_info *pipe,
+			 size_t len, unsigned int flags);
+
void bch2_fs_fs_io_buffered_exit(struct bch_fs *);
int bch2_fs_fs_io_buffered_init(struct bch_fs *);
#else
--
2.49.0
* [PATCH 2/3] bcachefs: Use our own splice_read implementation
2025-08-08 13:43 [PATCH 1/3] bcachefs: Introduce bch2_splice_read Alan Huang
@ 2025-08-08 13:43 ` Alan Huang
2025-08-08 13:43 ` [PATCH 3/3] bcachefs: Don't lock ei_pagecache_lock in bch2_readahead Alan Huang
2025-08-08 18:28 ` [PATCH 1/3] bcachefs: Introduce bch2_splice_read Kent Overstreet
2 siblings, 0 replies; 4+ messages in thread
From: Alan Huang @ 2025-08-08 13:43 UTC (permalink / raw)
To: kent.overstreet; +Cc: linux-bcachefs, Alan Huang
splice_read reads file data into the page cache. To prevent cache
coherency issues, we need to hold ei_pagecache_lock around it.
Therefore, use our own splice_read implementation.
This also fixes a deadlock: when bch2_readahead is invoked from the
splice_read path, it locks the folio first but then fails to take
ei_pagecache_lock, which is already held by the direct I/O path or the
fallocate path. When the DIO path then tries to invalidate the page
cache, a deadlock happens.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
fs/bcachefs/fs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c
index 0425238a83ee..6d8830345538 100644
--- a/fs/bcachefs/fs.c
+++ b/fs/bcachefs/fs.c
@@ -1716,7 +1716,7 @@ static const struct file_operations bch_file_operations = {
 	.mmap = bch2_mmap,
 	.get_unmapped_area = thp_get_unmapped_area,
 	.fsync = bch2_fsync,
-	.splice_read = filemap_splice_read,
+	.splice_read = bch2_splice_read,
 	.splice_write = iter_file_splice_write,
 	.fallocate = bch2_fallocate_dispatch,
 	.unlocked_ioctl = bch2_fs_file_ioctl,
--
2.49.0
* [PATCH 3/3] bcachefs: Don't lock ei_pagecache_lock in bch2_readahead
2025-08-08 13:43 [PATCH 1/3] bcachefs: Introduce bch2_splice_read Alan Huang
2025-08-08 13:43 ` [PATCH 2/3] bcachefs: Use our own splice_read implementation Alan Huang
@ 2025-08-08 13:43 ` Alan Huang
2025-08-08 18:28 ` [PATCH 1/3] bcachefs: Introduce bch2_splice_read Kent Overstreet
2 siblings, 0 replies; 4+ messages in thread
From: Alan Huang @ 2025-08-08 13:43 UTC (permalink / raw)
To: kent.overstreet; +Cc: linux-bcachefs, Alan Huang
The lock should already be acquired on all code paths.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
fs/bcachefs/fs-io-buffered.c | 15 ---------------
1 file changed, 15 deletions(-)
diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
index c16c45a72c47..01bc3e048542 100644
--- a/fs/bcachefs/fs-io-buffered.c
+++ b/fs/bcachefs/fs-io-buffered.c
@@ -70,15 +70,6 @@ static int readpages_iter_init(struct readpages_iter *iter,
 	return 0;
 }
-static void readpages_iter_exit(struct readpages_iter *iter,
-				struct readahead_control *ractl)
-{
-	darray_for_each_reverse(iter->folios, folio) {
-		readpages_iter_folio_revert(ractl, *folio);
-		folio_get(*folio);
-	}
-}
-
 static inline struct folio *readpage_iter_peek(struct readpages_iter *iter)
 {
 	if (iter->idx >= iter->folios.nr)
@@ -305,10 +296,6 @@ void bch2_readahead(struct readahead_control *ractl)
 	 * scheduling.
 	 */
 	blk_start_plug(&plug);
-	if (!bch2_pagecache_add_tryget(inode)) {
-		readpages_iter_exit(&readpages_iter, ractl);
-		goto out;
-	}
 	struct btree_trans *trans = bch2_trans_get(c);
 	while ((folio = readpage_iter_peek(&readpages_iter))) {
@@ -334,8 +321,6 @@ void bch2_readahead(struct readahead_control *ractl)
 	}
 	bch2_trans_put(trans);
-	bch2_pagecache_add_put(inode);
-out:
 	blk_finish_plug(&plug);
 	darray_exit(&readpages_iter.folios);
 }
--
2.49.0
* Re: [PATCH 1/3] bcachefs: Introduce bch2_splice_read
2025-08-08 13:43 [PATCH 1/3] bcachefs: Introduce bch2_splice_read Alan Huang
2025-08-08 13:43 ` [PATCH 2/3] bcachefs: Use our own splice_read implementation Alan Huang
2025-08-08 13:43 ` [PATCH 3/3] bcachefs: Don't lock ei_pagecache_lock in bch2_readahead Alan Huang
@ 2025-08-08 18:28 ` Kent Overstreet
2 siblings, 0 replies; 4+ messages in thread
From: Kent Overstreet @ 2025-08-08 18:28 UTC (permalink / raw)
To: Alan Huang; +Cc: linux-bcachefs
On Fri, Aug 08, 2025 at 09:43:10PM +0800, Alan Huang wrote:
> This provides our own splice read, which locks ei_pagecache_lock around
> filemap_splice_read.
>
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> ---
> fs/bcachefs/fs-io-buffered.c | 12 ++++++++++++
> fs/bcachefs/fs-io-buffered.h | 4 ++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
> index fd8beb5167ee..c16c45a72c47 100644
> --- a/fs/bcachefs/fs-io-buffered.c
> +++ b/fs/bcachefs/fs-io-buffered.c
> @@ -1100,6 +1100,18 @@ ssize_t bch2_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	return bch2_err_class(ret);
>  }
>
> +ssize_t bch2_splice_read(struct file *in, loff_t *ppos,
> +			 struct pipe_inode_info *pipe,
> +			 size_t len, unsigned int flags)
> +{
> +	ssize_t ret;
> +	struct bch_inode_info *inode = to_bch_ei(in->f_mapping->host);
> +	bch2_pagecache_add_get(inode);
> +	ret = filemap_splice_read(in, ppos, pipe, len, flags);
> +	bch2_pagecache_add_put(inode);
> +	return ret;
> +}
Taking the lock here, around the entire splice_read() call sucks - most
buffered reads (and this is the fastpath, which we care about) will be
reading from cache, i.e. they aren't adding new folios to the pagecache
and they don't need pagecache_add lock.
Meaning, we really don't want to add two new atomic ops on the inode to
every buffered read.
Does this help with lockdep at all? The main issue is still the fault
path vs. mmap_lock.