linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups
       [not found] <20250429235006.536648-1-sashal@kernel.org>
@ 2025-04-29 23:49 ` Sasha Levin
  2025-04-30  0:47   ` Luis Chamberlain
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 27/39] fs/buffer: introduce sleeping flavors " Sasha Levin
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 28/39] fs/buffer: use sleeping version of __find_get_block() Sasha Levin
  2 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2025-04-29 23:49 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Davidlohr Bueso, Jan Kara, kdevops, Luis Chamberlain,
	Christian Brauner, Sasha Levin, viro, linux-fsdevel

From: Davidlohr Bueso <dave@stgolabs.net>

[ Upstream commit 7ffe3de53a885dbb5836541c2178bd07d1bad7df ]

Callers of __find_get_block() may or may not allow for blocking
semantics, and is currently assumed that it will not. Layout
two paths based on this. The the private_lock scheme will
continued to be used for atomic contexts. Otherwise take the
folio lock instead, which protects the buffers, such as
vs migration and try_to_free_buffers().

Per the "hack idea", the latter can alleviate contention on
the private_lock for bdev mappings. For reasons of determinism
and avoid making bugs hard to reproduce, the trylocking is not
attempted.

No change in semantics. All lookup users still take the spinlock.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://kdevops.org/ext4/v6.15-rc2.html # [0]
Link: https://lore.kernel.org/all/aAAEvcrmREWa1SKF@bombadil.infradead.org/ # [1]
Link: https://lore.kernel.org/20250418015921.132400-2-dave@stgolabs.net
Tested-by: kdevops@lists.linux.dev
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/buffer.c | 41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index cc8452f602516..a03c245022dcf 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -176,18 +176,8 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 }
 EXPORT_SYMBOL(end_buffer_write_sync);
 
-/*
- * Various filesystems appear to want __find_get_block to be non-blocking.
- * But it's the page lock which protects the buffers.  To get around this,
- * we get exclusion from try_to_free_buffers with the blockdev mapping's
- * i_private_lock.
- *
- * Hack idea: for the blockdev mapping, i_private_lock contention
- * may be quite high.  This code could TryLock the page, and if that
- * succeeds, there is no need to take i_private_lock.
- */
 static struct buffer_head *
-__find_get_block_slow(struct block_device *bdev, sector_t block)
+__find_get_block_slow(struct block_device *bdev, sector_t block, bool atomic)
 {
 	struct address_space *bd_mapping = bdev->bd_mapping;
 	const int blkbits = bd_mapping->host->i_blkbits;
@@ -204,7 +194,16 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 	if (IS_ERR(folio))
 		goto out;
 
-	spin_lock(&bd_mapping->i_private_lock);
+	/*
+	 * Folio lock protects the buffers. Callers that cannot block
+	 * will fallback to serializing vs try_to_free_buffers() via
+	 * the i_private_lock.
+	 */
+	if (atomic)
+		spin_lock(&bd_mapping->i_private_lock);
+	else
+		folio_lock(folio);
+
 	head = folio_buffers(folio);
 	if (!head)
 		goto out_unlock;
@@ -236,7 +235,10 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 		       1 << blkbits);
 	}
 out_unlock:
-	spin_unlock(&bd_mapping->i_private_lock);
+	if (atomic)
+		spin_unlock(&bd_mapping->i_private_lock);
+	else
+		folio_unlock(folio);
 	folio_put(folio);
 out:
 	return ret;
@@ -1388,14 +1390,15 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
  * it in the LRU and mark it as accessed.  If it is not present then return
  * NULL
  */
-struct buffer_head *
-__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
+static struct buffer_head *
+find_get_block_common(struct block_device *bdev, sector_t block,
+			unsigned size, bool atomic)
 {
 	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);
 
 	if (bh == NULL) {
 		/* __find_get_block_slow will mark the page accessed */
-		bh = __find_get_block_slow(bdev, block);
+		bh = __find_get_block_slow(bdev, block, atomic);
 		if (bh)
 			bh_lru_install(bh);
 	} else
@@ -1403,6 +1406,12 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
 
 	return bh;
 }
+
+struct buffer_head *
+__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
+{
+	return find_get_block_common(bdev, block, size, true);
+}
 EXPORT_SYMBOL(__find_get_block);
 
 /**
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.14 27/39] fs/buffer: introduce sleeping flavors for pagecache lookups
       [not found] <20250429235006.536648-1-sashal@kernel.org>
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups Sasha Levin
@ 2025-04-29 23:49 ` Sasha Levin
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 28/39] fs/buffer: use sleeping version of __find_get_block() Sasha Levin
  2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-04-29 23:49 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Davidlohr Bueso, Jan Kara, kdevops, Luis Chamberlain,
	Christian Brauner, Sasha Levin, viro, willy, akpm, josef, mhocko,
	linux-fsdevel

From: Davidlohr Bueso <dave@stgolabs.net>

[ Upstream commit 2814a7d3d2ff5d2cdd22936f641f758fdb971fa0 ]

Add __find_get_block_nonatomic() and sb_find_get_block_nonatomic()
calls for which users will be converted where safe. These versions
will take the folio lock instead of the mapping's private_lock.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://kdevops.org/ext4/v6.15-rc2.html # [0]
Link: https://lore.kernel.org/all/aAAEvcrmREWa1SKF@bombadil.infradead.org/ # [1]
Link: https://lore.kernel.org/20250418015921.132400-3-dave@stgolabs.net
Tested-by: kdevops@lists.linux.dev
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/buffer.c                 | 9 +++++++++
 include/linux/buffer_head.h | 8 ++++++++
 2 files changed, 17 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index a03c245022dcf..7981097c846d4 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1414,6 +1414,15 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
 }
 EXPORT_SYMBOL(__find_get_block);
 
+/* same as __find_get_block() but allows sleeping contexts */
+struct buffer_head *
+__find_get_block_nonatomic(struct block_device *bdev, sector_t block,
+			   unsigned size)
+{
+	return find_get_block_common(bdev, block, size, false);
+}
+EXPORT_SYMBOL(__find_get_block_nonatomic);
+
 /**
  * bdev_getblk - Get a buffer_head in a block device's buffer cache.
  * @bdev: The block device.
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 932139c5d46f5..ffcd76d977703 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -223,6 +223,8 @@ void __wait_on_buffer(struct buffer_head *);
 wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
 struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
 			unsigned size);
+struct buffer_head *__find_get_block_nonatomic(struct block_device *bdev,
+			sector_t block, unsigned size);
 struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
 		unsigned size, gfp_t gfp);
 void __brelse(struct buffer_head *);
@@ -398,6 +400,12 @@ sb_find_get_block(struct super_block *sb, sector_t block)
 	return __find_get_block(sb->s_bdev, block, sb->s_blocksize);
 }
 
+static inline struct buffer_head *
+sb_find_get_block_nonatomic(struct super_block *sb, sector_t block)
+{
+	return __find_get_block_nonatomic(sb->s_bdev, block, sb->s_blocksize);
+}
+
 static inline void
 map_bh(struct buffer_head *bh, struct super_block *sb, sector_t block)
 {
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.14 28/39] fs/buffer: use sleeping version of __find_get_block()
       [not found] <20250429235006.536648-1-sashal@kernel.org>
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups Sasha Levin
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 27/39] fs/buffer: introduce sleeping flavors " Sasha Levin
@ 2025-04-29 23:49 ` Sasha Levin
  2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-04-29 23:49 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Davidlohr Bueso, Jan Kara, kdevops, Luis Chamberlain,
	Christian Brauner, Sasha Levin, viro, linux-fsdevel

From: Davidlohr Bueso <dave@stgolabs.net>

[ Upstream commit 5b67d43976828dea2394eae2556b369bb7a61f64 ]

Convert to the new nonatomic flavor to benefit from potential performance
benefits and adapt in the future vs migration such that semantics
are kept.

Convert write_boundary_block() which already takes the buffer
lock as well as bdev_getblk() depending on the respective gpf flags.
There are no changes in semantics.

Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://kdevops.org/ext4/v6.15-rc2.html # [0]
Link: https://lore.kernel.org/all/aAAEvcrmREWa1SKF@bombadil.infradead.org/ # [1]
Link: https://lore.kernel.org/20250418015921.132400-4-dave@stgolabs.net
Tested-by: kdevops@lists.linux.dev # [0] [1]
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/buffer.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 7981097c846d4..2494fe3a5e69e 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -658,7 +658,9 @@ EXPORT_SYMBOL(generic_buffers_fsync);
 void write_boundary_block(struct block_device *bdev,
 			sector_t bblock, unsigned blocksize)
 {
-	struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize);
+	struct buffer_head *bh;
+
+	bh = __find_get_block_nonatomic(bdev, bblock + 1, blocksize);
 	if (bh) {
 		if (buffer_dirty(bh))
 			write_dirty_buffer(bh, 0);
@@ -1440,7 +1442,12 @@ EXPORT_SYMBOL(__find_get_block_nonatomic);
 struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
 		unsigned size, gfp_t gfp)
 {
-	struct buffer_head *bh = __find_get_block(bdev, block, size);
+	struct buffer_head *bh;
+
+	if (gfpflags_allow_blocking(gfp))
+		bh = __find_get_block_nonatomic(bdev, block, size);
+	else
+		bh = __find_get_block(bdev, block, size);
 
 	might_alloc(gfp);
 	if (bh)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups
  2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups Sasha Levin
@ 2025-04-30  0:47   ` Luis Chamberlain
  0 siblings, 0 replies; 4+ messages in thread
From: Luis Chamberlain @ 2025-04-30  0:47 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, stable, Davidlohr Bueso, Jan Kara, kdevops,
	Christian Brauner, viro, linux-fsdevel

On Tue, Apr 29, 2025 at 07:49:53PM -0400, Sasha Levin wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> [ Upstream commit 7ffe3de53a885dbb5836541c2178bd07d1bad7df ]
> 
> Callers of __find_get_block() may or may not allow for blocking
> semantics, and is currently assumed that it will not. Layout
> two paths based on this. The the private_lock scheme will
> continued to be used for atomic contexts. Otherwise take the
> folio lock instead, which protects the buffers, such as
> vs migration and try_to_free_buffers().
> 
> Per the "hack idea", the latter can alleviate contention on
> the private_lock for bdev mappings. For reasons of determinism
> and avoid making bugs hard to reproduce, the trylocking is not
> attempted.
> 
> No change in semantics. All lookup users still take the spinlock.

This is pushing it. I would not expect this to go to stable. At all.

BTW you had mentioned the code for auto-sel would be published a while
ago, is it available anywhere?

  Luis

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-04-30  0:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20250429235006.536648-1-sashal@kernel.org>
2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 26/39] fs/buffer: split locking for pagecache lookups Sasha Levin
2025-04-30  0:47   ` Luis Chamberlain
2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 27/39] fs/buffer: introduce sleeping flavors " Sasha Levin
2025-04-29 23:49 ` [PATCH AUTOSEL 6.14 28/39] fs/buffer: use sleeping version of __find_get_block() Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).