[PATCH] exfat: enable request merging for dir readahead


* [PATCH] exfat: enable request merging for dir readahead
@ 2025-04-07 10:23 ` Anthony Iliopoulos
  2025-04-07 13:00   ` Namjae Jeon
  2025-04-08  1:15   ` Sungjong Seo
  0 siblings, 2 replies; 3+ messages in thread
From: Anthony Iliopoulos @ 2025-04-07 10:23 UTC (permalink / raw)
  To: Namjae Jeon, Sungjong Seo, Yuezhang Mo; +Cc: linux-fsdevel, linux-kernel

Directory listings that need to access the inode metadata (e.g. via
statx to obtain the file types) of large filesystems with lots of
metadata that aren't yet in dcache, will take a long time due to the
directory readahead submitting one io request at a time which although
targeting sequential disk sectors (up to EXFAT_MAX_RA_SIZE) are not
merged at the block layer.

Add plugging around sb_breadahead so that the requests can be batched
and submitted jointly to the block layer where they can be merged by the
io schedulers, instead of having each request individually submitted to
the hardware queues.

This significantly improves the throughput of directory listings as it
also minimizes the number of io completions and related handling from
the device driver side.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
---
 fs/exfat/dir.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
index 3103b932b674..a46ab2690b4d 100644
--- a/fs/exfat/dir.c
+++ b/fs/exfat/dir.c
@@ -621,6 +621,7 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
 {
 	struct exfat_sb_info *sbi = EXFAT_SB(sb);
 	struct buffer_head *bh;
+	struct blk_plug plug;
 	unsigned int max_ra_count = EXFAT_MAX_RA_SIZE >> sb->s_blocksize_bits;
 	unsigned int page_ra_count = PAGE_SIZE >> sb->s_blocksize_bits;
 	unsigned int adj_ra_count = max(sbi->sect_per_clus, page_ra_count);
@@ -644,8 +645,10 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
 	if (!bh || !buffer_uptodate(bh)) {
 		unsigned int i;

+		blk_start_plug(&plug);
 		for (i = 0; i < ra_count; i++)
 			sb_breadahead(sb, (sector_t)(sec + i));
+		blk_finish_plug(&plug);
 	}
 	brelse(bh);
 	return 0;
-- 
2.49.0


* Re: [PATCH] exfat: enable request merging for dir readahead
  2025-04-07 10:23 ` [PATCH] exfat: enable request merging for dir readahead Anthony Iliopoulos
@ 2025-04-07 13:00   ` Namjae Jeon
  2025-04-08  1:15   ` Sungjong Seo
  1 sibling, 0 replies; 3+ messages in thread
From: Namjae Jeon @ 2025-04-07 13:00 UTC (permalink / raw)
  To: Anthony Iliopoulos; +Cc: Sungjong Seo, Yuezhang Mo, linux-fsdevel, linux-kernel

On Mon, Apr 7, 2025 at 7:23 PM Anthony Iliopoulos <ailiop@suse.com> wrote:
>
> Directory listings that need to access the inode metadata (e.g. via
> statx to obtain the file types) of large filesystems with lots of
> metadata that aren't yet in dcache, will take a long time due to the
> directory readahead submitting one io request at a time which although
> targeting sequential disk sectors (up to EXFAT_MAX_RA_SIZE) are not
> merged at the block layer.
>
> Add plugging around sb_breadahead so that the requests can be batched
> and submitted jointly to the block layer where they can be merged by the
> io schedulers, instead of having each request individually submitted to
> the hardware queues.
>
> This significantly improves the throughput of directory listings as it
> also minimizes the number of io completions and related handling from
> the device driver side.
>
> Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
> ---
>  fs/exfat/dir.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
> index 3103b932b674..a46ab2690b4d 100644
> --- a/fs/exfat/dir.c
> +++ b/fs/exfat/dir.c

Hi Anthony,
> @@ -621,6 +621,7 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
>  {
>         struct exfat_sb_info *sbi = EXFAT_SB(sb);
>         struct buffer_head *bh;
> +       struct blk_plug plug;
>         unsigned int max_ra_count = EXFAT_MAX_RA_SIZE >> sb->s_blocksize_bits;
>         unsigned int page_ra_count = PAGE_SIZE >> sb->s_blocksize_bits;
>         unsigned int adj_ra_count = max(sbi->sect_per_clus, page_ra_count);
> @@ -644,8 +645,10 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
>         if (!bh || !buffer_uptodate(bh)) {
>                 unsigned int i;
It would be better to move the plug declaration here.
Thanks!
>
> +               blk_start_plug(&plug);
>                 for (i = 0; i < ra_count; i++)
>                         sb_breadahead(sb, (sector_t)(sec + i));
> +               blk_finish_plug(&plug);
>         }
>         brelse(bh);
>         return 0;
> --
> 2.49.0
>


* RE: [PATCH] exfat: enable request merging for dir readahead
  2025-04-07 10:23 ` [PATCH] exfat: enable request merging for dir readahead Anthony Iliopoulos
  2025-04-07 13:00   ` Namjae Jeon
@ 2025-04-08  1:15   ` Sungjong Seo
  1 sibling, 0 replies; 3+ messages in thread
From: Sungjong Seo @ 2025-04-08  1:15 UTC (permalink / raw)
  To: 'Anthony Iliopoulos', 'Namjae Jeon',
	'Yuezhang Mo'
  Cc: linux-fsdevel, linux-kernel, sjdev.seo, cpgs, sj1557.seo

Hi, Anthony

> Directory listings that need to access the inode metadata (e.g. via
> statx to obtain the file types) of large filesystems with lots of
> metadata that aren't yet in dcache, will take a long time due to the
> directory readahead submitting one io request at a time which although
> targeting sequential disk sectors (up to EXFAT_MAX_RA_SIZE) are not
> merged at the block layer.
> 
> Add plugging around sb_breadahead so that the requests can be batched
> and submitted jointly to the block layer where they can be merged by the
> io schedulers, instead of having each request individually submitted to
> the hardware queues.
> 
> This significantly improves the throughput of directory listings as it
> also minimizes the number of io completions and related handling from
> the device driver side.

Good approach. However, this was tried in Samsung's code in the past,
and it had the problem that the latency of directory-related operations
became longer when ra_count was large (possibly up to MAX_RA_SIZE).
In the most recent code, blk_flush_plug is called once per page,
as follows.

```
blk_start_plug(&plug);
for (i = 0; i < ra_count; i++) {
        if (i && !(i & (sects_per_page - 1)))
                blk_flush_plug(&plug, false);
        sb_breadahead(sb, sec + i);
}
blk_finish_plug(&plug);
```

However, since blk_flush_plug is not exported, it can no longer be used in
a module build. Either blk_flush_plug needs to be exported, or the code
needs to be changed to repeat blk_start_plug and blk_finish_plug once per page.
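
The second option, one start/finish pair per page, might look roughly like the userspace sketch below. Everything prefixed with `mock_`, the `cur_plug` pointer, and the function name `readahead_plug_per_page` are hypothetical stand-ins so that the loop shape can be compiled and checked outside the kernel. They only count how many batches would be released, and they do not model real merging or latency.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only: mock_plug, mock_start_plug(), mock_finish_plug()
 * and mock_breadahead() are hypothetical stand-ins, not kernel APIs.
 */
struct mock_plug {
	unsigned int pending;	/* requests queued while plugged */
};

static struct mock_plug *cur_plug;	/* models the per-task active plug */
static unsigned int mock_batches;	/* batches handed to the "block layer" */
static unsigned int mock_requests;	/* requests contained in those batches */

static void mock_start_plug(struct mock_plug *plug)
{
	plug->pending = 0;
	cur_plug = plug;
}

static void mock_finish_plug(struct mock_plug *plug)
{
	if (plug->pending) {
		mock_batches++;
		mock_requests += plug->pending;
	}
	plug->pending = 0;
	cur_plug = NULL;
}

static void mock_breadahead(unsigned long sec)
{
	(void)sec;
	if (cur_plug) {
		cur_plug->pending++;	/* held back until the plug is finished */
	} else {
		mock_batches++;		/* unplugged: submitted on its own */
		mock_requests++;
	}
}

/* Plug one page worth of sectors at a time instead of the whole window. */
static void readahead_plug_per_page(unsigned long sec, unsigned int ra_count,
				    unsigned int sects_per_page)
{
	unsigned int i, j;

	assert(sects_per_page > 0);
	for (i = 0; i < ra_count; i += sects_per_page) {
		struct mock_plug plug;
		unsigned int n = ra_count - i;

		if (n > sects_per_page)
			n = sects_per_page;

		mock_start_plug(&plug);
		for (j = 0; j < n; j++)
			mock_breadahead(sec + i + j);
		mock_finish_plug(&plug);
	}
}
```

In exfat_dir_readahead the inner calls would presumably be blk_start_plug, sb_breadahead(sb, sec + i + j) and blk_finish_plug, with sects_per_page derived from PAGE_SIZE >> sb->s_blocksize_bits. Whether this form keeps the latency benefit of the per-page flush is exactly what the requested throughput comparison would need to show.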

After changing to per-page plugging, could you also compare the throughput?

Thanks

> 
> Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
> ---
>  fs/exfat/dir.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
> index 3103b932b674..a46ab2690b4d 100644
> --- a/fs/exfat/dir.c
> +++ b/fs/exfat/dir.c
> @@ -621,6 +621,7 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
>  {
>  	struct exfat_sb_info *sbi = EXFAT_SB(sb);
>  	struct buffer_head *bh;
> +	struct blk_plug plug;
>  	unsigned int max_ra_count = EXFAT_MAX_RA_SIZE >> sb->s_blocksize_bits;
>  	unsigned int page_ra_count = PAGE_SIZE >> sb->s_blocksize_bits;
>  	unsigned int adj_ra_count = max(sbi->sect_per_clus, page_ra_count);
> @@ -644,8 +645,10 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
>  	if (!bh || !buffer_uptodate(bh)) {
>  		unsigned int i;
> 
> +		blk_start_plug(&plug);
>  		for (i = 0; i < ra_count; i++)
>  			sb_breadahead(sb, (sector_t)(sec + i));
> +		blk_finish_plug(&plug);
>  	}
>  	brelse(bh);
>  	return 0;
> --
> 2.49.0


