Re: [PATCH] mm/filemap: allocate folios according to the blocksize


From: Hannes Reinecke <hare@suse.de>
To: Pankaj Raghav <p.raghav@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	mcgrof@kernel.org, SSDR Gost Dev <gost.dev@samsung.com>
Subject: Re: [PATCH] mm/filemap: allocate folios according to the blocksize
Date: Thu, 20 Apr 2023 14:19:57 +0200
Message-ID: <a826abe1-332f-22db-982c-ecec67a40585@suse.de>
In-Reply-To: <2466fa23-a817-1dee-b89f-fcbeaca94a9e@samsung.com>

On 4/20/23 14:05, Pankaj Raghav wrote:
> To keep this thread alive and get some direction on the next steps, I made some changes
> with which I am able to do **buffered reads** with fio on brd with logical block size > 4k.
> 
> Along with your patches (this patch and the brd patches), I added the following diff:
> 
> diff --git a/fs/mpage.c b/fs/mpage.c
> index 242e213ee064..2e0c066d72d3 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -161,7 +161,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
>          struct folio *folio = args->folio;
>          struct inode *inode = folio->mapping->host;
>          const unsigned blkbits = inode->i_blkbits;
> -       const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
> +       const unsigned blocks_per_page = folio_size(folio) >> blkbits;
>          const unsigned blocksize = 1 << blkbits;
>          struct buffer_head *map_bh = &args->map_bh;
>          sector_t block_in_file;
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 47afbca1d122..2e42b5127f4c 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -210,7 +210,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>          unsigned long index = readahead_index(ractl);
>          gfp_t gfp_mask = readahead_gfp_mask(mapping);
>          unsigned long i;
> -
> +       int order = 0;
>          /*
>           * Partway through the readahead operation, we will have added
>           * locked pages to the page cache, but will not yet have submitted
> @@ -223,6 +223,9 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>           */
>          unsigned int nofs = memalloc_nofs_save();
> 
> +       if (mapping->host->i_blkbits > PAGE_SHIFT)
> +               order = mapping->host->i_blkbits - PAGE_SHIFT;
> +
>          filemap_invalidate_lock_shared(mapping);
>          /*
>           * Preallocate as many pages as we will need.
> @@ -245,7 +248,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>                          continue;
>                  }
> 
> -               folio = filemap_alloc_folio(gfp_mask, 0);
> +               folio = filemap_alloc_folio(gfp_mask, order);
>                  if (!folio)
>                          break;
>                  if (filemap_add_folio(mapping, folio, index + i,
> @@ -259,7 +262,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>                  if (i == nr_to_read - lookahead_size)
>                          folio_set_readahead(folio);
>                  ractl->_workingset |= folio_test_workingset(folio);
> -               ractl->_nr_pages++;
> +               ractl->_nr_pages += folio_nr_pages(folio);
>          }
> 
> 
> And with that (drum roll):
> 
> root@debian:~# cat /sys/block/ram0/queue/logical_block_size
> 8192
> root@debian:~# fio -bs=8k -iodepth=8 -rw=read -ioengine=io_uring -size=200M -name=io_uring_1
> -filename=/dev/ram0
> io_uring_1: (g=0): rw=read, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=io_uring,
> iodepth=8
> fio-3.33
> Starting 1 process
> 
> io_uring_1: (groupid=0, jobs=1): err= 0: pid=450: Thu Apr 20 11:34:10 2023
>    read: IOPS=94.8k, BW=741MiB/s (777MB/s)(40.0MiB/54msec)
> 
> <snip>
> 
> Run status group 0 (all jobs):
>     READ: bw=741MiB/s (777MB/s), 741MiB/s-741MiB/s (777MB/s-777MB/s), io=40.0MiB (41.9MB), run=54-54msec
> 
> Disk stats (read/write):
>    ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
> 
> 
> **Questions on the future work**:
> 
> As willy pointed out, we have to do this `order = mapping->host->i_blkbits - PAGE_SHIFT` in
> many places. Should we pursue something that willy suggested: encapsulating order in the
> mapping->flags as a next step?[1]
> 
> 
> [1] https://lore.kernel.org/lkml/ZDty+PQfHkrGBojn@casper.infradead.org/
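
For reference, the order computation repeated in the quoted diff can be written once. What follows is only a minimal userspace sketch, not kernel code: SKETCH_PAGE_SHIFT assumes 4 KiB base pages, and blkbits_to_order(), order_to_nr_pages() and blocks_per_folio() are hypothetical names made up for illustration. They model what the diff gets from i_blkbits, folio_nr_pages() and folio_size().

```c
/* Assumption: 4 KiB base pages; the kernel defines PAGE_SHIFT per architecture. */
#define SKETCH_PAGE_SHIFT 12u

/*
 * Hypothetical helper: folio order needed so that one folio covers one
 * block. Block sizes at or below the page size need order 0.
 */
unsigned int blkbits_to_order(unsigned int blkbits)
{
	if (blkbits > SKETCH_PAGE_SHIFT)
		return blkbits - SKETCH_PAGE_SHIFT;
	return 0;
}

/* Number of base pages in a folio of the given order. */
unsigned long order_to_nr_pages(unsigned int order)
{
	return 1UL << order;
}

/* Number of blocks that fit in a folio of the given order. */
unsigned long blocks_per_folio(unsigned int order, unsigned int blkbits)
{
	unsigned long folio_bytes = 1UL << (SKETCH_PAGE_SHIFT + order);

	return folio_bytes >> blkbits;
}
```

With the 8192-byte logical block size from the fio run (blkbits = 13) and the 4 KiB page assumption, this gives order 1, two pages per folio, and one block per folio.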

Well ... really, I'm not sure.
Yes, continuing to update buffer_heads would be the logical thing, as it
could be done incrementally.

But really, the end goal should be to move away from buffer_heads for fs
and mm usage. So I wonder if we shouldn't rather look in that direction ...

Cheers,

Hannes


Thread overview: 14+ messages
     [not found] <CGME20230414134914eucas1p1f0b08409dce8bc946057d0a4fa7f1601@eucas1p1.samsung.com>
2023-04-14 13:49 ` [PATCH] mm/filemap: allocate folios according to the blocksize Hannes Reinecke
2023-04-14 13:56   ` Matthew Wilcox
2023-04-16  4:00   ` Matthew Wilcox
2023-04-17  2:18   ` kernel test robot
2023-04-17  4:10     ` Matthew Wilcox
2023-04-17  4:36   ` Christoph Hellwig
2023-04-17  6:08     ` Hannes Reinecke
2023-04-17  6:27       ` Christoph Hellwig
2023-04-17  6:30         ` Hannes Reinecke
2023-04-20 12:05   ` Pankaj Raghav
2023-04-20 12:19     ` Hannes Reinecke [this message]
2023-04-20 12:28       ` Pankaj Raghav
2023-04-20 15:03         ` Darrick J. Wong
2023-04-22  0:14           ` Dave Chinner
