public inbox for linux-fsdevel@vger.kernel.org
From: Hannes Reinecke <hare@suse.de>
To: Matthew Wilcox <willy@infradead.org>,
	Luis Chamberlain <mcgrof@kernel.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>,
	brauner@kernel.org, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, gost.dev@samsung.com
Subject: Re: [RFC 0/4] convert create_page_buffers to create_folio_buffers
Date: Sat, 15 Apr 2023 15:14:33 +0200
Message-ID: <31765c8c-e895-4207-2b8c-39f6c7c83ece@suse.de>
In-Reply-To: <ZDodlnm2nvYxbvR4@casper.infradead.org>

On 4/15/23 05:44, Matthew Wilcox wrote:
> On Fri, Apr 14, 2023 at 08:24:56PM -0700, Luis Chamberlain wrote:
>> I thought of that but I saw that the loop that assigns the arr only
>> pegs a bh if we don't "continue" for certain conditions, which made me
>> believe that we only wanted to keep on the array as non-null items which
>> meet the initial loop's criteria. If that is not accurate then yes,
>> the simplification is nice!
> 
> Uh, right.  A little bit more carefully this time ... how does this
> look?
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 5e67e21b350a..dff671079b02 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2282,7 +2282,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>   {
>   	struct inode *inode = folio->mapping->host;
>   	sector_t iblock, lblock;
> -	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
> +	struct buffer_head *bh, *head;
>   	unsigned int blocksize, bbits;
>   	int nr, i;
>   	int fully_mapped = 1;
> @@ -2335,7 +2335,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>   			if (buffer_uptodate(bh))
>   				continue;
>   		}
> -		arr[nr++] = bh;
> +		nr++;
>   	} while (i++, iblock++, (bh = bh->b_this_page) != head);
>   
>   	if (fully_mapped)
> @@ -2352,25 +2352,29 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>   		return 0;
>   	}
>   
> -	/* Stage two: lock the buffers */
> -	for (i = 0; i < nr; i++) {
> -		bh = arr[i];
> +	/*
> +	 * Stage two: lock the buffers.  Recheck the uptodate flag under
> +	 * the lock in case somebody else brought it uptodate first.
> +	 */
> +	bh = head;
> +	do {
> +		if (buffer_uptodate(bh))
> +			continue;
>   		lock_buffer(bh);
> +		if (buffer_uptodate(bh)) {
> +			unlock_buffer(bh);
> +			continue;
> +		}
>   		mark_buffer_async_read(bh);
> -	}
> +	} while ((bh = bh->b_this_page) != head);
>   
> -	/*
> -	 * Stage 3: start the IO.  Check for uptodateness
> -	 * inside the buffer lock in case another process reading
> -	 * the underlying blockdev brought it uptodate (the sct fix).
> -	 */
> -	for (i = 0; i < nr; i++) {
> -		bh = arr[i];
> -		if (buffer_uptodate(bh))
> -			end_buffer_async_read(bh, 1);
> -		else
> +	/* Stage 3: start the IO */
> +	bh = head;
> +	do {
> +		if (buffer_async_read(bh))
>   			submit_bh(REQ_OP_READ, bh);
> -	}
> +	} while ((bh = bh->b_this_page) != head);
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL(block_read_full_folio);
> 
> 
> I do wonder how much it's worth doing this vs switching to non-BH methods.
> I appreciate that's a lot of work still.

That's what I've been wondering, too.

I would _vastly_ prefer to switch over to iomap; however, the blasted
sb_bread() is getting in the way. Currently iomap only runs on entire
pages / folios, but a lot of (older) filesystems insist on doing 512
byte I/O. While this seems logical (seeing that 512 bytes is the
default, and, in most cases, the only supported sector size), the
question is whether _we_ on the Linux side need to do that.
We _could_ upgrade to always do full page I/O; there's a good
chance we'll be using the entire page anyway eventually.
And with storage bandwidth getting larger and larger we might even
get a performance boost there.

And it would save us having to implement sub-page I/O for iomap.

Hmm?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman

