Re: [PATCH 0/5] fs/buffer: strack reduction on async read

From: Luis Chamberlain <mcgrof@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: hare@suse.de, dave@stgolabs.net, david@fromorbit.com,
	djwong@kernel.org, kbusch@kernel.org, john.g.garry@oracle.com,
	hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, gost.dev@samsung.com,
	p.raghav@samsung.com, da.gomez@samsung.com,
	kernel@pankajraghav.com
Subject: Re: [PATCH 0/5] fs/buffer: strack reduction on async read
Date: Mon, 3 Feb 2025 06:00:00 -0800
Message-ID: <Z6DL4MrsHbtX_MIs@bombadil.infradead.org>
In-Reply-To: <Z51ISh2YAlwoLo5h@casper.infradead.org>

On Fri, Jan 31, 2025 at 10:01:46PM +0000, Matthew Wilcox wrote:
> On Fri, Jan 31, 2025 at 08:54:31AM -0800, Luis Chamberlain wrote:
> > On Thu, Dec 19, 2024 at 03:51:34AM +0000, Matthew Wilcox wrote:
> > > On Wed, Dec 18, 2024 at 06:27:36PM -0800, Luis Chamberlain wrote:
> > > > On Wed, Dec 18, 2024 at 08:05:29PM +0000, Matthew Wilcox wrote:
> > > > > On Tue, Dec 17, 2024 at 06:26:21PM -0800, Luis Chamberlain wrote:
> > > > > > This splits up a minor enhancement from the bs > ps device support
> > > > > > series into its own series for better review / focus / testing.
> > > > > > This series just addresses reducing the array size used and cleaning
> > > > > > up the async read so it is easier to read and maintain.
> > > > > 
> > > > > How about this approach instead -- get rid of the batch entirely?
> > > > 
> > > > Less is more! I wish it worked, but we end up with a null pointer on
> > > > ext4/032 (and indeed this is the test that helped me find most bugs in
> > > > what I was working on):
> > > 
> > > Yeah, I did no testing; just wanted to give people a different approach
> > > to consider.
> > > 
> > > > [  106.034851] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > [  106.046300] RIP: 0010:end_buffer_async_read_io+0x11/0x90
> > > > [  106.047819] Code: f2 ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 53 48 8b 47 10 48 89 fb 48 8b 40 18 <48> 8b 00 f6 40 0d 40 74 0d 0f b7 00 66 25 00 f0 66 3d 00 80 74 09
> > > 
> > > That decodes as:
> > > 
> > >    5:	53                   	push   %rbx
> > >    6:	48 8b 47 10          	mov    0x10(%rdi),%rax
> > >    a:	48 89 fb             	mov    %rdi,%rbx
> > >    d:	48 8b 40 18          	mov    0x18(%rax),%rax
> > >   11:*	48 8b 00             	mov    (%rax),%rax		<-- trapping instruction
> > >   14:	f6 40 0d 40          	testb  $0x40,0xd(%rax)
> > > 
> > > 6: bh->b_folio
> > > d: b_folio->mapping
> > > 11: mapping->host
> > > 
> > > So folio->mapping is NULL.
> > > 
> > > Ah, I see the problem.  end_buffer_async_read() uses the buffer_async_read
> > > test to decide if all buffers on the page are uptodate or not.  So both
> > > having no batch (ie this patch) and having a batch which is smaller than
> > > the number of buffers in the folio can lead to folio_end_read() being
> > > called prematurely (ie we'll unlock the folio before finishing reading
> > > every buffer in the folio).
> > 
> > But:
> > 
> > a) all batched buffers are locked in the old code, we only unlock
> >    the currently evaluated buffer, the buffers from our pivot are locked
> >    and should also have the async flag set. The fact that buffers ahead
> >    should have the async flag set should prevent us from calling
> >    folio_end_read() prematurely as I read the code, no?
> 
> I'm sure you know what you mean by "the old code", but I don't.
> 
> If you mean "the code in 6.13", here's what it does:

Yes that is what I meant, sorry.

> 
>         tmp = bh;
>         do {
>                 if (!buffer_uptodate(tmp))
>                         folio_uptodate = 0;
>                 if (buffer_async_read(tmp)) {
>                         BUG_ON(!buffer_locked(tmp));
>                         goto still_busy;
>                 }
>                 tmp = tmp->b_this_page;
>         } while (tmp != bh);
>         folio_end_read(folio, folio_uptodate);
> 
> so it's going to cycle around every buffer on the page, and if it finds
> none which are marked async_read, it'll call folio_end_read().
> That's fine in 6.13 because in stage 2, all buffers which are part of
> this folio are marked as async_read.

Indeed. Also, it's not just one walk over the buffers on the page: since
end_buffer_async_read() can be called for every buffer in the page, in
the worst case we run the loop above once per buffer, and each time we
start from the pivot buffer and go up to the end of the page.

> In your patch, you mark every buffer _in the batch_ as async_read
> and then submit the entire batch.  So if they all complete before you
> mark the next bh as being uptodate, it'll think the read is complete
> and call folio_end_read().

Ah yes, thanks, this clarifies to me what you meant!
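
For illustration only, here is a rough userspace model of the walk quoted
above. It is not kernel code: struct model_bh, struct model_folio,
model_end_async_read() and model_read() are made-up names, and locking,
the folio_uptodate tracking and real I/O are all left out. It assumes each
batch completes fully before the next one is marked, which is the ordering
described above. It reports how many buffers had completed when the
stand-in for folio_end_read() first fired.

```c
#include <assert.h>
#include <stdbool.h>

#define NBUF 4	/* buffers per folio in this model */

/* Hypothetical stand-in for struct buffer_head: only what the walk needs. */
struct model_bh {
	bool async_read;
	struct model_bh *b_this_page;	/* circular list, as in the kernel */
};

struct model_folio {
	struct model_bh bh[NBUF];
	int completed;			/* buffers whose read has finished */
	int end_read_calls;		/* times "folio_end_read()" fired */
	int completed_at_first_end;	/* value of completed at first firing */
};

static void model_folio_init(struct model_folio *f)
{
	for (int i = 0; i < NBUF; i++) {
		f->bh[i].async_read = false;
		f->bh[i].b_this_page = &f->bh[(i + 1) % NBUF];
	}
	f->completed = 0;
	f->end_read_calls = 0;
	f->completed_at_first_end = -1;
}

/* Mirrors the quoted walk: end the folio only if no buffer is still async_read. */
static void model_end_async_read(struct model_folio *f, struct model_bh *bh)
{
	struct model_bh *tmp = bh;

	bh->async_read = false;
	f->completed++;

	do {
		if (tmp->async_read)
			return;		/* still_busy */
		tmp = tmp->b_this_page;
	} while (tmp != bh);

	/* Stands in for folio_end_read(). */
	if (f->end_read_calls++ == 0)
		f->completed_at_first_end = f->completed;
}

/*
 * Mark `batch` buffers as async_read, let all of them complete, then mark
 * the next batch. batch == NBUF models marking every buffer up front.
 * Returns how many buffers had completed when the folio was first ended.
 */
static int model_read(struct model_folio *f, int batch)
{
	assert(batch > 0);
	model_folio_init(f);

	for (int start = 0; start < NBUF; start += batch) {
		int end = start + batch < NBUF ? start + batch : NBUF;

		for (int i = start; i < end; i++)
			f->bh[i].async_read = true;
		for (int i = start; i < end; i++)
			model_end_async_read(f, &f->bh[i]);
	}
	return f->completed_at_first_end;
}
```

In this model, model_read(&f, NBUF) returns 4, so the folio is ended once,
after the last buffer. model_read(&f, 2) returns 2 and model_read(&f, 1)
returns 1: the folio is ended while buffers are still unread, and it is
ended again for every later batch.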

  Luis

Thread overview: 17+ messages
2024-12-18  2:26 [PATCH 0/5] fs/buffer: strack reduction on async read Luis Chamberlain
2024-12-18  2:26 ` [PATCH 1/5] fs/buffer: move async batch read code into a helper Luis Chamberlain
2024-12-18  2:26 ` [PATCH 2/5] fs/buffer: simplify block_read_full_folio() with bh_offset() Luis Chamberlain
2024-12-18  2:26 ` [PATCH 3/5] fs/buffer: add a for_each_bh() for block_read_full_folio() Luis Chamberlain
2024-12-18 19:20   ` Matthew Wilcox
2024-12-18  2:26 ` [PATCH 4/5] fs/buffer: add iteration support " Luis Chamberlain
2024-12-18  2:26 ` [PATCH 5/5] fs/buffer: reduce stack usage on bh_read_iter() Luis Chamberlain
2024-12-18  2:47   ` Luis Chamberlain
2024-12-18 20:05 ` [PATCH 0/5] fs/buffer: strack reduction on async read Matthew Wilcox
2024-12-19  2:27   ` Luis Chamberlain
2024-12-19  3:51     ` Matthew Wilcox
2024-12-30 17:30       ` Luis Chamberlain
2025-01-31 16:54       ` Luis Chamberlain
2025-01-31 22:01         ` Matthew Wilcox
2025-02-03 14:00           ` Luis Chamberlain [this message]
2024-12-19  6:28 ` Christoph Hellwig
2024-12-19 17:53   ` Luis Chamberlain