All of lore.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Nikolay Borisov <nborisov@suse.com>,
	David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, willy@infradead.org
Subject: Re: [PATCH v2] btrfs: use preallocated pages for super block write
Date: Fri, 10 Jun 2022 09:39:29 +0100	[thread overview]
Message-ID: <20220610083929.GA3742796@falcondesktop> (raw)
In-Reply-To: <51502090-6278-ae62-8084-b995cf04caa5@gmx.com>

On Fri, Jun 10, 2022 at 03:33:47PM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/6/10 15:23, Nikolay Borisov wrote:
> > 
> > 
> > On 10.06.22 г. 3:07 ч., Qu Wenruo wrote:
> > > 
> > > 
> > > On 2022/6/10 00:46, David Sterba wrote:
> > > > Currently the super block page is from the mapping of the block device,
> > > > this is result of direct conversion from the previous buffer_head to bio
> > > > API.  We don't use the page cache or the mapping anywhere else, the page
> > > > is a temporary space for the associated bio.
> > > > 
> > > > Allocate pages for all super block copies at device allocation time,
> > > > also to avoid any later allocation problems when writing the super
> > > > block. This simplifies the page reference tracking, but the page lock is
> > > > still used as waiting mechanism for the write and write error is tracked
> > > > in the page.
> > > > 
> > > > As there is a separate page for each super block copy all can be
> > > > submitted in parallel, as before.
> > > 
> > > Is there any history on why we want parallel super block submission?
> > 
> > Because it's in the transaction critical path so it affects latency of
> > operations.
> 
> Not exactly.
> 
> Super block writeback happens with TRANS_STATE_UNBLOCKED status, which
> means new transaction can already be started.
> 
> Thus even if we don't do any parallel submission, there is no
> performance impact on transaction path.

Hell, no. There's more than transaction states to consider.

Taking longer to write super blocks can have a performance impact on fsync
for example. And fsync always has to write super blocks and wait for them
to complete. In fact, a large part of time spent on fsync is precisely on
writing super blocks.

In some cases fsync has to fallback to a transaction commit and wait for
it to complete before returning to user space - which again requires writing
super blocks and waiting for their completion.

Similarly, there are ioctls like snapshot and subvolume creation which
commit a transaction, and any changes to the way super blocks are written
can also affect their latency and impact applications.

> 
> Thanks,
> Qu
> > 
> > > 
> > > In fact, for the 3 super blocks, the primary one has FUA flag, while the
> > > other backup ones doesn't.
> > > 
> > > This means, even we wait for the super block write, only the first one
> > > would take some real time, while the other two can return almost
> > > immediate for devices with write cache.
> > > 
> > > In fact, waiting for the super block write back can tell us if our
> > > primary super block did really reach disk, allowing us to do better
> > > error handling, other than the almost non-exist check in endio.
> > > 
> > > And such synchronous submission allows us to have only one copy of the
> > > sb.
> > > 
> > > 
> > > Furthermore, if we really go parallel, I don't think the current way is
> > > the correct way.
> > > 
> > > One device can have at most 3 superblocks, but a btrfs can easily have
> > > more than 4 devices.
> > > 
> > > Shouldn't we parallel based on device, other than each superblock?
> > 
> > I agree that instead of having 3 pages per-device we can go the 1 page
> > per device, and parallelize based on the device, rather than the super
> > block copies.
> > 
> > <snip>

  parent reply	other threads:[~2022-06-10  8:40 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-09 16:46 [PATCH v2] btrfs: use preallocated pages for super block write David Sterba
2022-06-09 21:00 ` Matthew Wilcox
2022-06-09 22:54   ` David Sterba
2022-06-09 22:58 ` Qu Wenruo
2022-06-09 22:59   ` David Sterba
2022-06-09 23:15     ` Qu Wenruo
2022-06-09 23:35       ` David Sterba
2022-06-10  1:40     ` Matthew Wilcox
2022-06-10  2:46       ` Qu Wenruo
2022-06-10  3:31         ` Matthew Wilcox
2022-06-10  4:53           ` Qu Wenruo
2022-06-10  0:07 ` Qu Wenruo
2022-06-10  7:23   ` Nikolay Borisov
2022-06-10  7:33     ` Qu Wenruo
2022-06-10  7:39       ` Nikolay Borisov
2022-06-10  7:55         ` Qu Wenruo
2022-06-10  8:39       ` Filipe Manana [this message]
2022-06-10  8:44         ` Qu Wenruo
2022-06-10  8:49           ` Qu Wenruo
2022-06-10  9:07 ` Qu Wenruo
2022-06-10 12:06 ` Nikolay Borisov
2022-06-11 13:30 ` Anand Jain
2022-06-13  6:37   ` Nikolay Borisov
2022-06-13  6:35 ` Nikolay Borisov
2022-06-21 13:24 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220610083929.GA3742796@falcondesktop \
    --to=fdmanana@kernel.org \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.