From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Nikolay Borisov <nborisov@suse.com>,
David Sterba <dsterba@suse.com>,
linux-btrfs@vger.kernel.org, willy@infradead.org
Subject: Re: [PATCH v2] btrfs: use preallocated pages for super block write
Date: Fri, 10 Jun 2022 09:39:29 +0100 [thread overview]
Message-ID: <20220610083929.GA3742796@falcondesktop> (raw)
In-Reply-To: <51502090-6278-ae62-8084-b995cf04caa5@gmx.com>
On Fri, Jun 10, 2022 at 03:33:47PM +0800, Qu Wenruo wrote:
>
>
> On 2022/6/10 15:23, Nikolay Borisov wrote:
> >
> >
> > On 10.06.22 г. 3:07 ч., Qu Wenruo wrote:
> > >
> > >
> > > On 2022/6/10 00:46, David Sterba wrote:
> > > > Currently the super block page is from the mapping of the block device,
> > > > this is result of direct conversion from the previous buffer_head to bio
> > > > API. We don't use the page cache or the mapping anywhere else, the page
> > > > is a temporary space for the associated bio.
> > > >
> > > > Allocate pages for all super block copies at device allocation time,
> > > > also to avoid any later allocation problems when writing the super
> > > > block. This simplifies the page reference tracking, but the page lock is
> > > > still used as waiting mechanism for the write and write error is tracked
> > > > in the page.
> > > >
> > > > As there is a separate page for each super block copy all can be
> > > > submitted in parallel, as before.
> > >
> > > Is there any history on why we want parallel super block submission?
> >
> > Because it's in the transaction critical path so it affects latency of
> > operations.
>
> Not exactly.
>
> Super block writeback happens with TRANS_STATE_UNBLOCKED status, which
> means new transaction can already be started.
>
> Thus even if we don't do any parallel submission, there is no
> performance impact on transaction path.
Hell, no. There's more than transaction states to consider.
Taking longer to write super blocks can have a performance impact on fsync
for example. And fsync always has to write super blocks and wait for them
to complete. In fact, a large part of time spent on fsync is precisely on
writing super blocks.
In some cases fsync has to fallback to a transaction commit and wait for
it to complete before returning to user space - which again requires writing
super blocks and waiting for their completion.
Similarly, there are ioctls like snapshot and subvolume creation which
commit a transaction, and any changes to the way super blocks are written
can also affect their latency and impact applications.
>
> Thanks,
> Qu
> >
> > >
> > > In fact, for the 3 super blocks, the primary one has FUA flag, while the
> > > other backup ones doesn't.
> > >
> > > This means, even we wait for the super block write, only the first one
> > > would take some real time, while the other two can return almost
> > > immediate for devices with write cache.
> > >
> > > In fact, waiting for the super block write back can tell us if our
> > > primary super block did really reach disk, allowing us to do better
> > > error handling, other than the almost non-exist check in endio.
> > >
> > > And such synchronous submission allows us to have only one copy of the
> > > sb.
> > >
> > >
> > > Furthermore, if we really go parallel, I don't think the current way is
> > > the correct way.
> > >
> > > One device can have at most 3 superblocks, but a btrfs can easily have
> > > more than 4 devices.
> > >
> > > Shouldn't we parallel based on device, other than each superblock?
> >
> > I agree that instead of having 3 pages per-device we can go the 1 page
> > per device, and parallelize based on the device, rather than the super
> > block copies.
> >
> > <snip>
next prev parent reply other threads:[~2022-06-10 8:40 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-09 16:46 [PATCH v2] btrfs: use preallocated pages for super block write David Sterba
2022-06-09 21:00 ` Matthew Wilcox
2022-06-09 22:54 ` David Sterba
2022-06-09 22:58 ` Qu Wenruo
2022-06-09 22:59 ` David Sterba
2022-06-09 23:15 ` Qu Wenruo
2022-06-09 23:35 ` David Sterba
2022-06-10 1:40 ` Matthew Wilcox
2022-06-10 2:46 ` Qu Wenruo
2022-06-10 3:31 ` Matthew Wilcox
2022-06-10 4:53 ` Qu Wenruo
2022-06-10 0:07 ` Qu Wenruo
2022-06-10 7:23 ` Nikolay Borisov
2022-06-10 7:33 ` Qu Wenruo
2022-06-10 7:39 ` Nikolay Borisov
2022-06-10 7:55 ` Qu Wenruo
2022-06-10 8:39 ` Filipe Manana [this message]
2022-06-10 8:44 ` Qu Wenruo
2022-06-10 8:49 ` Qu Wenruo
2022-06-10 9:07 ` Qu Wenruo
2022-06-10 12:06 ` Nikolay Borisov
2022-06-11 13:30 ` Anand Jain
2022-06-13 6:37 ` Nikolay Borisov
2022-06-13 6:35 ` Nikolay Borisov
2022-06-21 13:24 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220610083929.GA3742796@falcondesktop \
--to=fdmanana@kernel.org \
--cc=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=nborisov@suse.com \
--cc=quwenruo.btrfs@gmx.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox