From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: David Sterba <dsterba@suse.com>,
torvalds@linux-foundation.org, linux-btrfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] Btrfs updates for 6.2 (updated merge log)
Date: Tue, 13 Dec 2022 02:38:29 +0100
Message-ID: <20221213013829.GD5824@suse.cz>
In-Reply-To: <09d56e5a-0e11-ca60-785a-7f06aedf1932@gmx.com>
On Tue, Dec 13, 2022 at 08:09:29AM +0800, Qu Wenruo wrote:
> > - raid56 reliability vs performance trade off
> > - fix destructive RMW for raid5 data (raid6 still needs work) - do full RMW
> > cycle for writes and verify all checksums before overwrite, this should
> > prevent rewriting potentially corrupted data without notice
>
> Unfortunately, the "RMW" term seems abused.
I used it as a shortcut but it's probably confusing, thanks for the
suggested updates.
> > - stripes are cached in memory which should reduce the performance impact but
> > still can hurt some workloads
>
> The cache behavior is not changed in this big chunk of raid56 work, but
> commit f6065f8edeb2 ("btrfs: raid56: don't trust any cached sector in
> __raid56_parity_recover()") is still the main thing affecting recovery path.
>
> Thus although we didn't change the cache policy, it will still be bad
> for recovery cases (missing device, or some sector has mismatch csum).
Yeah, there's no change but I felt it should be mentioned together with
the RMW as it'll be used more than before.
Linus, below is the complete merge log with the edits.
---
User visible features:
- raid56 reliability vs performance trade off
  - fix destructive RMW for raid5 data (raid6 still needs work) - do full
    checksum verification for all data during RMW cycle, this should prevent
    rewriting potentially corrupted data without notice
  - stripes are cached in memory which should reduce the performance impact but
    still can hurt some workloads
  - checksums are verified after repair again
  - this is the last option without introducing additional features (write
    intent bitmap, journal, another tree), the extra checksum read/verification
    was supposed to be avoided by the original implementation exactly for
    performance reasons but that caused all the reliability problems
- discard=async by default for devices that support it
- implement emergency flush reserve to avoid almost all unnecessary transaction
  aborts due to ENOSPC in cases where there are too many delayed refs or
  delayed allocation
- skip block group synchronization if there's no change in used bytes, can
  reduce transaction commit count for some workloads
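The raid5 item above (verify all data checksums during the RMW cycle, repair, verify again, only then overwrite) can be pictured with a small userspace sketch. This is not kernel code: every name in it (`toy_stripe`, `toy_rmw_write`, `toy_rebuild`, ...), the stripe geometry and the FNV-1a stand-in checksum are assumptions made for illustration, and the real raid56 code is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_DATA     3  /* data sectors per stripe (toy value) */
#define SECTOR_SIZE 8  /* bytes per sector (toy value) */

struct toy_stripe {
	uint8_t  data[NR_DATA][SECTOR_SIZE];
	uint8_t  parity[SECTOR_SIZE];
	uint32_t csum[NR_DATA];  /* stored apart from the data itself */
};

/* Stand-in checksum (FNV-1a), chosen only to keep the sketch short. */
static uint32_t toy_csum(const uint8_t *buf)
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < SECTOR_SIZE; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

/* Recompute parity as the XOR of all data sectors. */
static void toy_compute_parity(struct toy_stripe *s)
{
	memset(s->parity, 0, SECTOR_SIZE);
	for (int d = 0; d < NR_DATA; d++)
		for (int i = 0; i < SECTOR_SIZE; i++)
			s->parity[i] ^= s->data[d][i];
}

/* Record checksums and parity for the current stripe contents. */
static void toy_seal(struct toy_stripe *s)
{
	for (int d = 0; d < NR_DATA; d++)
		s->csum[d] = toy_csum(s->data[d]);
	toy_compute_parity(s);
}

/* Rebuild one data sector from parity and the remaining data sectors. */
static void toy_rebuild(struct toy_stripe *s, int bad)
{
	memcpy(s->data[bad], s->parity, SECTOR_SIZE);
	for (int d = 0; d < NR_DATA; d++) {
		if (d == bad)
			continue;
		for (int i = 0; i < SECTOR_SIZE; i++)
			s->data[bad][i] ^= s->data[d][i];
	}
}

/*
 * Sub-stripe write: verify every data sector, repair a single bad one
 * from parity, verify the repair, and only then update data and parity.
 * Returns 0 on success, -1 if the stripe cannot be trusted.
 */
static int toy_rmw_write(struct toy_stripe *s, int target,
			 const uint8_t *new_data)
{
	int bad = -1, nr_bad = 0;

	assert(target >= 0 && target < NR_DATA);
	for (int d = 0; d < NR_DATA; d++) {
		if (toy_csum(s->data[d]) != s->csum[d]) {
			bad = d;
			nr_bad++;
		}
	}
	if (nr_bad > 1)
		return -1;  /* single parity cannot repair two bad sectors */
	if (nr_bad == 1) {
		toy_rebuild(s, bad);
		if (toy_csum(s->data[bad]) != s->csum[bad])
			return -1;  /* repair did not verify, parity untouched */
	}
	memcpy(s->data[target], new_data, SECTOR_SIZE);
	s->csum[target] = toy_csum(new_data);
	toy_compute_parity(s);
	return 0;
}
```

In this sketch, skipping the verification loop and recomputing parity from whatever is in memory would fold a silently corrupted sector into the new parity, after which it could no longer be rebuilt. That is the "destructive" part the merge log refers to.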
Performance improvements:
- fiemap and lseek
  - overall speedup due to skipping unnecessary or duplicate searches (-40% run time)
  - cache some data structures and sharedness of extents (-30% run time)
- send
  - faster backref resolution when finding clones
  - cached leaf to root mapping for faster backref walking
  - improved clone/sharing detection
  - overall run time improvements (-70%)
Core:
- module initialization converted to a table of function pointers run in a
  sequence
- preparation for fscrypt, extend passing file names across calls, dir item can
  store encryption status
- raid56 updates
  - more accurate error tracking of sectors within stripe
  - simplify recovery path and remove dedicated endio worker kthread
  - simplify scrub call paths
  - refactoring to support the extra data checksum verification during RMW
    cycle
- tree block parentness checks consolidated and done at metadata read time
- improved error handling
- cleanups
  - move a lot of code for better synchronization between kernel and user space
    sources, split big files
  - enum cleanups
  - GFP flag cleanups
  - header file cleanups, prototypes, dependencies
  - redundant parameter cleanups
  - inline extent handling simplifications
  - inode parameter conversion
  - data structure cleanups, reductions, renames, merges
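For readers unfamiliar with the pattern, the "table of function pointers run in a sequence" item under Core can be sketched roughly as below. This is a userspace illustration only. The struct, the step functions and the trace buffer are hypothetical names, not the identifiers used in the btrfs sources, and it assumes the usual convention that a failing step unwinds the already completed steps in reverse order.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per subsystem: how to bring it up and how to tear it down. */
struct toy_init_step {
	int  (*init_func)(void);
	void (*exit_func)(void);  /* may be NULL if nothing to undo */
};

/* Trace of calls, only here so the ordering is observable. */
static char   toy_trace[16];
static size_t toy_trace_len;

static void toy_log(char c)
{
	assert(toy_trace_len < sizeof(toy_trace) - 1);
	toy_trace[toy_trace_len++] = c;
}

/* Hypothetical subsystems: two that succeed and one that fails. */
static int  toy_init_a(void) { toy_log('A'); return 0; }
static void toy_exit_a(void) { toy_log('a'); }
static int  toy_init_b(void) { toy_log('B'); return 0; }
static void toy_exit_b(void) { toy_log('b'); }
static int  toy_init_fail(void) { toy_log('X'); return -1; }

static const struct toy_init_step toy_steps[] = {
	{ toy_init_a,    toy_exit_a },
	{ toy_init_b,    toy_exit_b },
	{ toy_init_fail, NULL       },
};

/*
 * Run the steps in table order. On the first failure, call the exit
 * function of every step that already succeeded, last one first, and
 * return the error.
 */
static int toy_run_init(const struct toy_init_step *steps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = steps[i].init_func();

		if (ret) {
			while (i-- > 0)
				if (steps[i].exit_func)
					steps[i].exit_func();
			return ret;
		}
	}
	return 0;
}
```

Compared with a long chain of calls and goto labels, adding a subsystem in this style means adding one table row, and the unwind order follows from the table instead of being maintained by hand.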