From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: David Sterba <dsterba@suse.com>,
	torvalds@linux-foundation.org, linux-btrfs@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] Btrfs updates for 6.2 (updated merge log)
Date: Tue, 13 Dec 2022 02:38:29 +0100
Message-ID: <20221213013829.GD5824@suse.cz>
In-Reply-To: <09d56e5a-0e11-ca60-785a-7f06aedf1932@gmx.com>

On Tue, Dec 13, 2022 at 08:09:29AM +0800, Qu Wenruo wrote:
> > - raid56 reliability vs performance trade off
> >    - fix destructive RMW for raid5 data (raid6 still needs work) - do full RMW
> >      cycle for writes and verify all checksums before overwrite, this should
> >      prevent rewriting potentially corrupted data without notice
> 
> Unfortunately, the "RMW" term seems abused.

I used it as a shortcut but it's probably confusing, thanks for the
suggested updates.

> >    - stripes are cached in memory which should reduce the performance impact but
> >      still can hurt some workloads
> 
> The cache behavior is not changed in this big chunk of raid56 work, but 
> commit f6065f8edeb2 ("btrfs: raid56: don't trust any cached sector in 
> __raid56_parity_recover()") is still the main thing affecting recovery path.
> 
> Thus although we didn't change the cache policy, it will still be bad 
> for recovery cases (missing device, or some sector has mismatch csum).

Yeah, there's no change but I felt it should be mentioned together with
the RMW as it'll be used more than before.

Linus, below is the complete merge log with the edits.

---
User visible features:

- raid56 reliability vs performance trade off
  - fix destructive RMW for raid5 data (raid6 still needs work): do full
    checksum verification for all data during the RMW cycle, which should
    prevent silently rewriting potentially corrupted data
  - stripes are cached in memory, which should reduce the performance impact
    but can still hurt some workloads
  - checksums are verified again after repair
  - this is the last option without introducing additional features (write
    intent bitmap, journal, another tree); the original implementation was
    meant to avoid the extra checksum read/verification exactly for
    performance reasons, but that caused all the reliability problems

- discard=async by default for devices that support it

- implement emergency flush reserve to avoid almost all unnecessary transaction
  aborts due to ENOSPC in cases where there are too many delayed refs or
  delayed allocation

- skip block group synchronization if there's no change in used bytes, can
  reduce transaction commit count for some workloads
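
To illustrate the raid5 item above, here is a minimal userspace sketch of a
read-modify-write that verifies data checksums before touching parity. It is
not the kernel code: the struct layout, the function names, the tiny sector
size and the FNV-1a checksum are all assumptions made for the example. The
idea is that a sector failing its checksum is rebuilt from parity and
verified again before the new parity is computed, so corruption is not
folded into parity unnoticed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_DATA 3   /* data sectors per stripe (hypothetical) */
#define SECTOR  8   /* toy sector size in bytes (hypothetical) */

struct stripe {
	uint8_t data[NR_DATA][SECTOR];
	uint8_t parity[SECTOR];    /* XOR of all data sectors */
	uint32_t csum[NR_DATA];    /* checksums kept apart from the data */
};

/* Toy FNV-1a checksum, standing in for the real data checksum. */
static uint32_t toy_csum(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

static void compute_parity(struct stripe *s)
{
	memset(s->parity, 0, SECTOR);
	for (int d = 0; d < NR_DATA; d++)
		for (int i = 0; i < SECTOR; i++)
			s->parity[i] ^= s->data[d][i];
}

/* Rebuild data sector 'bad' from parity and the remaining data sectors. */
static void rebuild_sector(struct stripe *s, int bad)
{
	uint8_t tmp[SECTOR];

	memcpy(tmp, s->parity, SECTOR);
	for (int d = 0; d < NR_DATA; d++) {
		if (d == bad)
			continue;
		for (int i = 0; i < SECTOR; i++)
			tmp[i] ^= s->data[d][i];
	}
	memcpy(s->data[bad], tmp, SECTOR);
}

/*
 * Overwrite sector 'target' with new_data. Every sector that is read and
 * kept is verified first; one bad sector is repaired and verified again.
 * Returns 0 on success, -1 if the stripe cannot be trusted (nothing is
 * written in that case).
 */
static int rmw_write(struct stripe *s, int target, const uint8_t *new_data)
{
	int bad = -1;

	assert(target >= 0 && target < NR_DATA);

	for (int d = 0; d < NR_DATA; d++) {
		if (d == target)
			continue;          /* about to be overwritten */
		if (toy_csum(s->data[d], SECTOR) != s->csum[d]) {
			if (bad >= 0)
				return -1; /* more than one bad sector */
			bad = d;
		}
	}

	if (bad >= 0) {
		rebuild_sector(s, bad);
		/* Verify again after repair. */
		if (toy_csum(s->data[bad], SECTOR) != s->csum[bad])
			return -1;
	}

	memcpy(s->data[target], new_data, SECTOR);
	s->csum[target] = toy_csum(new_data, SECTOR);
	compute_parity(s);
	return 0;
}
```

The extra reads and checksum computations on every partial-stripe write are
where the performance cost mentioned above comes from.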

Performance improvements:

- fiemap and lseek
  - overall speedup due to skipping unnecessary or duplicate searches (-40% run time)
  - cache some data structures and sharedness of extents (-30% run time)

- send
  - faster backref resolution when finding clones
  - cached leaf to root mapping for faster backref walking
  - improved clone/sharing detection
  - overall run time improvements (-70%)

Core:

- module initialization converted to a table of function pointers run in a
  sequence

- preparation for fscrypt, extend passing file names across calls, dir item can
  store encryption status

- raid56 updates
  - more accurate error tracking of sectors within stripe
  - simplify recovery path and remove dedicated endio worker kthread
  - simplify scrub call paths
  - refactoring to support the extra data checksum verification during RMW
    cycle

- tree block parentness checks consolidated and done at metadata read time

- improved error handling

- cleanups
  - move a lot of code for better synchronization between kernel and user space
    sources, split big files
  - enum cleanups
  - GFP flag cleanups
  - header file cleanups, prototypes, dependencies
  - redundant parameter cleanups
  - inline extent handling simplifications
  - inode parameter conversion
  - data structure cleanups, reductions, renames, merges
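
As an illustration of the "table of function pointers run in a sequence"
item under Core, here is a minimal userspace sketch of the pattern. All names
(init_step, init_seq, the three example steps, the log buffer) are
hypothetical and only show the shape of the idea, not the actual btrfs
symbols: each entry pairs an init function with its matching exit function,
and a failure part-way through unwinds the completed steps in reverse order.

```c
#include <stdbool.h>
#include <stddef.h>

/* One module init step and its matching teardown (exit may be NULL). */
struct init_step {
	int (*init)(void);
	void (*exit)(void);
};

/* Event log so the call order can be observed (example only). */
static char log_buf[64];
static size_t log_len;
static bool fail_register;   /* set to make the last step fail */

static void log_event(char c)
{
	if (log_len < sizeof(log_buf) - 1) {
		log_buf[log_len++] = c;
		log_buf[log_len] = '\0';
	}
}

static int init_sysfs(void)    { log_event('A'); return 0; }
static void exit_sysfs(void)   { log_event('a'); }
static int init_cache(void)    { log_event('B'); return 0; }
static void exit_cache(void)   { log_event('b'); }
static int init_register(void)
{
	if (fail_register)
		return -1;
	log_event('C');
	return 0;
}
static void exit_register(void) { log_event('c'); }

/* The table: steps run top to bottom on init, bottom to top on exit. */
static const struct init_step init_seq[] = {
	{ init_sysfs,    exit_sysfs },
	{ init_cache,    exit_cache },
	{ init_register, exit_register },
};

#define NR_STEPS (sizeof(init_seq) / sizeof(init_seq[0]))

/* Tracks which steps completed, so exit only undoes what was done. */
static bool init_done[NR_STEPS];

static void run_exit_sequence(void)
{
	for (size_t i = NR_STEPS; i-- > 0; ) {
		if (!init_done[i])
			continue;
		if (init_seq[i].exit)
			init_seq[i].exit();
		init_done[i] = false;
	}
}

static int run_init_sequence(void)
{
	for (size_t i = 0; i < NR_STEPS; i++) {
		int ret = init_seq[i].init();

		if (ret) {
			run_exit_sequence();   /* unwind completed steps */
			return ret;
		}
		init_done[i] = true;
	}
	return 0;
}
```

Compared with a chain of goto labels, adding a step in this style means
adding one table row, and the unwind order follows from the table.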
