public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 0/2] btrfs: remove the inode_need_compress() call in
Date: Mon, 31 Aug 2020 23:25:45 +0200	[thread overview]
Message-ID: <20200831212545.GD28318@twin.jikos.cz> (raw)
In-Reply-To: <20200804072548.34001-1-wqu@suse.com>

On Tue, Aug 04, 2020 at 03:25:46PM +0800, Qu Wenruo wrote:
> This is an attempt to remove the inode_need_compress() call in
> compress_file_extent().
> 
> As that compress_file_extent() can race with inode ioctl or bad
> compression ratio, to cause NULL pointer dereferecen for @pages, it's
> nature to try to remove that inode_need_compress() to remove the race
> completely.
> 
> However that's not that easy, we have the following problems:
> 
> - We still need to check @pages anyway
>   That @pages check is for kcalloc() failure, so what we really get is
>   just removing one indent from the if (inode_need_compress()).
>   Everything else is still the same (in fact, even worse, see below
>   problems)
> 
> - Behavior change
>   Before that change, every async_chunk does their check on
>   INODE_NO_COMPRESS flags.
>   If we hit any bad compression ratio, all incoming async_chunk will
>   fall back to plain text write.
> 
>   But if we remove that inode_need_compress() check, then we still try
>   to compress, and lead to potentially wasted CPU times.

Hm, any way I read it, this does not follow the intentions how the
compression decisions should work. The whole idea is to have 2 modes,
force-compress which overrides everything, potentially wasting cpu
cycles.  This is the fallback when the automatic nocompress logic is
bailing out too early.

To fix that, the normal 'compress' mode plus heuristics should decide if
the compression should happen at all and after it does not, flip the
nocompress bit. This is still not ideal as there's no "learning" that
some file ranges can compress worse but we still want to keep the file
compression on, just skip the bad ranges. Compared to one bad range then
skip compression on the file completely.

And your code throws away the location where the heuristic can give the
hint.

As we have several ways to request the compression (mount option, defrag
ioctl, property), the decision needs to be more fine grained.

1. btrfs_run_delalloc_range

- check if the compression is allowed at all (not if there are nodatacow
  or nodatasum bits)
- check if there's explicit request (defrag, compress-force)
- skip it if the nocompress bit is set
- the rest are the weak tests, either mount option or inode flags/props

At this point we don't need to call the heuristics.

2. compress_file_range

The range is split to the 128k chunks and each is evaluated separately.
Here we know we want to compress the range because of all the checks in
1 passed, so the only remaining decisions are based on:

- heuristics, try the given range and bail out eventuall or be more
  clever and do some longer-term statistics before flipping nocompress
  bit
- check the nocompress bit that could have been set in previous
  iteration

So this is the plan, you can count how many things are not implemented.
I have prototype for the heuristic learning, but that's not the
important part, first the test conditions need to be shuffled around to
follow the above logic.

> - Still race between compression disable and NULL pointer dereferecen
>   There is a hidden race, mostly exposed by btrfs/071 test case, that we
>   have "compress_type = fs_info->compress_type", so we can still hit case
>   where that compress_type is NONE (caused by remount -o nocompress), and
>   then btrfs_compress_pages() will return -E2BIG, without modifying
>   @nr_pages

We maybe should add some last sanity check before btrfs_compress_pages
is called in case the defrag or prop changes after the checks.

>   Then later when we cleanup @pages, we try to access pages[i]->mapping,
>   triggering NULL pointer dereference.
> 
>   This will be address in the first patch though.

The patch makes it more robust so this is ok. The unexpected changes of
the compress type due to remount or prop changes could be avoided by
saving the type + level at the beginning of compress_file_range.

  parent reply	other threads:[~2020-08-31 21:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-04  7:25 [PATCH v2 0/2] btrfs: remove the inode_need_compress() call in Qu Wenruo
2020-08-04  7:25 ` [PATCH v2 1/2] btrfs: prevent NULL pointer dereference in compress_file_range() when btrfs_compress_pages() hits default case Qu Wenruo
2020-08-11 18:57   ` Josef Bacik
2021-03-09  1:02   ` Su Yue
2020-08-04  7:25 ` [PATCH v2 2/2] btrfs: inode: don't re-evaluate inode_need_compress() in compress_file_extent() Qu Wenruo
2020-08-11 18:59   ` Josef Bacik
2020-08-31 21:25 ` David Sterba [this message]
2021-04-19 19:34 ` [PATCH v2 0/2] btrfs: remove the inode_need_compress() call in David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200831212545.GD28318@twin.jikos.cz \
    --to=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox