[PATCH] btrfs-progs: docs: add warning for btrfs checksum features


* [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
@ 2025-11-21  5:03 Qu Wenruo
  2025-11-21  5:17 ` Christoph Anton Mitterer
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2025-11-21  5:03 UTC
  To: linux-btrfs

The checksum of btrfs, no matter which algorithm is used, can not provide
any guarantee that the metadata/data has not been modified by a malicious
attacker.

Also refer end users to fs-verity if they require strong verification
of a file.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
This also makes me wonder, does it even make any sense to support
SHA256?

I know in the past it was an important step towards write-time dedupe,
but that feature eventually got rejected (and I totally agree with the
decision).

And the dedupe ioctl does not use checksums to verify whether the content
matches either, so no real user requires a cryptographic hash function.

Which makes SHA256 not only the slowest checksum algorithm, but also the
largest one (takes 8x metadata space compared to CRC32C).

I'd say we should even deprecate SHA256 checksum support.
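
For scale, the 8x figure follows from the per-block checksum sizes: 4 bytes
for CRC32C versus 32 bytes for SHA256. A minimal sketch of the arithmetic,
assuming a 4 KiB data block size and counting only the raw checksum bytes
(b-tree item overhead is ignored; csum_bytes is a hypothetical helper, not
a btrfs-progs function):

```python
# Checksum size in bytes stored per data block for each btrfs algorithm.
CSUM_SIZE = {"crc32c": 4, "xxhash64": 8, "sha256": 32, "blake2b": 32}

SECTOR_SIZE = 4096  # assumption: 4 KiB data blocks


def csum_bytes(data_bytes: int, algo: str, sector_size: int = SECTOR_SIZE) -> int:
    """Raw checksum bytes needed to cover data_bytes of file data."""
    blocks = -(-data_bytes // sector_size)  # ceiling division
    return blocks * CSUM_SIZE[algo]


TIB = 1 << 40
GIB = 1 << 30

for algo in CSUM_SIZE:
    gib = csum_bytes(TIB, algo) / GIB
    print(f"{algo:9s} {gib:4.1f} GiB of checksums per TiB of data")
```

Under these assumptions 1 TiB of data needs 1 GiB of CRC32C checksums and
8 GiB of SHA256 checksums.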
---
 Documentation/ch-checksumming.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/ch-checksumming.rst b/Documentation/ch-checksumming.rst
index 72261c23fb8c..cc3cb43b175f 100644
--- a/Documentation/ch-checksumming.rst
+++ b/Documentation/ch-checksumming.rst
@@ -3,6 +3,17 @@ writing and verified after reading the blocks from devices. The whole metadata
 block has an inline checksum stored in the b-tree node header. Each data block
 has a detached checksum stored in the checksum tree.

+.. warning::
+   The checksum of btrfs is only meant to detect which mirror is good. It can
+   not guarantee that the data/metadata is not modified by a malicious attacker,
+   no matter which checksum algorithm is used.
+
+   An attacker can always modify the file content and update the checksum tree
+   with a newly calculated checksum.
+
+   For strong verification of read-only files, please use the *fs-verity*
+   feature, which btrfs supports since v5.15.
+
 .. note::
    Since a data checksum is calculated just before submitting to the block
    device, btrfs has a strong requirement that the corresponding data block must
--
2.52.0



* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  5:03 [PATCH] btrfs-progs: docs: add warning for btrfs checksum features Qu Wenruo
@ 2025-11-21  5:17 ` Christoph Anton Mitterer
  2025-11-21  5:24   ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Anton Mitterer @ 2025-11-21  5:17 UTC
  To: Qu Wenruo, linux-btrfs



On November 21, 2025 6:03:05 AM GMT+01:00, Qu Wenruo <wqu@suse.com> wrote:
>The checksum of btrfs, no matter the algorithm utilized, can not provide
>any guarantee that the metadata/data is not modified by a malicious
>attacker.

Is that even the case when the whole btrfs itself is encrypted, like in dm-crypt (without AEAD or verity, but only a normal cipher like aes-xts-plain64)?
Wouldn't an attacker then need to know how he can forge the right encrypted checksum?

Cheers, 
Chris


* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  5:17 ` Christoph Anton Mitterer
@ 2025-11-21  5:24   ` Qu Wenruo
  2025-11-21  6:02     ` Christoph Anton Mitterer
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2025-11-21  5:24 UTC
  To: Christoph Anton Mitterer, linux-btrfs



On 2025/11/21 15:47, Christoph Anton Mitterer wrote:
> 
> 
> On November 21, 2025 6:03:05 AM GMT+01:00, Qu Wenruo <wqu@suse.com> wrote:
>> The checksum of btrfs, no matter the algorithm utilized, can not provide
>> any guarantee that the metadata/data is not modified by a malicious
>> attacker.
> 
> Is that even the case when the whole btrfs itself is encrypted, like in dm-crypt (without AEAD or verity, but only a normal cipher like aes-xts-plain64)?
> Wouldn't an attacker then need to know how he can forge the right encrypted checksum?

In that case the attacker won't even know whether it's a btrfs or not.

Thanks,
Qu

> 
> Cheers,
> Chris



* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  5:24   ` Qu Wenruo
@ 2025-11-21  6:02     ` Christoph Anton Mitterer
  2025-11-21  6:44       ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Anton Mitterer @ 2025-11-21  6:02 UTC
  To: Qu Wenruo, linux-btrfs

On November 21, 2025 6:24:26 AM GMT+01:00, Qu Wenruo <wqu@suse.com> wrote:
>On 2025/11/21 15:47, Christoph Anton Mitterer wrote:
>> Is that even the case when the whole btrfs itself is encrypted, like in dm-crypt (without AEAD or verity, but only a normal cipher like aes-xts-plain64)?
>> Wouldn't an attacker then need to know how he can forge the right encrypted checksum?
>
>In that case the attacker won't even know whether it's a btrfs or not.

I wouldn't be so sure about that; it depends on the threat model.
First, there's always the case of leaking metadata: people observing the list, or my access to Debian's package archives (btrfs-progs), would e.g. know that there's a good chance I'm using btrfs.

Also, an attacker might be able to make snapshots of the offline device and see write patterns that may be typical for btrfs.
Even with only a single snapshot being made, with the empty device not randomised in advance, it might be clear which fs is used.

But all that's not the main point anyway.

Even if an attacker doesn't know what's in it, he could try to silently corrupt data or replace (encrypted) blocks with ones from an older snapshot, which would then perhaps decrypt to something non-gibberish.

The question IMO is whether a (dm-crypt) encrypted btrfs that uses a strong hash function for btrfs (i.e. like hash-then-encrypt) would be effectively integrity protected.
I never really found a definitive answer on that (as in properly analysed by some professional cryptographer).

If so, keeping SHA256/etc. would make sense for that use case.

Cheers, 
Chris.


* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  6:02     ` Christoph Anton Mitterer
@ 2025-11-21  6:44       ` Qu Wenruo
  2025-11-21 23:55         ` Christoph Anton Mitterer
  2025-11-22 14:36         ` Neal Gompa
  0 siblings, 2 replies; 8+ messages in thread
From: Qu Wenruo @ 2025-11-21  6:44 UTC
  To: Christoph Anton Mitterer, linux-btrfs, linux-crypto



On 2025/11/21 16:32, Christoph Anton Mitterer wrote:
> 
> 
> On November 21, 2025 6:24:26 AM GMT+01:00, Qu Wenruo <wqu@suse.com> wrote:
>> On 2025/11/21 15:47, Christoph Anton Mitterer wrote:
>>> Is that even the case when the whole btrfs itself is encrypted, like in dm-crypt (without AEAD or verity, but only a normal cipher like aes-xts-plain64)?
>>> Wouldn't an attacker then need to know how he can forge the right encrypted checksum?
>>
>> In that case the attacker won't even know whether it's a btrfs or not.
> 
> I wouldn't be so sure about that; it depends on the threat model.
> First, there's always the case of leaking metadata: people observing the list, or my access to Debian's package archives (btrfs-progs), would e.g. know that there's a good chance I'm using btrfs.
> 
> Also, an attacker might be able to make snapshots of the offline device and see write patterns that may be typical for btrfs.
> Even with only a single snapshot being made, with the empty device not randomised in advance, it might be clear which fs is used.
> 
> 
> But all that's not the main point anyway.
> 
> Even if an attacker doesn't know what's in it, he could try to silently corrupt data or replace (encrypted) blocks with ones from an older snapshot, which would then perhaps decrypt to something non-gibberish.

Adding linux-crypto list for more feedback.

In that case, as long as the csum tree can not be modified, btrfs can
still detect that something was modified, no matter which algorithm is used.

> 
> The question IMO is, whether a (dm-crypt) encrypted btrfs that uses a strong hash function for btrfs (i.e. like hash-then-encrypt) would be effectively integrity protected.

In that case, I can not give a concrete answer, but I tend to believe
it's protected, no matter what the algorithm is (including CRC32C).

The attacker must be able to modify both the data and the checksum to pass
the check.

But since it's encrypted, the attacker is only able to move the encrypted
data around, not to modify the content precisely, so btrfs should be
able to detect the mismatch for both metadata and data:

- For metadata
   The bytenr will mismatch, thus be rejected.

   This prevents the csum tree from being modified.

- For data
   The checksum will mismatch, thus be rejected.
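
The point that passing the check requires changing both the data and its
checksum can be sketched for the unencrypted case. This is only a toy model:
ToyDisk is a hypothetical stand-in for "data blocks plus a separate checksum
table", not btrfs's on-disk format, and zlib's CRC-32 stands in for CRC32C
(a different polynomial). It shows that an unkeyed checksum, weak or strong,
can be recomputed by whoever can write both places:

```python
import hashlib
import zlib


def crc32(block: bytes) -> bytes:
    # Stand-in for CRC32C; zlib implements CRC-32 with a different polynomial.
    return zlib.crc32(block).to_bytes(4, "little")


def sha256(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()


class ToyDisk:
    """Hypothetical model: data blocks and a detached checksum table."""

    def __init__(self, csum):
        self.csum = csum
        self.blocks = {}
        self.csums = {}

    def write(self, nr: int, data: bytes) -> None:
        self.blocks[nr] = data
        self.csums[nr] = self.csum(data)

    def verify(self, nr: int) -> bool:
        return self.csum(self.blocks[nr]) == self.csums[nr]


def run(csum):
    disk = ToyDisk(csum)
    disk.write(0, b"pay alice 10")
    # Attacker changes the data only (a single flipped bit): read fails.
    disk.blocks[0] = b"pay alice 90"
    data_only = disk.verify(0)
    # Attacker also rewrites the stored checksum: read passes again.
    disk.csums[0] = csum(disk.blocks[0])
    data_and_csum = disk.verify(0)
    return data_only, data_and_csum


results = {name: run(fn) for name, fn in (("crc32", crc32), ("sha256", sha256))}
for name, (data_only, data_and_csum) in results.items():
    print(f"{name}: data only -> {data_only}, data + csum -> {data_and_csum}")
```

Both algorithms behave the same way here: a data-only change is caught, and
a change to data plus checksum is not. Whether dm-crypt actually prevents
the second kind of change is the open question in this thread.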


> I never really found a definitive answer on that (as in properly analysed by some professional cryptographer).
> 
> If so, keeping SHA256/etc. would make sense for that use case.

That still doesn't make much sense; CRC32C is enough for the above case.

Thanks,
Qu

> 
> Cheers,
> Chris.



* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  6:44       ` Qu Wenruo
@ 2025-11-21 23:55         ` Christoph Anton Mitterer
  2025-11-22  0:52           ` Eric Biggers
  2025-11-22 14:36         ` Neal Gompa
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Anton Mitterer @ 2025-11-21 23:55 UTC
  To: Qu Wenruo, linux-btrfs, linux-crypto

On Fri, 2025-11-21 at 17:14 +1030, Qu Wenruo wrote:
> 
> Adding linux-crypto list for more feedback.

It would be good if any of them could confirm or reject:

- Whether a filesystem that uses full checksumming (data + meta-data)
and that is encrypted with dm-crypt,... is effectively integrity
protected like it would be with an AEAD.

In particular also:

- Whether this requires a strong cryptographic hash (or as Qu presumed,
any hash would do) and whether the hashing is needed to be done as a
Merkle-tree or whether that's not needed

- Whether, if one uses such a fs, AEAD or dm-verity is even
recommended, or just a waste of resources as the checksumming done by
the fs would already be enough.


> > The question IMO is, whether a (dm-crypt) encrypted btrfs that uses
> > a strong hash function for btrfs (i.e. like hash-then-encrypt)
> > would be effectively integrity protected.
> 
> In that case, I can not give a concrete answer, but I tend to believe
> it's protected, and no matter what the algorithm is (including
> CRC32C).

I'd rather not think CRC would be enough... I mean why would all crypto
use strong hash algos for signatures, if it could also be done with
fast CRC.
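
One general property behind that intuition is that a CRC is affine over
XOR: for equal-length inputs, crc(a ^ d) == crc(a) ^ crc(d) ^ crc(zeros).
Someone who knows only the old CRC and the XOR difference they apply can
therefore compute the new CRC without ever seeing the data, which is not
possible with a cryptographic hash. A minimal sketch, using zlib's CRC-32
as a stand-in for CRC32C (different polynomial, same property); predict_crc
is a hypothetical helper. This illustrates the property only and says
nothing by itself about what can be done through aes-xts ciphertext:

```python
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def predict_crc(old_crc: int, delta: bytes) -> int:
    """CRC of (block XOR delta), computed without access to block itself."""
    zeros = bytes(len(delta))
    return old_crc ^ zlib.crc32(delta) ^ zlib.crc32(zeros)


block = bytes(range(256)) * 16        # 4096 bytes of sample data
delta = bytearray(len(block))         # all-zero XOR mask ...
delta[100:104] = b"\xde\xad\xbe\xef"  # ... except for four modified bytes
delta = bytes(delta)

tampered = xor_bytes(block, delta)
predicted = predict_crc(zlib.crc32(block), delta)

# The prediction matches the real CRC for any block and delta of equal length.
print(predicted == zlib.crc32(tampered))
```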


> - For metadata
>    The bytenr will mismatch, thus be rejected.
> 
>    This prevents the csum tree from being modified.

But metadata *is* still checksum protected, right (i.e. it doesn't have
only the bytenr)?

Maybe, if someone from the crypto guys has a look, you could outline
them how the exact hashing structure looks for btrfs.... like is it a
full Merkle-tree starting from the super block, what about super block
copies, etc. pp. 


Thanks,
Chris.


* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21 23:55         ` Christoph Anton Mitterer
@ 2025-11-22  0:52           ` Eric Biggers
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Biggers @ 2025-11-22  0:52 UTC
  To: Christoph Anton Mitterer; +Cc: Qu Wenruo, linux-btrfs, linux-crypto

On Sat, Nov 22, 2025 at 12:55:18AM +0100, Christoph Anton Mitterer wrote:
> On Fri, 2025-11-21 at 17:14 +1030, Qu Wenruo wrote:
> > 
> > Adding linux-crypto list for more feedback.
> 
> It would be good if any of them could confirm or reject:
> 
> - Whether a filesystem that uses full checksumming (data + meta-data)
> and that is encrypted with dm-crypt,... is effectively integrity
> protected like it would be with an AEAD.

No, encrypting checksummed data using an unauthenticated encryption mode
isn't equivalent to an AEAD.

- Eric


* Re: [PATCH] btrfs-progs: docs: add warning for btrfs checksum features
  2025-11-21  6:44       ` Qu Wenruo
  2025-11-21 23:55         ` Christoph Anton Mitterer
@ 2025-11-22 14:36         ` Neal Gompa
  1 sibling, 0 replies; 8+ messages in thread
From: Neal Gompa @ 2025-11-22 14:36 UTC
  To: Qu Wenruo; +Cc: Christoph Anton Mitterer, linux-btrfs, linux-crypto

On Fri, Nov 21, 2025 at 1:44 AM Qu Wenruo <wqu@suse.com> wrote:
>
>
>
> On 2025/11/21 16:32, Christoph Anton Mitterer wrote:
> >
> >
> > On November 21, 2025 6:24:26 AM GMT+01:00, Qu Wenruo <wqu@suse.com> wrote:
> >> On 2025/11/21 15:47, Christoph Anton Mitterer wrote:
> >>> Is that even the case when the whole btrfs itself is encrypted, like in dm-crypt (without AEAD or verity, but only a normal cipher like aes-xts-plain64)?
> >>> Wouldn't an attacker then need to know how he can forge the right encrypted checksum?
> >>
> >> In that case the attacker won't even know whether it's a btrfs or not.
> >
> > I wouldn't be so sure about that; it depends on the threat model.
> > First, there's always the case of leaking metadata: people observing the list, or my access to Debian's package archives (btrfs-progs), would e.g. know that there's a good chance I'm using btrfs.
> >
> > Also, an attacker might be able to make snapshots of the offline device and see write patterns that may be typical for btrfs.
> > Even with only a single snapshot being made, with the empty device not randomised in advance, it might be clear which fs is used.
> >
> >
> > But all that's not the main point anyway.
> >
> > Even if an attacker doesn't know what's in it, he could try to silently corrupt data or replace (encrypted) blocks with ones from an older snapshot, which would then perhaps decrypt to something non-gibberish.
>
> Adding linux-crypto list for more feedback.
>
> In that case, as long as the csum tree can not be modified, btrfs can
> still detect that something was modified, no matter which algorithm is used.
>

A few years back, the Fedora Btrfs folks debated whether we should
switch the default away from crc32c to xxhash or anything else[1], and it
basically came down to the performance hit being too significant to
consider.
Given that crc32c does the job in detecting tampering with it, and you
can reinforce it with fsverity, I'm not too worried about it.

[1]: https://pagure.io/fedora-btrfs/project/issue/40



--
真実はいつも一つ！/ Always, there's only one truth!

