From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: "Matwey V. Kornilov" <matwey.kornilov@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: btrfs: obtain block checksums from user space
Date: Thu, 24 Sep 2015 14:35:41 -0400
Message-ID: <5604427D.1000708@gmail.com>
In-Reply-To: <mu1e38$io1$1@ger.gmane.org>

On 2015-09-24 14:06, Matwey V. Kornilov wrote:
>
> Hello,
>
> I would like to read the list of checksums for a specific file
> stored on a btrfs filesystem. I think I could use the checksums in the
> way rsync does, but save both CPU (because the csums are already
> calculated for the file) and I/O (because I don't need to reread the
> whole file from the hard drive).
As of right now, there is no way to do this from userspace without just 
directly parsing the on-disk format (which isn't safe or reliable if the 
filesystem is mounted). It has been discussed before, but the 
discussions haven't really gotten anywhere.

It's worth noting that btrfs computes checksums per block, not per file.
This means that:
a. I think (I'm not 100% certain about this) that the checksum in btrfs 
includes the padding up to the end of the block for blocks that aren't full.
b. Files that get stored in-line in their metadata block won't have a 
checksum just for the file data (because the checksum will cover the 
whole metadata block).
c. While it is possible with some checksum algorithms (if I remember 
right, CRC32c is one such algorithm, and that is what btrfs uses for 
its checksums) to combine the checksums from a group of data blocks to 
get the checksum for the data as a whole, this in and of itself takes a 
significant amount of CPU time for large amounts of data.
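
As a rough illustration of points a and c, the sketch below computes CRC32c
per block in plain Python. It is only a sketch under stated assumptions: the
4096-byte block size, the zero padding of the last block, and the names
crc32c and block_checksums are all chosen for illustration, and nothing here
reads the checksums btrfs actually stores on disk.

```python
# Illustrative sketch only: per-block CRC32c computed in userspace.
# Assumptions (not confirmed by the message above): 4096-byte blocks and
# zero padding of a short final block. Function names are hypothetical.

def _make_table():
    """Build the byte-wise lookup table for the reflected Castagnoli polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data, crc=0):
    """Return the CRC32c of data; pass a previous result as crc to continue it."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(data, block_size=4096):
    """Return one CRC32c per block, zero-padding a short final block."""
    sums = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size].ljust(block_size, b"\0")
        sums.append(crc32c(block))
    return sums


data = b"a" * 5000
per_block = block_checksums(data)   # two values, the second covers padding
whole = crc32c(data)                # checksum of the file contents alone
```

With padding included, the last per-block value covers bytes that are not
part of the file, so the per-block list cannot simply be compared against a
checksum of the file contents. Feeding a running value back in through the
crc argument reproduces the whole-data checksum, but that means reading all
of the data again; merging already-computed per-block values needs a separate
combine step, which is not shown here.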

All in all, this means that if you just want a checksum of the contents 
of the file, it's almost certainly better to just do it in userspace.
If you're trying to figure out what changed, using send/receive and 
snapshots is more efficient (usually).
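
For the plain "checksum of the file contents" case, a minimal userspace
sketch in Python might look like the following. The choice of SHA-256, the
1 MiB chunk size, and the names digest_stream and digest_file are assumptions
made for illustration; any hash and chunk size would do.

```python
# Illustrative sketch only: hash a file's contents in userspace, reading in
# fixed-size chunks so memory use stays flat regardless of file size.
import hashlib


def digest_stream(stream, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of everything readable from a binary stream."""
    hasher = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
    return hasher.hexdigest()


def digest_file(path, algorithm="sha256"):
    """Return the hex digest of the file at path."""
    with open(path, "rb") as handle:
        return digest_stream(handle, algorithm)
```

This still reads the whole file, so it saves no I/O; it only avoids
depending on the filesystem's internal checksums.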
>
> I've looked through the Linux kernel sources and not found an appropriate
> ioctl to do this. Frankly speaking, I've not found good documentation for
> all the available btrfs ioctls.
I agree that this documentation really needs to be improved (if you want 
to take the time to figure out how it all works, patches for the 
documentation would be greatly appreciated).

