public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Mark Harmstone <mark@harmstone.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: add BTRFS_IOC_GET_CSUMS ioctl
Date: Sat, 21 Mar 2026 08:48:45 +1030	[thread overview]
Message-ID: <07cf5ebc-ac52-4fd9-82c5-404c0f4d6056@gmx.com> (raw)
In-Reply-To: <20260320125058.90053-1-mark@harmstone.com>



在 2026/3/20 23:20, Mark Harmstone 写道:
> Add a new unprivileged BTRFS_IOC_GET_CSUMS ioctl, which can be used to
> query the on-disk csums for a file.
> 
> This is done by userspace passing a struct btrfs_ioctl_get_csums_args to
> the kernel, which details the offset and length we're interested in, and
> a buffer for the kernel to write its results into. The kernel writes a
> struct btrfs_ioctl_get_csums_entry into the buffer, followed by the
> csums if available.
> 
> If the extent is an uncompressed, non-nodatasum extent, the kernel sets
> the entry type to BTRFS_GET_CSUMS_HAS_CSUMS and follows it with the
> csums. If it is sparse, preallocated, or beyond the EOF, it sets the
> type to BTRFS_GET_CSUMS_SPARSE - this is so userspace knows it can use
> the precomputed hash of the zero sector. Otherwise, it sets the type to
> BTRFS_GET_CSUMS_NO_CSUMS.

I'm not sure if it's a good idea to put hole and preallocated range into 
the same BTRFS_GET_CSUMS_SPARSE.

Although both means there is no csum, hole case means there is really no 
data extent, thus we should not create any extent instead of writing zero.

For preallocated, indicating it has no CSUM can allow mkfs to 
distinguish hole and preallocated, thus change to zero writes to 
prealloc, which is faster and make the resulted fs more aligned to the 
source dir.


And for EOF checks, I think we don't need to bother that much, aka, just 
let it return the regular results.

My assumption is, the mkfs shouldn't pass a range completely beyond the 
round_up(i_size), as non-reflink rootdir population would always read 
out the content of the inode from the host fs.
Thus we won't really read beyond the inode size.

> 
> We do store the csums of compressed extents, but we deliberately don't
> return them here: they're hashed over the compressed data, not the
> uncompressed data that's returned to userspace.

I agree with the skip of compressed extents, but I'd prefer to have a 
special flag to indicate that, other than NO_CSUMS.

Or mkfs is unable to distinguish hole and compressed extents.

[...]
> +/* Types for struct btrfs_ioctl_get_csums_entry::type */
> +#define BTRFS_GET_CSUMS_HAS_CSUMS	0
> +#define BTRFS_GET_CSUMS_SPARSE		1
> +#define BTRFS_GET_CSUMS_NO_CSUMS	2
> +
> +struct btrfs_ioctl_get_csums_entry {
> +	__u64 offset;		/* file offset of this range */
> +	__u64 length;		/* length in bytes */
> +	__u32 type;		/* BTRFS_GET_CSUMS_* type */
> +	__u32 reserved;		/* padding, must be 0 */
> +};
> +
> +struct btrfs_ioctl_get_csums_args {
> +	__u64 offset;		/* in/out: file offset */
> +	__u64 length;		/* in/out: range length */
> +	__u64 buf_size;		/* in/out: buffer capacity / bytes written */
> +	__u8 buf[];		/* out: entries + csum data */
> +};

 From the progs usage, it is always a single btrfs_ioctl_get_csums_entry 
at the beginning of buf[], then real buffer for csum, can we just 
combine both structures into one?

Furthermore, since we only query one extent at one time, the 
offset/length are more or less duplicated between args and entry structure.

We can just save the length into the args without the need for entry 
members (except the type).

Thanks,
Qu

> +
>   /* Flags for IOC_SHUTDOWN, must match XFS_FSOP_GOING_FLAGS_* flags. */
>   #define BTRFS_SHUTDOWN_FLAGS_DEFAULT			0x0
>   #define BTRFS_SHUTDOWN_FLAGS_LOGFLUSH			0x1
> @@ -1226,6 +1245,8 @@ enum btrfs_err_code {
>   				     struct btrfs_ioctl_encoded_io_args)
>   #define BTRFS_IOC_SUBVOL_SYNC_WAIT _IOW(BTRFS_IOCTL_MAGIC, 65, \
>   					struct btrfs_ioctl_subvol_wait)
> +#define BTRFS_IOC_GET_CSUMS _IOWR(BTRFS_IOCTL_MAGIC, 66, \
> +				  struct btrfs_ioctl_get_csums_args)
>   
>   /* Shutdown ioctl should follow XFS's interfaces, thus not using btrfs magic. */
>   #define BTRFS_IOC_SHUTDOWN	_IOR('X', 125, __u32)


  parent reply	other threads:[~2026-03-20 22:18 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 12:50 [PATCH] btrfs: add BTRFS_IOC_GET_CSUMS ioctl Mark Harmstone
2026-03-20 13:03 ` Mark Harmstone
2026-03-20 22:18 ` Qu Wenruo [this message]
2026-03-25  7:34   ` Qu Wenruo
2026-03-25 14:43     ` Mark Harmstone
2026-03-25 21:04       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=07cf5ebc-ac52-4fd9-82c5-404c0f4d6056@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=mark@harmstone.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox