From: Christoph Hellwig <hch@infradead.org>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>,
Christoph Hellwig <hch@infradead.org>,
Cyber_black <Cyberblackk@proton.me>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
Mark Fasheh <mark@fasheh.com>, Theodore Ts'o <tytso@mit.edu>,
linux-api@vger.kernel.org
Subject: Re: [RFC] fs/ioctl.c: FIBMAP requires CAP_SYS_RAWIO while FIEMAP exposes identical data unprivileged
Date: Tue, 19 May 2026 04:45:47 -0700
Message-ID: <agxNa6Whf_tcm3o6@infradead.org>
In-Reply-To: <20260519033126.GD9531@frogsfrogsfrogs>
On Mon, May 18, 2026 at 08:31:26PM -0700, Darrick J. Wong wrote:
> > The only way that I'm personally aware of to determine whether ranges
> > in two files are reflinked to each other (and the only efficient way
> > to find identical blocks to, say, archive a large directory without
> > reading all the contents) is FIEMAP. I wrote some code to do this
> > awhile back (not in production use). Yes, I realize that it might
> > have issues with dirty page cache.
> >
> > Is there some other way to do this? Could an API be added that
> > efficiently answers the actual question without revealing information
> > that shouldn't be revealed?
>
> Well, yes, we *could* make yet another ioctl, but we could also just run
> fe_physical through a one-way u64 hash function and set
> FIEMAP_EXTENT_UNKNOWN if (say) you don't have CAP_SYS_RAWIO or
> something. Then your comparison function might still work... maybe?
What is the actual use case for that dedup detection? I.e. what is
considered a duplicate? Does the application already have candidate
ranges, or does it scan the output for all files?
For XFS the rmap can directly tell you what is shared, but I can't think
of a good way to expose that. Part of the problem might be that I don't
understand what question is being asked, and why.
Note the FIEMAP output can give you the wrong answer, e.g. with XFS
and multiple devices, or for file systems that can do tail packing and
have small amounts of data for multiple files in the same block.
> Also note that FIEMAP still doesn't report devices, so you're still
> playing with fire on multi-device reflink-aware filesystems like XFS.
or even on f2fs despite the lack of reflink support if the caller is
dumb enough. All that of course depends on what the caller is doing
based on the FIEMAP output.