From: Gabriel de Perthuis <g2p.code@gmail.com>
To: Mark Fasheh <mfasheh@suse.de>
Cc: linux-btrfs@vger.kernel.org,
Chris Mason <chris.mason@fusionio.com>,
Josef Bacik <josef@redhat.com>, David Sterba <dsterba@suse.cz>
Subject: Re: [PATCH 0/4] btrfs: offline dedupe v2
Date: Tue, 11 Jun 2013 22:56:59 +0200
Message-ID: <51B78F1B.7000100@gmail.com>
In-Reply-To: <1370982698-757-1-git-send-email-mfasheh@suse.de>
On 11/06/2013 22:31, Mark Fasheh wrote:
> Perhaps this isn't a limitation per se but extent-same requires read/write
> access to the files we want to dedupe. During my last series I had a
> conversation with Gabriel de Perthuis about access checking where we tried
> to maintain the ability for a user to run extent-same against a readonly
> snapshot. In addition, I reasoned that since the underlying data won't
> change (at least to the user) that we ought only require the files to be
> open for read.
>
> What I found however is that neither of these is a great idea ;)
>
> - We want to require that the inode be open for writing so that an
> unprivileged user can't do things like run dedupe on a performance
> sensitive file that they might only have read access to. In addition I
> could see it as kind of a surprise (non-standard behavior) to an
> administrator that users could alter the layout of files they are only
> allowed to read.
>
> - Readonly snapshots won't let you open for write anyway (unsurprisingly,
> open() returns -EROFS). So that kind of kills the idea of them being able
> to open those files for write which we want to dedupe.
>
> That said, I still think being able to run this against a set of readonly
> snapshots makes sense especially if those snapshots are taken for backup
> purposes. I'm just not sure how we can sanely enable it.
The check could be: if (fmode_write || cap_sys_admin).
This isn't incompatible with mnt_want_write: that check operates at the
level of the superblock and vfsmount, not the subvolume fsid.
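To make the suggested condition concrete, here is a minimal userspace sketch of it. Everything in it is an assumption made for illustration: the MODEL_FMODE_WRITE constant stands in for the kernel's FMODE_WRITE, the has_admin flag stands in for a capable(CAP_SYS_ADMIN) call, and the function name and the -EPERM return value are hypothetical rather than taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel's FMODE_WRITE bit; the value is illustrative. */
#define MODEL_FMODE_WRITE 0x2u

/*
 * Hypothetical model of the proposed check on a dedupe target.
 *   f_mode:    mode bits the target file was opened with
 *   has_admin: whether the caller holds CAP_SYS_ADMIN
 * Returns 0 if dedupe may proceed, -EPERM (an assumed error code) otherwise.
 */
int dedupe_target_access(unsigned int f_mode, bool has_admin)
{
	/* Either a writable open or the admin capability is enough. */
	if ((f_mode & MODEL_FMODE_WRITE) || has_admin)
		return 0;
	return -EPERM;
}
```

With this shape, a privileged caller could still dedupe files on a readonly snapshot that it can only open for read, while an unprivileged reader would be refused.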
> Code review is very much appreciated. Thanks,
> --Mark
>
>
> ChangeLog
>
> - check that we have appropriate access to each file before deduping. For
> the source, we only check that it is opened for read. Target files have to
> be open for write.
>
> - don't dedupe on readonly submounts (this is to maintain
>
> - check that we don't dedupe files with different checksumming states
> (compare BTRFS_INODE_NODATASUM flags)
>
> - get and maintain write access to the mount during the extent same
> operation (mnt_want_write())
>
> - allocate our read buffers up front in btrfs_ioctl_file_extent_same() and
> pass them through for re-use on every call to btrfs_extent_same(). (thanks
> to David Sterba <dsterba@suse.cz> for reporting this)
>
> - As the read buffers could possibly be up to 1MB (depending on user
> request), we now conditionally vmalloc them.
>
> - removed redundant check for same inode. btrfs_extent_same() catches it now
> and bubbles the error up.
>
> - remove some unnecessary printks
>
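The per-file checks listed in this changelog (source open for read, target open for write, matching checksumming state) can be sketched as a small userspace model. This is not the patch's code: the struct, the MODEL_* constants, the function name and the error codes are all assumptions chosen for illustration.

```c
#include <errno.h>

/* Illustrative stand-ins, not the kernel's definitions. */
#define MODEL_FMODE_READ  0x1u
#define MODEL_FMODE_WRITE 0x2u
#define MODEL_NODATASUM   0x1u /* models BTRFS_INODE_NODATASUM */

struct model_file {
	unsigned int f_mode;      /* how the file was opened */
	unsigned int inode_flags; /* per-inode flags */
};

/*
 * Hypothetical pre-check before deduping src into dst.
 * Returns 0 on success; the error codes are assumptions.
 */
int extent_same_precheck(const struct model_file *src,
			 const struct model_file *dst)
{
	/* The source only needs to be open for read. */
	if (!(src->f_mode & MODEL_FMODE_READ))
		return -EPERM;
	/* Target files have to be open for write. */
	if (!(dst->f_mode & MODEL_FMODE_WRITE))
		return -EPERM;
	/* Both files must agree on whether data checksums are kept. */
	if ((src->inode_flags & MODEL_NODATASUM) !=
	    (dst->inode_flags & MODEL_NODATASUM))
		return -EINVAL;
	return 0;
}
```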
> Changes from RFC to v1:
>
> - don't error on large length value in btrfs extent-same, instead we just
> dedupe the maximum allowed. That way userspace doesn't have to worry
> about an arbitrary length limit.
>
> - btrfs_extent_same will now loop over the dedupe range at 1MB increments (for
> a total of 16MB per request)
>
> - cleaned up poorly coded while loop in __extent_read_full_page() (thanks to
> David Sterba <dsterba@suse.cz> for reporting this)
>
> - included two fixes from Gabriel de Perthuis <g2p.code@gmail.com>:
> - allow dedupe across subvolumes
> - don't lock compressed pages twice when deduplicating
>
> - removed some unused / poorly designed fields in btrfs_ioctl_same_args.
> This should also give us a bit more reserved bytes.
>
> - return -E2BIG instead of -ENOMEM when arg list is too large (thanks to
> David Sterba <dsterba@suse.cz> for reporting this)
>
> - Some more reserved bytes are now included as a result of some of my
> cleanups. Quite possibly we could add a couple more.
>
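The length handling described in the RFC-to-v1 notes (clamp an oversized request instead of failing, then walk the range in 1MB steps up to 16MB) could look roughly like the userspace sketch below. The callback type, the function names and the done out-parameter are hypothetical; only the 1MB and 16MB figures come from the changelog.

```c
#include <stdint.h>

#define CHUNK_LEN   (1024ULL * 1024)      /* 1MB per pass */
#define MAX_REQ_LEN (16ULL * 1024 * 1024) /* 16MB cap per request */

/* Hypothetical callback standing in for the per-chunk dedupe step. */
typedef int (*chunk_fn)(uint64_t off, uint64_t len, void *ctx);

/*
 * Clamp len to the cap rather than returning an error, then process the
 * range chunk by chunk. *done receives the number of bytes processed.
 * Returns 0, or the first nonzero value returned by fn.
 */
int dedupe_range(uint64_t off, uint64_t len, chunk_fn fn, void *ctx,
		 uint64_t *done)
{
	uint64_t remaining = len > MAX_REQ_LEN ? MAX_REQ_LEN : len;

	*done = 0;
	while (remaining > 0) {
		uint64_t this_len = remaining < CHUNK_LEN ? remaining : CHUNK_LEN;
		int ret = fn(off, this_len, ctx);

		if (ret)
			return ret;
		off += this_len;
		remaining -= this_len;
		*done += this_len;
	}
	return 0;
}

/* Example callback: counts how many chunks were visited. */
int count_chunk(uint64_t off, uint64_t len, void *ctx)
{
	(void)off;
	(void)len;
	(*(unsigned int *)ctx)++;
	return 0;
}
```

Reporting the processed byte count back lets userspace notice that a request was shortened and issue a follow-up call for the rest.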