From: Chris Mason <chris.mason@oracle.com>
To: Nirbheek Chauhan <nirbheek.chauhan@gmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: "Appending" data to the middle of a file using btrfs-specific features
Date: Mon, 06 Dec 2010 11:05:02 -0500
Message-ID: <1291651254-sup-4263@think>
In-Reply-To: <AANLkTinXfbKmXYy52JF-QuJr0WTzgvbMXHAw-tYK4i5X@mail.gmail.com>

Excerpts from Nirbheek Chauhan's message of 2010-12-06 07:41:16 -0500:
> Hello,
>
> I'd like to know if there has been any discussion about adding a new
> feature to write (add) data at an offset, but without overwriting
> existing data, or re-writing the existing data. Essentially, in-place
> addition/removal of data to a file at a place other than the end of
> the file.
>
> Some possible use-cases of such a feature would be:
>
> (a) Databases (currently hack around this by allocating sparse files)
> (b) Delta-patching (rsync, patch, xdelta, etc)
> (c) Video editors (especially if combined with reflink copies)
>
> Besides I/O savings, it would also have significant space savings if
> the current subvolume being written to has been snapshotted (a common
> use-case for incremental backups).
>
> I've been told that the problem is somewhat difficult to solve
> properly under block-based representation of data, but I was hoping
> that btrfs' reflink mechanism and its space-efficient packing of small
> files might make it doable.
>
> A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
> (up to the offset), writing whatever data is required, and then doing
> another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
> original file. This can be followed by a rename() over the original
> file. Similarly for removing data from the middle of a file. Would
> this work? Would it be cleaner to implement something equivalent
> internally?

It would work, yes. The operation has three cases:

1) file size doesn't change
2) extend the file with new bytes in the middle
3) make the file smaller by removing bytes in the middle

#1 is the easiest case: you can just use the clone range ioctl directly.
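
A minimal sketch of what case #1 could look like from userspace, assuming
Linux and Python's fcntl module. The helper names (pack_clone_args,
clone_range) are made up for illustration, and the ioctl number is computed
from the usual _IOW encoding rather than taken from a header, so check it
against your own headers before relying on it.

```python
import fcntl
import struct

# Assumed value of BTRFS_IOC_CLONE_RANGE: _IOW(0x94, 13, 32-byte args struct).
BTRFS_IOC_CLONE_RANGE = 0x4020940D


def pack_clone_args(src_fd, src_offset, src_length, dest_offset):
    """Pack the ioctl argument: s64 src_fd, then u64 src_offset,
    u64 src_length, u64 dest_offset (native byte order, no padding)."""
    return struct.pack("=qQQQ", src_fd, src_offset, src_length, dest_offset)


def clone_range(src_fd, dest_fd, src_offset, length, dest_offset):
    """Share `length` bytes of src_fd at src_offset into dest_fd at
    dest_offset. Raises OSError if the filesystem refuses the request."""
    args = pack_clone_args(src_fd, src_offset, length, dest_offset)
    fcntl.ioctl(dest_fd, BTRFS_IOC_CLONE_RANGE, args)
```

With the replacement bytes already sitting in a scratch file, case #1 would
be a single clone_range() call into the target at the same offset. Expect
the call to fail on filesystems without reflink support, and possibly on
offsets that are not block-aligned.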

For #2 and #3, all of the file pointers past the bytes you want to add
or remove need to be updated with a new file offset. For an initial
implementation, I'd say use the IOC_CLONE_RANGE code; once everything
is working, we can look at optimizing it with a shift ioctl if it makes
sense.
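
To make the offset bookkeeping for #2 and #3 concrete, here is a rough
sketch that only plans the steps for building a new file out of the old
one. The function name plan_shift and the step tuples are hypothetical;
nothing here touches the filesystem.

```python
def plan_shift(file_size, offset, insert_len=0, remove_len=0):
    """Return the steps that build a new file from an old one.

    Each step is a tuple:
      ("clone", src_offset, length, dest_offset)  share old extents
      ("write", dest_offset, length)              write the new bytes
    """
    if not 0 <= offset <= file_size:
        raise ValueError("offset outside file")
    if insert_len < 0 or remove_len < 0 or offset + remove_len > file_size:
        raise ValueError("bad lengths")

    steps = []
    if offset > 0:
        # Head: bytes before the edit point keep their file offset.
        steps.append(("clone", 0, offset, 0))
    if insert_len > 0:
        # New bytes land right at the edit point.
        steps.append(("write", offset, insert_len))
    tail_src = offset + remove_len
    tail_len = file_size - tail_src
    if tail_len > 0:
        # Tail: same extents, but at a shifted file offset.
        steps.append(("clone", tail_src, tail_len, offset + insert_len))
    return steps
```

Inserting 4096 bytes at offset 4096 of a 12288-byte file gives a head
clone, a write, and a tail clone whose destination is shifted to 8192.
Each "clone" step would map to one clone range ioctl into the new file,
followed by the rename() over the original as proposed in the quoted
message. Clone offsets generally need to be block-aligned, so arbitrary
byte offsets may be refused.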

Of the use cases you list, video editors seem the most useful.
Databases already have things pretty much under control, and delta
patching wants to go to a new file anyway. Video editing software has
long been looking for ways to do this.

-chris