linux-btrfs.vger.kernel.org archive mirror
* "Appending" data to the middle of a file using btrfs-specific features
@ 2010-12-06 12:41 Nirbheek Chauhan
  2010-12-06 16:05 ` Chris Mason
       [not found] ` <4CFE0A81.9040102@electric-spoon.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Nirbheek Chauhan @ 2010-12-06 12:41 UTC (permalink / raw)
  To: linux-btrfs

Hello,

I'd like to know if there has been any discussion about adding a new
feature to write (add) data at an offset, but without overwriting
existing data, or re-writing the existing data. Essentially, in-place
addition/removal of data to a file at a place other than the end of
the file.

Some possible use-cases of such a feature would be:

(a) Databases (currently hack around this by allocating sparse files)
(b) Delta-patching (rsync, patch, xdelta, etc)
(c) Video editors (especially if combined with reflink copies)

Besides I/O savings, it would also have significant space savings if
the current subvolume being written to has been snapshotted (a common
use-case for incremental backups).

I've been told that the problem is somewhat difficult to solve
properly under block-based representation of data, but I was hoping
that btrfs' reflink mechanism and its space-efficient packing of small
files might make it doable.

A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
(up to the offset), writing whatever data is required, and then doing
another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
original file. This can be followed by a rename() over the original
file. Similarly for removing data from the middle of a file. Would
this work? Would it be cleaner to implement something equivalent
internally?
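
The hack described above could be sketched roughly as follows. This is
only an illustration under stated assumptions: it uses the generic
FICLONERANGE request number from <linux/fs.h> (which has the same value
and argument layout as BTRFS_IOC_CLONE_RANGE), the helper names
share_range() and insert_bytes() and the ".tmp" suffix are made up for
the example, and it quietly falls back to an ordinary copy whenever the
filesystem refuses the clone. Clone requests generally need
block-aligned offsets and lengths, so on a real btrfs only aligned
inserts would actually share extents.

```python
import fcntl
import os
import struct

# FICLONERANGE from <linux/fs.h>; same number as BTRFS_IOC_CLONE_RANGE.
FICLONERANGE = 0x4020940D


def share_range(src_fd, src_off, length, dst_fd, dst_off):
    """Reflink a byte range; fall back to a plain copy if the clone fails."""
    if length == 0:
        return  # a zero length would mean "clone to EOF", which we do not want
    # struct file_clone_range { s64 src_fd; u64 src_offset, src_length, dest_offset; }
    args = struct.pack("qQQQ", src_fd, src_off, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, args)
        return
    except OSError:
        pass  # no reflink support here, or the range is not block-aligned
    done = 0
    while done < length:
        chunk = os.pread(src_fd, min(1 << 20, length - done), src_off + done)
        if not chunk:
            break
        written = 0
        while written < len(chunk):
            written += os.pwrite(dst_fd, chunk[written:], dst_off + done + written)
        done += len(chunk)


def insert_bytes(path, offset, data):
    """Rebuild `path` with `data` inserted at `offset`, then rename over it."""
    size = os.path.getsize(path)
    if not 0 <= offset <= size:
        raise ValueError("offset outside file")
    tmp = path + ".tmp"  # hypothetical temporary name
    src = os.open(path, os.O_RDONLY)
    dst = os.open(tmp, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        share_range(src, 0, offset, dst, 0)      # everything before the offset
        os.pwrite(dst, data, offset)             # the new bytes
        share_range(src, offset, size - offset,  # the rest of the original
                    dst, offset + len(data))
        os.fsync(dst)
    finally:
        os.close(src)
        os.close(dst)
    os.rename(tmp, path)  # note: ownership and mode of the original are not preserved
```

Removing data from the middle would be the same pattern with the second
share_range() call starting further into the source and no pwrite().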

Thanks!

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 12:41 "Appending" data to the middle of a file using btrfs-specific features Nirbheek Chauhan
@ 2010-12-06 16:05 ` Chris Mason
  2010-12-06 19:14   ` Nirbheek Chauhan
  2010-12-07  7:50   ` Andrey Kuzmin
       [not found] ` <4CFE0A81.9040102@electric-spoon.com>
  1 sibling, 2 replies; 10+ messages in thread
From: Chris Mason @ 2010-12-06 16:05 UTC (permalink / raw)
  To: Nirbheek Chauhan; +Cc: linux-btrfs

Excerpts from Nirbheek Chauhan's message of 2010-12-06 07:41:16 -0500:
> Hello,
> 
> I'd like to know if there has been any discussion about adding a new
> feature to write (add) data at an offset, but without overwriting
> existing data, or re-writing the existing data. Essentially, in-place
> addition/removal of data to a file at a place other than the end of
> the file.
> 
> Some possible use-cases of such a feature would be:
> 
> (a) Databases (currently hack around this by allocating sparse files)
> (b) Delta-patching (rsync, patch, xdelta, etc)
> (c) Video editors (especially if combined with reflink copies)
> 
> Besides I/O savings, it would also have significant space savings if
> the current subvolume being written to has been snapshotted (a common
> use-case for incremental backups).
> 
> I've been told that the problem is somewhat difficult to solve
> properly under block-based representation of data, but I was hoping
> that btrfs' reflink mechanism and its space-efficient packing of small
> files might make it doable.
> 
> A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
> (up to the offset), writing whatever data is required, and then doing
> another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
> original file. This can be followed by a rename() over the original
> file. Similarly for removing data from the middle of a file. Would
> this work? Would it be cleaner to implement something equivalent
> internally?

It would work yes.  The operation has three cases:

1) file size doesn't change
2) extend the file with new bytes in the middle
3) make the file smaller removing bytes in the middle

#1 is the easiest case, you can just use the clone range ioctl directly

For #2 and #3, all of the file pointers past the bytes you want to add
or remove need to be updated with a new file offset.  I'd say for an
initial implementation to use the IOC_CLONE_RANGE code, and after
everything is working we can look at optimizing it with a shift ioctl if
it makes sense.
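
To make the "update the file pointers with a new file offset" step for
#2 and #3 concrete, here is a toy model in Python. It is not how btrfs
stores extents; the Extent class and the two helper names are invented
for this sketch, and it only tracks (file offset, disk location, length)
triples to show that the data itself stays put while the offsets shift.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    file_offset: int  # where the extent appears in the file
    disk_start: int   # where its data lives on disk (never moves here)
    length: int


def insert_gap(extents, at, nbytes):
    """Case 2: make room for nbytes of new data at file offset `at`."""
    out = []
    for e in extents:
        end = e.file_offset + e.length
        if end <= at:
            out.append(e)  # entirely before the insertion point
        elif e.file_offset >= at:
            out.append(Extent(e.file_offset + nbytes, e.disk_start, e.length))
        else:
            head = at - e.file_offset  # extent straddles the point: split it
            out.append(Extent(e.file_offset, e.disk_start, head))
            out.append(Extent(at + nbytes, e.disk_start + head, e.length - head))
    return out


def remove_range(extents, at, nbytes):
    """Case 3: drop nbytes starting at file offset `at`."""
    out = []
    cut_end = at + nbytes
    for e in extents:
        start, end = e.file_offset, e.file_offset + e.length
        if start < at:  # keep the part before the cut
            out.append(Extent(start, e.disk_start, min(end, at) - start))
        if end > cut_end:  # keep the part after the cut, shifted down
            skip = max(cut_end - start, 0)
            out.append(Extent(start + skip - nbytes, e.disk_start + skip,
                              e.length - skip))
    return out
```

In this model only the bookkeeping changes; a shift ioctl would in
effect do the same offset arithmetic inside the filesystem.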

Of the use cases you list, video editors seem the most useful.
Databases already have things pretty much under control, and delta
patching wants to go to a new file anyway.  Video editing software has
long been looking for ways to do this.

-chris

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 16:05 ` Chris Mason
@ 2010-12-06 19:14   ` Nirbheek Chauhan
  2010-12-06 19:33     ` Chris Mason
  2010-12-06 19:35     ` Freddie Cash
  2010-12-07  7:50   ` Andrey Kuzmin
  1 sibling, 2 replies; 10+ messages in thread
From: Nirbheek Chauhan @ 2010-12-06 19:14 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

On Mon, Dec 6, 2010 at 9:35 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Nirbheek Chauhan's message of 2010-12-06 07:41:16 -0500:
[snip]
>> Some possible use-cases of such a feature would be:
>>
>> (a) Databases (currently hack around this by allocating sparse files)
>> (b) Delta-patching (rsync, patch, xdelta, etc)
>> (c) Video editors (especially if combined with reflink copies)
>>
>> Besides I/O savings, it would also have significant space savings if
>> the current subvolume being written to has been snapshotted (a common
>> use-case for incremental backups).
>>
[snip]
>> A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
>> (up to the offset), writing whatever data is required, and then doing
>> another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
>> original file. This can be followed by a rename() over the original
>> file. Similarly for removing data from the middle of a file. Would
>> this work? Would it be cleaner to implement something equivalent
>> internally?
>
> It would work yes.  The operation has three cases:
>
> 1) file size doesn't change
> 2) extend the file with new bytes in the middle
> 3) make the file smaller removing bytes in the middle
>
> #1 is the easiest case, you can just use the clone range ioctl directly
>
> For #2 and #3, all of the file pointers past the bytes you want to add
> or remove need to be updated with a new file offset.  I'd say for an
> initial implementation to use the IOC_CLONE_RANGE code, and after
> everything is working we can look at optimizing it with a shift ioctl if
> it makes sense.
>

Alrighty, I'll try this and report back any bugs and/or suggestions.

> Of the use cases you list, video editors seem the most useful.
> Databases already have things pretty much under control, and delta
> patching wants to go to a new file anyway.  Video editing software has
> long been looking for ways to do this.
>

As an aside, my primary motivation for this was that doing an
incremental backup of things like git bare repositories and databases
using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
though rsync calculates a binary delta before transferring data, it
has to write everything out (except if just appending). So in that
case, each "incremental" backup is hardly so.

Thanks for your help! :)

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 19:14   ` Nirbheek Chauhan
@ 2010-12-06 19:33     ` Chris Mason
  2010-12-06 19:35     ` Freddie Cash
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Mason @ 2010-12-06 19:33 UTC (permalink / raw)
  To: Nirbheek Chauhan; +Cc: linux-btrfs

Excerpts from Nirbheek Chauhan's message of 2010-12-06 14:14:59 -0500:
> On Mon, Dec 6, 2010 at 9:35 PM, Chris Mason <chris.mason@oracle.com> wrote:
> > Excerpts from Nirbheek Chauhan's message of 2010-12-06 07:41:16 -0500:
> [snip]
> >> Some possible use-cases of such a feature would be:
> >>
> >> (a) Databases (currently hack around this by allocating sparse files)
> >> (b) Delta-patching (rsync, patch, xdelta, etc)
> >> (c) Video editors (especially if combined with reflink copies)
> >>
> >> Besides I/O savings, it would also have significant space savings if
> >> the current subvolume being written to has been snapshotted (a common
> >> use-case for incremental backups).
> >>
> [snip]
> >> A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
> >> (up to the offset), writing whatever data is required, and then doing
> >> another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
> >> original file. This can be followed by a rename() over the original
> >> file. Similarly for removing data from the middle of a file. Would
> >> this work? Would it be cleaner to implement something equivalent
> >> internally?
> >
> > It would work yes.  The operation has three cases:
> >
> > 1) file size doesn't change
> > 2) extend the file with new bytes in the middle
> > 3) make the file smaller removing bytes in the middle
> >
> > #1 is the easiest case, you can just use the clone range ioctl directly
> >
> > For #2 and #3, all of the file pointers past the bytes you want to add
> > or remove need to be updated with a new file offset.  I'd say for an
> > initial implementation to use the IOC_CLONE_RANGE code, and after
> > everything is working we can look at optimizing it with a shift ioctl if
> > it makes sense.
> >
>
> Alrighty, I'll try this and report back any bugs and/or suggestions.
>
> > Of the use cases you list, video editors seem the most useful.
> > Databases already have things pretty much under control, and delta
> > patching wants to go to a new file anyway.  Video editing software has
> > long been looking for ways to do this.
> >
>
> As an aside, my primary motivation for this was that doing an
> incremental backup of things like git bare repositories and databases
> using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
> though rsync calculates a binary delta before transferring data, it
> has to write everything out (except if just appending). So in that
> case, each "incremental" backup is hardly so.

Oh, I see what you mean.  Yes that is definitely an interesting use
case.

-chris

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 19:14   ` Nirbheek Chauhan
  2010-12-06 19:33     ` Chris Mason
@ 2010-12-06 19:35     ` Freddie Cash
  2010-12-06 20:30       ` Nirbheek Chauhan
  1 sibling, 1 reply; 10+ messages in thread
From: Freddie Cash @ 2010-12-06 19:35 UTC (permalink / raw)
  To: Nirbheek Chauhan; +Cc: Chris Mason, linux-btrfs

On Mon, Dec 6, 2010 at 11:14 AM, Nirbheek Chauhan
<nirbheek.chauhan@gmail.com> wrote:
> As an aside, my primary motivation for this was that doing an
> incremental backup of things like git bare repositories and databases
> using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
> though rsync calculates a binary delta before transferring data, it
> has to write everything out (except if just appending). So in that
> case, each "incremental" backup is hardly so.

Since btrfs is Copy-on-Write, have you experimented with --inplace on
the rsync command-line?  That way, rsync writes the changes "over-top"
of the existing file, thus allowing btrfs to only write out the blocks
that have changed, via CoW?

We do this with our ZFS rsync backups, and found disk usage to go way
down over the default "write out new data to new file, rename overtop"
method that rsync uses.

There's also the --no-whole-file option which causes rsync to only
send delta changes for existing files, another useful feature with CoW
filesystems.

-- 
Freddie Cash
fjwcash@gmail.com

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 19:35     ` Freddie Cash
@ 2010-12-06 20:30       ` Nirbheek Chauhan
  2010-12-06 20:42         ` Freddie Cash
  0 siblings, 1 reply; 10+ messages in thread
From: Nirbheek Chauhan @ 2010-12-06 20:30 UTC (permalink / raw)
  To: Freddie Cash; +Cc: Chris Mason, linux-btrfs

On Tue, Dec 7, 2010 at 1:05 AM, Freddie Cash <fjwcash@gmail.com> wrote:
> On Mon, Dec 6, 2010 at 11:14 AM, Nirbheek Chauhan
> <nirbheek.chauhan@gmail.com> wrote:
>> As an aside, my primary motivation for this was that doing an
>> incremental backup of things like git bare repositories and databases
>> using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
>> though rsync calculates a binary delta before transferring data, it
>> has to write everything out (except if just appending). So in that
>> case, each "incremental" backup is hardly so.
>
> Since btrfs is Copy-on-Write, have you experimented with --inplace on
> the rsync command-line?  That way, rsync writes the changes "over-top"
> of the existing file, thus allowing btrfs to only write out the blocks
> that have changed, via CoW?
>
> We do this with our ZFS rsync backups, and found disk usage to go way
> down over the default "write out new data to new file, rename overtop"
> method that rsync uses.
>
> There's also the --no-whole-file option which causes rsync to only
> send delta changes for existing files, another useful feature with CoW
> filesystems.
>

I had tried the --inplace option, but it didn't seem to do anything
for me, so I didn't explore that further. However, after following
your suggestion and retrying with --no-whole-file, I see that the
behaviour is quite different! It seems that --whole-file is enabled by
default for local file transfers, and so --inplace had no effect.

But the behaviour of --inplace is not entirely to write out *only* the
blocks that have changed. From what I could make out, it does the
following:

(1) Calculate a delta b/w the src and trg files
(2) Seek to the first difference in the target file
(3) Start writing data

I'm glossing over the final step because I didn't look deeper, but I
think you can safely assume that after the first difference, all data
is rewritten. So this is halfway between "rewrite the whole file" and
"write only the changed bits into the file". It doesn't actually use
any CoW features from what I can see. There is lots of room for btrfs
reflinking magic. :)

Note that I tested this behaviour on a btrfs partition with a vanilla
rsync-3.0.7 tarball; the copy you use with ZFS might be doing some CoW
magic.

Thanks for the tip!

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 20:30       ` Nirbheek Chauhan
@ 2010-12-06 20:42         ` Freddie Cash
  2010-12-07  7:38           ` Nirbheek Chauhan
  0 siblings, 1 reply; 10+ messages in thread
From: Freddie Cash @ 2010-12-06 20:42 UTC (permalink / raw)
  To: Nirbheek Chauhan; +Cc: Chris Mason, linux-btrfs

On Mon, Dec 6, 2010 at 12:30 PM, Nirbheek Chauhan
<nirbheek.chauhan@gmail.com> wrote:
> On Tue, Dec 7, 2010 at 1:05 AM, Freddie Cash <fjwcash@gmail.com> wrote:
>> On Mon, Dec 6, 2010 at 11:14 AM, Nirbheek Chauhan
>> <nirbheek.chauhan@gmail.com> wrote:
>>> As an aside, my primary motivation for this was that doing an
>>> incremental backup of things like git bare repositories and databases
>>> using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
>>> though rsync calculates a binary delta before transferring data, it
>>> has to write everything out (except if just appending). So in that
>>> case, each "incremental" backup is hardly so.
>>
>> Since btrfs is Copy-on-Write, have you experimented with --inplace on
>> the rsync command-line?  That way, rsync writes the changes "over-top"
>> of the existing file, thus allowing btrfs to only write out the blocks
>> that have changed, via CoW?
>>
>> We do this with our ZFS rsync backups, and found disk usage to go way
>> down over the default "write out new data to new file, rename overtop"
>> method that rsync uses.
>>
>> There's also the --no-whole-file option which causes rsync to only
>> send delta changes for existing files, another useful feature with CoW
>> filesystems.
>>
> I had tried the --inplace option, but it didn't seem to do anything
> for me, so I didn't explore that further. However, after following
> your suggestion and retrying with --no-whole-file, I see that the
> behaviour is quite different! It seems that --whole-file is enabled by
> default for local file transfers, and so --inplace had no effect.

Yes, correct, --whole-file is used for local transfers since it's
assumed you have all the disk I/O in the world, so why try to limit
the amount of data transferred.  :)

> But the behaviour of --inplace is not entirely to write out *only* the
> blocks that have changed. From what I could make out, it does the
> following:
>
> (1) Calculate a delta b/w the src and trg files
> (2) Seek to the first difference in the target file
> (3) Start writing data

That may be true, I've never looked into the actual algorithm(s) that
rsync uses.  Just played around with CLI options until we found the
set that works best in our situation (--inplace --delete-during
--no-whole-file --numeric-ids --hard-links --archive, over SSH with
HPN patches).

> I'm glossing over the final step because I didn't look deeper, but I
> think you can safely assume that after the first difference, all data
> is rewritten. So this is halfway between "rewrite the whole file" and
> "write only the changed bits into the file". It doesn't actually use
> any CoW features from what I can see. There is lots of room for btrfs
> reflinking magic. :)
>
> Note that I tested this behaviour on a btrfs partition with a vanilla
> rsync-3.0.7 tarball; the copy you use with ZFS might be doing some CoW
> magic.

All the CoW "magic" is handled by the filesystem, and not the tools on
top.  If the tool only updates X bytes, which fit into 1 block on the
fs, then only that 1 block gets updated via CoW.

Personally, I don't think the tools need to be updated to understand
CoW or to integrate with the underlying FS.  Instead, they should just
operate on blocks of X size, and let the FS figure out what to do.

Otherwise, you end up with "rsync for ZFS", "rsync for BtrFS", "rsync
for FAT32", etc.

But, I'm just a lowly sysadmin, what do I know about filesystem internals?  ;)


-- 
Freddie Cash
fjwcash@gmail.com

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 20:42         ` Freddie Cash
@ 2010-12-07  7:38           ` Nirbheek Chauhan
  0 siblings, 0 replies; 10+ messages in thread
From: Nirbheek Chauhan @ 2010-12-07  7:38 UTC (permalink / raw)
  To: Freddie Cash; +Cc: Chris Mason, linux-btrfs

On Tue, Dec 7, 2010 at 2:12 AM, Freddie Cash <fjwcash@gmail.com> wrote:
> On Mon, Dec 6, 2010 at 12:30 PM, Nirbheek Chauhan
> <nirbheek.chauhan@gmail.com> wrote:
>> But the behaviour of --inplace is not entirely to write out *only* the
>> blocks that have changed. From what I could make out, it does the
>> following:
>>
>> (1) Calculate a delta b/w the src and trg files
>> (2) Seek to the first difference in the target file
>> (3) Start writing data
>
> That may be true, I've never looked into the actual algorithm(s) that
> rsync uses.  Just played around with CLI options until we found the
> set that works best in our situation (--inplace --delete-during
> --no-whole-file --numeric-ids --hard-links --archive, over SSH with
> HPN patches).
>
>> I'm glossing over the final step because I didn't look deeper, but I
>> think you can safely assume that after the first difference, all data
>> is rewritten. So this is halfway between "rewrite the whole file" and
>> "write only the changed bits into the file". It doesn't actually use
>> any CoW features from what I can see. There is lots of room for btrfs
>> reflinking magic. :)
>>
>> Note that I tested this behaviour on a btrfs partition with a vanilla
>> rsync-3.0.7 tarball; the copy you use with ZFS might be doing some CoW
>> magic.
>
> All the CoW "magic" is handled by the filesystem, and not the tools on
> top.  If the tool only updates X bytes, which fit into 1 block on the
> fs, then only that 1 block gets updated via CoW.
>

I'm quite sure that's what happens in btrfs too, but the thing about
updating in-place is that if you have

ABCDXXXEFGH

which needs to change to

ABCDZZZEFGH

You're all good. Only the blocks corresponding to XXX will be updated.
But if the change is

ABCDZZZZEFGH

You'll need to start rewriting EFGH since there's no way to insert
data in the middle (afaik) of a file with standard syscalls. Maybe
later you get a set of changes which sync you up with the file's
contents again, but the chances of that happening in a large file are
quite remote. That's why I said that it can be safely assumed that
after the first difference, all data is rewritten.
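
A small Python sketch of the point, under the assumption that an
in-place updater compares fixed-size blocks at the same offsets and
rewrites every block that differs. The function name changed_blocks()
is made up for this example, and a block size of 1 is used only so the
letters above can stand in for blocks.

```python
def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Count the fixed-size blocks an in-place updater would have to rewrite."""
    longest = max(len(old), len(new))
    nblocks = -(-longest // block_size)  # ceiling division
    changed = 0
    for i in range(nblocks):
        lo, hi = i * block_size, (i + 1) * block_size
        if old[lo:hi] != new[lo:hi]:
            changed += 1
    return changed


old = b"ABCDXXXEFGH"
print(changed_blocks(old, b"ABCDZZZEFGH", block_size=1))   # 3
print(changed_blocks(old, b"ABCDZZZZEFGH", block_size=1))  # 8
```

The same-size replacement touches only the three XXX blocks, while the
one-byte insertion makes every block from the first difference to the
end of the file look different.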

The only way to get around this on the filesystem level that I can
think of is data de-duplication; the filesystem doesn't let go of the
blocks for a while, and does reflinking if the same data is written
again. Perhaps that's what ZFS is doing, I have no idea :)

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team

* Re: "Appending" data to the middle of a file using btrfs-specific features
  2010-12-06 16:05 ` Chris Mason
  2010-12-06 19:14   ` Nirbheek Chauhan
@ 2010-12-07  7:50   ` Andrey Kuzmin
  1 sibling, 0 replies; 10+ messages in thread
From: Andrey Kuzmin @ 2010-12-07  7:50 UTC (permalink / raw)
  To: Chris Mason; +Cc: Nirbheek Chauhan, linux-btrfs

On Mon, Dec 6, 2010 at 7:05 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Nirbheek Chauhan's message of 2010-12-06 07:41:16 -0500:
>> Hello,
>>
>> I'd like to know if there has been any discussion about adding a new
>> feature to write (add) data at an offset, but without overwriting
>> existing data, or re-writing the existing data. Essentially, in-place
>> addition/removal of data to a file at a place other than the end of
>> the file.
>>
>> Some possible use-cases of such a feature would be:
>>
>> (a) Databases (currently hack around this by allocating sparse files)
>> (b) Delta-patching (rsync, patch, xdelta, etc)
>> (c) Video editors (especially if combined with reflink copies)
>>
>> Besides I/O savings, it would also have significant space savings if
>> the current subvolume being written to has been snapshotted (a common
>> use-case for incremental backups).
>>
>> I've been told that the problem is somewhat difficult to solve
>> properly under block-based representation of data, but I was hoping
>> that btrfs' reflink mechanism and its space-efficient packing of small
>> files might make it doable.
>>
>> A hack I can think of is to do a BTRFS_IOC_CLONE_RANGE into a new file
>> (up to the offset), writing whatever data is required, and then doing
>> another BTRFS_IOC_CLONE_RANGE with an offset for the rest of the
>> original file. This can be followed by a rename() over the original
>> file. Similarly for removing data from the middle of a file. Would
>> this work? Would it be cleaner to implement something equivalent
>> internally?
>
> It would work yes.  The operation has three cases:
>
> 1) file size doesn't change
> 2) extend the file with new bytes in the middle
> 3) make the file smaller removing bytes in the middle
>
> #1 is the easiest case, you can just use the clone range ioctl directly

This doesn't seem to be interesting; it looks just like a traditional COW overwrite.

>
> For #2 and #3, all of the file pointers past the bytes you want to add
> or remove need to be updated with a new file offset.  I'd say for an
> initial implementation to use the IOC_CLONE_RANGE code, and after
> everything is working we can look at optimizing it with a shift ioctl if
> it makes sense.

Not sure how btrfs implements versioned B-trees, but other
snapshot-capable file systems I'm aware of use a DITTO B-tree entry
that says "for this range, consult the previous version tree". One can
imagine a DITTO(n) extension that says "subtract n from the look-up
key and then consult the previous version tree", effectively achieving
range-shift behavior. FWIW.
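
A toy illustration of the DITTO(n) idea in Python, with the caveat that
everything here is hypothetical: the VersionTree class, its flat list of
entries (standing in for a real B-tree), and the method names are
invented for the sketch and do not describe btrfs or any other
filesystem's actual format.

```python
class VersionTree:
    """One version of a file; entries map key ranges to data or DITTO markers."""

    def __init__(self, parent=None):
        self.parent = parent
        self.entries = []  # (start, end, kind, payload)

    def add_data(self, start, data):
        self.entries.append((start, start + len(data), "data", data))

    def add_ditto(self, start, end, shift=0):
        # DITTO(shift): subtract `shift` from the key, then ask the parent.
        self.entries.append((start, end, "ditto", shift))

    def lookup(self, key):
        for start, end, kind, payload in self.entries:
            if start <= key < end:
                if kind == "data":
                    return payload[key - start]
                return self.parent.lookup(key - payload)
        raise KeyError(key)

    def read(self, start, end):
        return bytes(self.lookup(k) for k in range(start, end))


v1 = VersionTree()
v1.add_data(0, b"ABCDEFGH")

v2 = VersionTree(parent=v1)   # insert "ZZZ" at offset 4 without copying EFGH
v2.add_ditto(0, 4)            # plain DITTO: same keys as the previous version
v2.add_data(4, b"ZZZ")
v2.add_ditto(7, 11, shift=3)  # DITTO(3): keys 7..10 map to 4..7 in v1

print(v2.read(0, 11))  # b'ABCDZZZEFGH'
```

Under the same assumptions, removing bytes would be a DITTO entry with a
negative shift.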

Regards,
Andrey


>
> Of the use cases you list, video editors seem the most useful.
> Databases already have things pretty much under control, and delta
> patching wants to go to a new file anyway.  Video editing software has
> long been looking for ways to do this.
>
> -chris

* Re: "Appending" data to the middle of a file using btrfs-specific features
       [not found] ` <4CFE0A81.9040102@electric-spoon.com>
@ 2010-12-07 11:29   ` Nirbheek Chauhan
  0 siblings, 0 replies; 10+ messages in thread
From: Nirbheek Chauhan @ 2010-12-07 11:29 UTC (permalink / raw)
  To: David Pottage; +Cc: Chris Mason, linux-btrfs

[I think the mail was sent to just me due to a reply-accident, I've
re-added the mailing list for this reply]

On Tue, Dec 7, 2010 at 3:50 PM, David Pottage <david@electric-spoon.com> wrote:
> On 06/12/10 12:41, Nirbheek Chauhan wrote:
>>
>> I'd like to know if there has been any discussion about adding a new
>> feature to write (add) data at an offset, but without overwriting
>> existing data, or re-writing the existing data. Essentially, in-place
>> addition/removal of data to a file at a place other than the end of
>> the file.
>>
>> Some possible use-cases of such a feature would be:
>>
>> (a) Databases (currently hack around this by allocating sparse files)
>> (b) Delta-patching (rsync, patch, xdelta, etc)
>> (c) Video editors (especially if combined with reflink copies)
>>
>> Besides I/O savings, it would also have significant space savings if
>> the current subvolume being written to has been snapshotted (a common
>> use-case for incremental backups).
>>
>
> This idea was discussed back in June. (Search the archives for "Complex
> filesystem operations: split and join".)
>
> Back then the idea was to achieve insertion and removal of data by splitting
> and joining existing files, so to insert data in the middle of a file, you
> would cut it in two, append data to the first file and then re-join it.
>

Aha, I searched the archives and I found the thread in question[1],
thanks! The original thread seems to have gone for a split/join
implementation that would work with vfat along with a new syscall.

> I think that direct insertion and removal of data is a cleaner idea, though
> it may result in a more complex API. You could still achieve cutting files
> into two by creating a COW copy of the file and truncating one, and removing
> a block of bytes from the start of the other.
>

I agree, being able to manipulate a file stream in a way similar to
inserting/deleting in linked lists would introduce new possibilities
(and challenges, I'm sure). As you mentioned in the original thread,
it's quite strange that there's no way to do this with the current
file API.

> I still think it would be a good idea to be able to join files together with
> a file system API call, so the equivalent of:
>
>    cat track1.mp3 track2.mp3 track3.mp3 > mix_tape.mp3
>
> Could be done as a filesystem call to create mix_tape.mp3 as a de-duplicated
> copy of the contents of the three source files, without many megabytes of
> I/O.
>

Ah, this is relatively straightforward with the clone_range ioctl.
There was some talk about a reflink() or clone() syscall a while
ago[2], perhaps that could be extended as reflink_range() so that it
could be used with other filesystems which support reflinks as well.
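
For illustration, joining files with the clone-range ioctl might look
something like the Python sketch below. The assumptions: it uses the
generic FICLONERANGE request number (same value as
BTRFS_IOC_CLONE_RANGE), the names share_range() and concat_files() are
invented here, and it falls back to a normal copy when the clone is
refused. Since clone offsets generally have to be block-aligned, only
sources whose sizes are multiples of the block size would leave the next
file at a clonable offset; the others would be copied.

```python
import fcntl
import os
import struct

# FICLONERANGE from <linux/fs.h>; same number as BTRFS_IOC_CLONE_RANGE.
FICLONERANGE = 0x4020940D


def share_range(src_fd, length, dst_fd, dst_off):
    """Clone the first `length` bytes of src into dst; copy if cloning fails."""
    if length == 0:
        return  # a zero length would mean "clone to EOF"
    args = struct.pack("qQQQ", src_fd, 0, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, args)
        return
    except OSError:
        pass  # unsupported filesystem or unaligned destination offset
    done = 0
    while done < length:
        chunk = os.pread(src_fd, min(1 << 20, length - done), done)
        if not chunk:
            break
        written = 0
        while written < len(chunk):
            written += os.pwrite(dst_fd, chunk[written:], dst_off + done + written)
        done += len(chunk)


def concat_files(sources, dest):
    """Rough equivalent of `cat sources... > dest`; returns the total size."""
    out = os.open(dest, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    pos = 0
    try:
        for path in sources:
            src = os.open(path, os.O_RDONLY)
            try:
                size = os.fstat(src).st_size
                share_range(src, size, out, pos)
                pos += size
            finally:
                os.close(src)
    finally:
        os.close(out)
    return pos
```

With the file names from the quoted example, the call would be
concat_files(["track1.mp3", "track2.mp3", "track3.mp3"], "mix_tape.mp3").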

1. http://thread.gmane.org/gmane.linux.kernel/996835
2. http://lwn.net/Articles/333783/

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team

Thread overview: 10+ messages
2010-12-06 12:41 "Appending" data to the middle of a file using btrfs-specific features Nirbheek Chauhan
2010-12-06 16:05 ` Chris Mason
2010-12-06 19:14   ` Nirbheek Chauhan
2010-12-06 19:33     ` Chris Mason
2010-12-06 19:35     ` Freddie Cash
2010-12-06 20:30       ` Nirbheek Chauhan
2010-12-06 20:42         ` Freddie Cash
2010-12-07  7:38           ` Nirbheek Chauhan
2010-12-07  7:50   ` Andrey Kuzmin
     [not found] ` <4CFE0A81.9040102@electric-spoon.com>
2010-12-07 11:29   ` Nirbheek Chauhan