* copy on write misconception
@ 2013-02-22 17:11 Mike Power
2013-02-22 17:16 ` Hugo Mills
2013-02-22 17:24 ` cwillu
0 siblings, 2 replies; 6+ messages in thread
From: Mike Power @ 2013-02-22 17:11 UTC (permalink / raw)
To: linux-btrfs
I think I have a misconception of what copy on write in btrfs means for
individual files.
I had originally thought that I could create a large file:
time dd if=/dev/zero of=10G bs=1G count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 100.071 s, 107 MB/s
real 1m41.082s
user 0m0.000s
sys 0m7.792s
Then, if I copied this file, no blocks would be copied until they were
written, so the two files would share the same blocks underneath. In
particular, the copy would be fast, since it would only need to write
some metadata. But when I copy the file:
time cp 10G 10G2
real 3m38.790s
user 0m0.124s
sys 0m10.709s
Oddly enough, it actually takes longer than the initial file creation.
So I am guessing that the long copy is expected, and that a fast copy is
not one of the virtues of btrfs copy on write. Does that sound right?
I was looking at a virtual machine solution and thought btrfs would be
great if I could copy the vm disk to a new file at low cost and then
launch that vm and customize it to my needs.
OS: Ubuntu 12.10
Mike Power
* Re: copy on write misconception
From: Hugo Mills @ 2013-02-22 17:16 UTC (permalink / raw)
To: Mike Power; +Cc: linux-btrfs
On Fri, Feb 22, 2013 at 09:11:28AM -0800, Mike Power wrote:
> I think I have a misconception of what copy on write in btrfs means
> for individual files.
>
> I had originally thought that I could create a large file:
> time dd if=/dev/zero of=10G bs=1G count=10
> 10+0 records in
> 10+0 records out
> 10737418240 bytes (11 GB) copied, 100.071 s, 107 MB/s
>
> real 1m41.082s
> user 0m0.000s
> sys 0m7.792s
>
> Then, if I copied this file, no blocks would be copied until they
> were written, so the two files would share the same blocks
> underneath. In particular, the copy would be fast, since it would
> only need to write some metadata. But when I copy the file:
> time cp 10G 10G2
>
> real 3m38.790s
> user 0m0.124s
> sys 0m10.709s
>
> Oddly enough, it actually takes longer than the initial file
> creation. So I am guessing that the long copy is expected, and that
> a fast copy is not one of the virtues of btrfs copy on write. Does
> that sound right?
You probably want cp --reflink=always, which makes a CoW copy of
the file's metadata only. The resulting files have the semantics of
two different files, but share their blocks until a part of one of
them is modified (at which point, the modified blocks are no longer
shared).
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- "I don't like the look of it, I tell you." "Well, stop ---
looking at it, then."
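The semantics described in this reply (two independent files that share blocks until one of them is modified) can be tried out on a small file. What follows is only a minimal sketch, assuming GNU cp; the scratch directory and the names orig and clone are made up for the example, and a 4 MiB file stands in for the 10 GB one. It uses --reflink=auto rather than --reflink=always so that it also runs on filesystems without reflink support, where cp quietly falls back to an ordinary copy.

```shell
#!/bin/sh
# Sketch only: assumes GNU cp; file names here are illustrative.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# A 4 MiB file stands in for the 10 GB one from the thread.
dd if=/dev/zero of=orig bs=1M count=4 2>/dev/null

# --reflink=auto shares blocks where the filesystem supports it
# and falls back to an ordinary copy elsewhere.
cp --reflink=auto orig clone

# Overwrite the first bytes of the clone in place (no truncation).
printf 'changed' | dd of=clone conv=notrunc 2>/dev/null

# The original is untouched, so the two files no longer compare equal.
if cmp -s orig clone; then
    status=identical
else
    status=differ
fi
echo "after modifying the clone, the files $status"
```

Either way the result behaves like two separate files; on btrfs only the blocks touched by the write stop being shared.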
* Re: copy on write misconception
From: cwillu @ 2013-02-22 17:24 UTC (permalink / raw)
To: Mike Power; +Cc: linux-btrfs
> Then, if I copied this file, no blocks would be copied until they were
> written, so the two files would share the same blocks underneath. In
> particular, the copy would be fast, since it would only need to write
> some metadata. But when I copy the file:
> time cp 10G 10G2
cp without options still does a regular copy; btrfs does nothing to
de-duplicate writes.
"cp --reflink 10G 10G2" will give you the results you expect.
* Re: copy on write misconception
From: Mike Power @ 2013-02-22 17:41 UTC (permalink / raw)
To: Hugo Mills, linux-btrfs
On 02/22/2013 09:16 AM, Hugo Mills wrote:
> You probably want cp --reflink=always, which makes a CoW copy of
> the file's metadata only. The resulting files have the semantics of
> two different files, but share their blocks until a part of one of
> them is modified (at which point, the modified blocks are no longer
> shared).
>
> Hugo.
>
I see, and it works great:
time cp --reflink=always 10G 10G3
real 0m0.028s
user 0m0.000s
sys 0m0.000s
So, from the user's perspective, I would rather opt out of this feature
than opt in: I want all copies by all applications done as copy on
write. But if my understanding is correct, that is up to the application
being called (in this case cp) and how it in turn makes calls to the
system.
In short, I can't remount the btrfs filesystem with some new option that
says "always copy files as copy on write", because copy on write is what
it already does.
Mike Power
* Re: copy on write misconception
From: cwillu @ 2013-02-22 18:35 UTC (permalink / raw)
To: Mike Power; +Cc: Hugo Mills, linux-btrfs
On Fri, Feb 22, 2013 at 11:41 AM, Mike Power <dodtsair@gmail.com> wrote:
> I see, and it works great:
> time cp --reflink=always 10G 10G3
>
> real 0m0.028s
> user 0m0.000s
> sys 0m0.000s
>
> So, from the user's perspective, I would rather opt out of this feature
> than opt in: I want all copies by all applications done as copy on write.
> But if my understanding is correct, that is up to the application being
> called (in this case cp) and how it in turn makes calls to the system.
>
> In short, I can't remount the btrfs filesystem with some new option that
> says "always copy files as copy on write", because copy on write is what
> it already does.
There's no "copy a file" syscall; when a program copies a file, it
opens a new file, and writes all the bytes from the old to the new.
Converting this to a reflink would require btrfs to implement full
de-dup (which is rather expensive), and still wouldn't prevent the
program from reading and writing all 10 GB (and so wouldn't be any
faster).
You can set an alias in your shell to make cp --reflink=auto the
default, but that won't affect other programs, nor other users.
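The alias mentioned above could look roughly like the sketch below, assuming GNU cp and a shell startup file such as ~/.bashrc. With --reflink=auto, cp shares blocks where the filesystem supports it and falls back to an ordinary copy otherwise, so the alias should be harmless on other filesystems. As far as I know, recent GNU coreutils releases (9.0 and later) already behave this way by default, in which case the alias is unnecessary.

```shell
# Sketch for ~/.bashrc or similar; affects only this user's interactive shells.
alias cp='cp --reflink=auto'

# Print the definition to confirm it is in place.
alias cp
```

Scripts, file managers and other programs that copy files themselves do not see shell aliases, which is the limitation described above.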
* Re: copy on write misconception
From: Mike Power @ 2013-02-23 18:30 UTC (permalink / raw)
To: cwillu; +Cc: Hugo Mills, linux-btrfs
On 02/22/2013 10:35 AM, cwillu wrote:
> There's no "copy a file" syscall; when a program copies a file, it
> opens a new file, and writes all the bytes from the old to the new.
> Converting this to a reflink would require btrfs to implement full
> de-dup (which is rather expensive), and still wouldn't prevent the
> program from reading and writing all 10 GB (and so wouldn't be any
> faster).
>
> You can set an alias in your shell to make cp --reflink=auto the
> default, but that won't affect other programs, nor other users.
Thanks for the help, guys. I learned that if I want an application to
support this behavior, it must specifically choose to implement it.
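As a rough illustration of what "choosing to implement it" can look like for a script that clones VM disk images, here is a minimal sketch, assuming GNU cp. The helper name clone_or_copy and the file names base.img and vm1.img are hypothetical, and the tiny text file merely stands in for a real disk image. The helper asks for a reflink first and falls back to a full copy only if that fails.

```shell
#!/bin/sh
# clone_or_copy is a hypothetical helper: try a reflink, else copy normally.
clone_or_copy() {
    src=$1
    dst=$2
    if cp --reflink=always "$src" "$dst" 2>/dev/null; then
        echo "reflinked"
    else
        # Filesystem (or cp) cannot reflink here; do an ordinary copy.
        cp "$src" "$dst" && echo "copied"
    fi
}

tmp=$(mktemp -d)
# Placeholder content standing in for a VM disk image.
printf 'vm disk image placeholder\n' > "$tmp/base.img"

result=$(clone_or_copy "$tmp/base.img" "$tmp/vm1.img")
echo "vm1.img was $result from base.img"
```

On btrfs the first branch should be taken and the clone is near-instant regardless of image size; elsewhere the script still works, just without the speed-up.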