linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone
@ 2015-10-20 13:29 Timofey Titovets
  2015-10-20 14:56 ` Filipe Manana
  2015-10-23  0:14 ` Zygo Blaxell
  0 siblings, 2 replies; 5+ messages in thread
From: Timofey Titovets @ 2015-10-20 13:29 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1172 bytes --]

For performance reason, leave data at the start of disk, is preferable
while deduping
It's might sense for the reasons:
1. Spinning rust - start of the disk is much faster
2. Btrfs can deallocate empty data chunk from the end of fs - ie it's compact fs

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/ioctl.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3e3e613..3eb77c0 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src,
u64 loff, u64 olen,

  /* pass original length for comparison so we stay within i_size */
  ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
- if (ret == 0)
-     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
+ if (ret == 0) {
+     /* prefer inode with lowest offset as source for clone*/
+     if (loff > dest_loff)
+         ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
+     else
+         ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
+ }

  if (same_inode)
  unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
-- 
2.6.1

[-- Attachment #2: 0001-btrfs-ioctl.c-Prefer-inode-with-lowest-offset-as-sou.patch --]
[-- Type: text/x-patch, Size: 1215 bytes --]

From 5ed3822bc308c726d91a837fbd97ebacaa51e58d Mon Sep 17 00:00:00 2001
From: Timofey Titovets <nefelim4ag@gmail.com>
Date: Tue, 20 Oct 2015 15:53:20 +0300
Subject: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for
 clone

For performance reason, leave data at the start of disk, is preferable

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/ioctl.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3e3e613..3eb77c0 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src, u64 loff, u64 olen,
 
 	/* pass original length for comparison so we stay within i_size */
 	ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
-	if (ret == 0)
-		ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
+	if (ret == 0) {
+		/* prefer inode with lowest offset as source for clone*/
+		if (loff > dest_loff)
+			ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
+		else
+			ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
+	}
 
 	if (same_inode)
 		unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
-- 
2.6.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone
  2015-10-20 13:29 [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone Timofey Titovets
@ 2015-10-20 14:56 ` Filipe Manana
  2015-10-20 16:19   ` Timofey Titovets
  2015-10-23  0:14 ` Zygo Blaxell
  1 sibling, 1 reply; 5+ messages in thread
From: Filipe Manana @ 2015-10-20 14:56 UTC (permalink / raw)
  To: Timofey Titovets; +Cc: linux-btrfs

On Tue, Oct 20, 2015 at 2:29 PM, Timofey Titovets <nefelim4ag@gmail.com> wrote:
> For performance reason, leave data at the start of disk, is preferable
> while deduping

Have you made any performance tests to verify that?

> It's might sense for the reasons:
> 1. Spinning rust - start of the disk is much faster
> 2. Btrfs can deallocate empty data chunk from the end of fs - ie it's compact fs

I don't see why that makes sense. First the clone/extent_same ioctls
don't copy data, they just update metadata of the destination inode to
point to the same extents as the source inode, secondly, just because
an offset of a file is lower the offset of the other file, it doesn't
mean the physical (on disk) offset of the first file is lower than
that of the other file...

>
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/ioctl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3e3e613..3eb77c0 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src,
> u64 loff, u64 olen,
>
>   /* pass original length for comparison so we stay within i_size */
>   ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
> - if (ret == 0)
> -     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> + if (ret == 0) {
> +     /* prefer inode with lowest offset as source for clone*/
> +     if (loff > dest_loff)
> +         ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
> +     else
> +         ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> + }
>
>   if (same_inode)
>   unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
> --
> 2.6.1



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone
  2015-10-20 14:56 ` Filipe Manana
@ 2015-10-20 16:19   ` Timofey Titovets
  0 siblings, 0 replies; 5+ messages in thread
From: Timofey Titovets @ 2015-10-20 16:19 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

2015-10-20 17:56 GMT+03:00, Filipe Manana <fdmanana@gmail.com>:
> On Tue, Oct 20, 2015 at 2:29 PM, Timofey Titovets <nefelim4ag@gmail.com>
> wrote:
>> For performance reason, leave data at the start of disk, is preferable
>> while deduping
>
> Have you made any performance tests to verify that?


pe, i don't run any performance test, at now

It's like defragmentation, can give a boost in specific cases, and if
assumption, what, beginning sectors of hdd, is faster
(this is true, because count of sectors is not equal, at the beginning
and end of hdd space)

And while, i not shall sure in my code, it's useless

>> It's might sense for the reasons:
>> 1. Spinning rust - start of the disk is much faster
>> 2. Btrfs can deallocate empty data chunk from the end of fs - ie it's
>> compact fs
>
> I don't see why that makes sense. First the clone/extent_same ioctls
> don't copy data, they just update metadata of the destination inode to
> point to the same extents as the source inode,

s true, but as i say before, data at the beggining, of hdd, can be
accessed faster, then data of the end of hdd
I.e. it's give a boost, after dedup, not while processing requests

>secondly, just because
> an offset of a file is lower the offset of the other file, it doesn't
> mean the physical (on disk) offset of the first file is lower than
> that of the other file...

Oh, so sad, then i must go deeper in the code -.-

>>
>> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
>> ---
>>  fs/btrfs/ioctl.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 3e3e613..3eb77c0 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src,
>> u64 loff, u64 olen,
>>
>>   /* pass original length for comparison so we stay within i_size */
>>   ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
>> - if (ret == 0)
>> -     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> + if (ret == 0) {
>> +     /* prefer inode with lowest offset as source for clone*/
>> +     if (loff > dest_loff)
>> +         ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
>> +     else
>> +         ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> + }
>>
>>   if (same_inode)
>>   unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
>> --
>> 2.6.1
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
>


-- 
Have a nice day,
Timofey.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone
  2015-10-20 13:29 [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone Timofey Titovets
  2015-10-20 14:56 ` Filipe Manana
@ 2015-10-23  0:14 ` Zygo Blaxell
  2015-10-23 11:51   ` Timofey Titovets
  1 sibling, 1 reply; 5+ messages in thread
From: Zygo Blaxell @ 2015-10-23  0:14 UTC (permalink / raw)
  To: Timofey Titovets; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On Tue, Oct 20, 2015 at 04:29:46PM +0300, Timofey Titovets wrote:
> For performance reason, leave data at the start of disk, is preferable
> while deduping
> It's might sense for the reasons:
> 1. Spinning rust - start of the disk is much faster
> 2. Btrfs can deallocate empty data chunk from the end of fs - ie it's compact fs

"src" is the extent that is kept, and "dst" is the extent that is
discarded.  When both extents are shared, the dedup userspace has to
pass a common "src" with many different "dst" over several extent-same
calls in order to get rid of all of the references to the "dst" extent.

If "src" and "dst" are arbitrarily swapped over multiple extent-same calls
then it becomes impossible to dedup shared extents.  Heck, if there are
more than two extents even in one extent-same call then it stops working.

It would be possible to have dedup figure out which extent the kernel
picked after the fact, but that's totally unnecessary extra work in
cases where the userspace has a good reason to pick the extents it did
(e.g. administrator hints about future usage of the files where the
extents were found).

Dedup userspace can figure out the physical addresses of the extents
and rearrange the arguments itself if desired.

> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/ioctl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3e3e613..3eb77c0 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src,
> u64 loff, u64 olen,
> 
>   /* pass original length for comparison so we stay within i_size */
>   ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
> - if (ret == 0)
> -     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> + if (ret == 0) {
> +     /* prefer inode with lowest offset as source for clone*/
> +     if (loff > dest_loff)
> +         ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
> +     else
> +         ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> + }
> 
>   if (same_inode)
>   unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
> -- 
> 2.6.1

> From 5ed3822bc308c726d91a837fbd97ebacaa51e58d Mon Sep 17 00:00:00 2001
> From: Timofey Titovets <nefelim4ag@gmail.com>
> Date: Tue, 20 Oct 2015 15:53:20 +0300
> Subject: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for
>  clone
> 
> For performance reason, leave data at the start of disk, is preferable
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/ioctl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3e3e613..3eb77c0 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src, u64 loff, u64 olen,
>  
>  	/* pass original length for comparison so we stay within i_size */
>  	ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
> -	if (ret == 0)
> -		ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> +	if (ret == 0) {
> +		/* prefer inode with lowest offset as source for clone*/
> +		if (loff > dest_loff)
> +			ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
> +		else
> +			ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> +	}
>  
>  	if (same_inode)
>  		unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
> -- 
> 2.6.1
> 


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone
  2015-10-23  0:14 ` Zygo Blaxell
@ 2015-10-23 11:51   ` Timofey Titovets
  0 siblings, 0 replies; 5+ messages in thread
From: Timofey Titovets @ 2015-10-23 11:51 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: linux-btrfs

Zygo, you are right
Thread closed, thanks

2015-10-23 3:14 GMT+03:00 Zygo Blaxell <ce3g8jdj@umail.furryterror.org>:
> On Tue, Oct 20, 2015 at 04:29:46PM +0300, Timofey Titovets wrote:
>> For performance reason, leave data at the start of disk, is preferable
>> while deduping
>> It's might sense for the reasons:
>> 1. Spinning rust - start of the disk is much faster
>> 2. Btrfs can deallocate empty data chunk from the end of fs - ie it's compact fs
>
> "src" is the extent that is kept, and "dst" is the extent that is
> discarded.  When both extents are shared, the dedup userspace has to
> pass a common "src" with many different "dst" over several extent-same
> calls in order to get rid of all of the references to the "dst" extent.
>
> If "src" and "dst" are arbitrarily swapped over multiple extent-same calls
> then it becomes impossible to dedup shared extents.  Heck, if there are
> more than two extents even in one extent-same call then it stops working.
>
> It would be possible to have dedup figure out which extent the kernel
> picked after the fact, but that's totally unnecessary extra work in
> cases where the userspace has a good reason to pick the extents it did
> (e.g. administrator hints about future usage of the files where the
> extents were found).
>
> Dedup userspace can figure out the physical addresses of the extents
> and rearrange the arguments itself if desired.
>
>> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
>> ---
>>  fs/btrfs/ioctl.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 3e3e613..3eb77c0 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src,
>> u64 loff, u64 olen,
>>
>>   /* pass original length for comparison so we stay within i_size */
>>   ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
>> - if (ret == 0)
>> -     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> + if (ret == 0) {
>> +     /* prefer inode with lowest offset as source for clone*/
>> +     if (loff > dest_loff)
>> +         ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
>> +     else
>> +         ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> + }
>>
>>   if (same_inode)
>>   unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
>> --
>> 2.6.1
>
>> From 5ed3822bc308c726d91a837fbd97ebacaa51e58d Mon Sep 17 00:00:00 2001
>> From: Timofey Titovets <nefelim4ag@gmail.com>
>> Date: Tue, 20 Oct 2015 15:53:20 +0300
>> Subject: [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for
>>  clone
>>
>> For performance reason, leave data at the start of disk, is preferable
>>
>> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
>> ---
>>  fs/btrfs/ioctl.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 3e3e613..3eb77c0 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -3074,8 +3074,13 @@ static int btrfs_extent_same(struct inode *src, u64 loff, u64 olen,
>>
>>       /* pass original length for comparison so we stay within i_size */
>>       ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
>> -     if (ret == 0)
>> -             ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> +     if (ret == 0) {
>> +             /* prefer inode with lowest offset as source for clone*/
>> +             if (loff > dest_loff)
>> +                     ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
>> +             else
>> +                     ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
>> +     }
>>
>>       if (same_inode)
>>               unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
>> --
>> 2.6.1
>>
>



-- 
Have a nice day,
Timofey.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-10-23 11:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-20 13:29 [RFC PATCH] btrfs/ioctl.c: Prefer inode with lowest offset as source for clone Timofey Titovets
2015-10-20 14:56 ` Filipe Manana
2015-10-20 16:19   ` Timofey Titovets
2015-10-23  0:14 ` Zygo Blaxell
2015-10-23 11:51   ` Timofey Titovets

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).