linux-btrfs.vger.kernel.org archive mirror
* Comparing snapshots?
@ 2011-02-25  9:59 Arvin Schnell
  2011-02-25 19:32 ` João Eduardo Luís
  2011-02-25 20:03 ` Goffredo Baroncelli
  0 siblings, 2 replies; 5+ messages in thread
From: Arvin Schnell @ 2011-02-25  9:59 UTC (permalink / raw)
  To: linux-btrfs

Hi,

for a backup program I have to find all differing files
(including metadata) in two snapshots taken from the same
subvolume.

Having looked at the find-new command I thought about this
process:

1. Get the two transids when the two snapshots were created.

2. Query modifications to the original subvolume between the two
   transids.

Is the general process correct, or have I overlooked something?

AFAIS the btrfs tool does not provide the required
information/commands. Would it be possible to add those?

Thanks in advance,
  Arvin

-- 
Arvin Schnell, <aschnell@suse.de>
Senior Software Engineer, Research & Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Comparing snapshots?
  2011-02-25  9:59 Comparing snapshots? Arvin Schnell
@ 2011-02-25 19:32 ` João Eduardo Luís
  2011-02-25 20:08   ` Goffredo Baroncelli
  2011-02-25 20:03 ` Goffredo Baroncelli
  1 sibling, 1 reply; 5+ messages in thread
From: João Eduardo Luís @ 2011-02-25 19:32 UTC (permalink / raw)
  To: Arvin Schnell; +Cc: linux-btrfs


Hello,

Please note that my experience with btrfs is both recent and, above all, very limited. However, I've been wondering about the same issue for a different purpose, and your question intrigues me.

I may be off-base here, but I think that wouldn't be trivial to achieve.

Even if one were able to diff the metadata changes between the two snapshots, the problem of finding the changed data would remain. It would be possible to check for changed extents, at least by comparing extent checksums, but I don't think it would be trivial to discover where exactly the extent was modified.

I would recommend using the generation fields wherever applicable, but I believe these are private to each subvolume/snapshot.


Anyway, I wonder whether keeping a data structure (I would go with a tree) within the file system, containing metadata about the changed files, could be a plausible solution, but I'm in no position (btrfs-knowledge-wise) to make such a statement.


Cheers.

---
João Eduardo Luís
gpg key: 477C26E5 from pool.keyserver.eu 









* Re: Comparing snapshots?
  2011-02-25  9:59 Comparing snapshots? Arvin Schnell
  2011-02-25 19:32 ` João Eduardo Luís
@ 2011-02-25 20:03 ` Goffredo Baroncelli
  1 sibling, 0 replies; 5+ messages in thread
From: Goffredo Baroncelli @ 2011-02-25 20:03 UTC (permalink / raw)
  To: Arvin Schnell; +Cc: linux-btrfs

On 02/25/2011 10:59 AM, Arvin Schnell wrote:
> Hi,
> 
> for a backup program I have to find all differing files
> (including metadata) in two snapshots taken from the same
> subvolume.
> 
> Having looked at the find-new command I thought about this
> process:
> 
> 1. Get the two transids when the two snapshots were created.
> 
> 2. Query modifications to the original subvolume between the two
>    transids.
> 
> Is the general process correct, or have I overlooked something?

I suppose you are thinking of something like:

- record the last trans-id (trans-id1)
- update the file system
- [...]
- record the last trans-id (trans-id2)
- update the file system
- [...]
- back up all the objects whose trans-id is in the range (trans-id1, trans-id2]

This may miss two kinds of "operations":
1) a file deletion
2) a file changed twice, first after the first "snapshot" and
again after the second snapshot.

In the first case you would not be able to find any key update between
the two trans-ids, because the keys simply no longer exist.

In the second case the trans-id associated with the object is later than
trans-id2, so the object falls outside the range.

To solve point two you must change "Query modifications to the
original subvolume" into "Query modifications to the second snapshot".
This means that the second snapshot must exist (knowing its trans-id
is not sufficient).
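
A toy model may make the two missed cases concrete. This is only a sketch in Python, not btrfs code: the `Subvolume` class, the `changed_between` helper, the file names and the trans-id values are all made up for illustration, and a "snapshot" is modelled as a frozen copy of a path-to-trans-id mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Subvolume:
    """Toy model (hypothetical): path -> trans-id of the last modification."""
    files: dict = field(default_factory=dict)

    def write(self, path, transid):
        self.files[path] = transid

    def delete(self, path):
        self.files.pop(path, None)

    def snapshot(self):
        # A snapshot freezes the current path -> trans-id mapping.
        return Subvolume(dict(self.files))


def changed_between(tree, transid1, transid2):
    """Paths whose last-modification trans-id lies in (transid1, transid2]."""
    return sorted(p for p, t in tree.files.items() if transid1 < t <= transid2)


live = Subvolume()
live.write("a.txt", 5)
live.write("b.txt", 6)
live.write("c.txt", 7)
snap1, transid1 = live.snapshot(), 10

live.write("a.txt", 12)   # changed after the first snapshot
live.delete("b.txt")      # deleted after the first snapshot
snap2, transid2 = live.snapshot(), 20

live.write("a.txt", 25)   # changed again after the second snapshot

# Querying the live subvolume misses a.txt (its trans-id is now 25 > 20)
# and cannot see the deletion of b.txt at all.
from_live = changed_between(live, transid1, transid2)

# Querying the second snapshot finds a.txt; the deletion is still invisible.
from_snap2 = changed_between(snap2, transid1, transid2)
```

In this model the query against the live subvolume returns nothing, while the same query against the second snapshot reports `a.txt`. Neither query reports `b.txt`, which is why point one needs separate handling.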

To solve point one, you need to:
a) track changes not only to files but also to directories (if
you remove a file, the timestamp of the directory inode is updated).

b) compare the updated directories with the original ones. This means
that the first snapshot must exist (knowing its trans-id is not
sufficient).
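
Step b) can be sketched in user space, assuming both snapshots are mounted as ordinary directories and the list of updated directories is already known from step a). The function name `deleted_entries` and the demo layout (`snap1`, `snap2`, `docs`) are hypothetical; this is plain directory listing in Python, not a btrfs interface.

```python
import os
import tempfile


def deleted_entries(old_snapshot, new_snapshot, changed_dirs):
    """For each updated directory, list names present in the old snapshot
    but missing from the new one (paths are relative to the snapshot root)."""
    deleted = []
    for rel in changed_dirs:
        old_dir = os.path.join(old_snapshot, rel)
        new_dir = os.path.join(new_snapshot, rel)
        old_names = set(os.listdir(old_dir)) if os.path.isdir(old_dir) else set()
        new_names = set(os.listdir(new_dir)) if os.path.isdir(new_dir) else set()
        for name in sorted(old_names - new_names):
            deleted.append(os.path.normpath(os.path.join(rel, name)))
    return deleted


# Demo with two throw-away directory trees standing in for the snapshots.
with tempfile.TemporaryDirectory() as tmp:
    snap1 = os.path.join(tmp, "snap1")
    snap2 = os.path.join(tmp, "snap2")
    os.makedirs(os.path.join(snap1, "docs"))
    os.makedirs(os.path.join(snap2, "docs"))
    for snap, names in ((snap1, ["a.txt", "b.txt"]), (snap2, ["a.txt"])):
        for name in names:
            open(os.path.join(snap, "docs", name), "w").close()
    deleted = deleted_entries(snap1, snap2, ["docs"])
```

Only the directories flagged as updated are listed, so the cost follows the number of changed directories rather than the size of the whole tree.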

I have to point out that for backup purposes it would be sufficient to
track the changed files (and not the deleted ones).

I started to develop a tool to compare two snapshots, but I stopped
when I discovered that the ioctl BTRFS_IOC_TREE_SEARCH was not robust
enough for that: when I tried to find the changed inodes, attributes,
extended attributes... I discovered that it doesn't work well in some
corner cases [*].

I even tried to propose a patch to mitigate the problem, but at the time
the development effort was (and still is) oriented toward other issues,
and the patch was not merged.

However, if you want to start developing something, I can go deeper into
the problem.


[*] see the thread "Bug in the design of the tree search ioctl API ?",
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg07523.html

> AFAIS the btrfs tool does not provide the required
> information/commands. Would it be possible to add those?
> 
> Thanks in advance,
>   Arvin
> 



* Re: Comparing snapshots?
  2011-02-25 19:32 ` João Eduardo Luís
@ 2011-02-25 20:08   ` Goffredo Baroncelli
       [not found]     ` <4977C273-8274-4C10-A19A-B766E224A20C@gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Goffredo Baroncelli @ 2011-02-25 20:08 UTC (permalink / raw)
  To: João Eduardo Luís; +Cc: linux-btrfs

On 02/25/2011 08:32 PM, João Eduardo Luís wrote:
> Hello,
>
> Please note that my experience with btrfs is both recent and, above
> all, very small. However, I've been wondering about the same issue
> for a different purpose and your question intrigues me.
>
> However, and I may be off-base here, I think that wouldn't be trivial
> to achieve.
>
> Even if one would be able to differ the metadata changes between both
> snapshots, the problem would still be present regarding finding the
> changed data. It would be possible to check for changed extents, at
> least by comparing extent checksums, but I don't think it would be
> trivial to discover where (exactly) the extent was modified.

Look at the find-new command. It also returns which part of the file has
changed. I don't remember the details very well, but the data is also
stored in a tree, like the metadata. Using the same strategy of
comparing the keys and revid reveals which part of the file has
changed with minimal effort (no checksum comparison is needed).
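
As a rough illustration, the per-extent lines printed by "btrfs subvolume find-new" could be turned into changed byte ranges per file. The sketch below is Python and rests on an assumption: the line format in the regular expressions is written from memory of the btrfs-progs output and should be checked against the installed version. The sample text is hypothetical, not captured from a real file system, and `parse_find_new` is a made-up helper name.

```python
import re

# Assumed line formats (verify against your btrfs-progs version).
EXTENT_RE = re.compile(
    r"^inode (?P<inode>\d+) file offset (?P<offset>\d+) len (?P<length>\d+) "
    r"disk start \d+ offset \d+ gen (?P<gen>\d+) flags (?P<flags>\S+) (?P<path>.*)$"
)
MARKER_RE = re.compile(r"^transid marker was (?P<transid>\d+)$")


def parse_find_new(output):
    """Return (changed, marker): changed maps path -> [(offset, length), ...],
    marker is the trans-id reported on the last line (or None)."""
    changed, marker = {}, None
    for line in output.splitlines():
        m = EXTENT_RE.match(line)
        if m:
            ranges = changed.setdefault(m["path"], [])
            ranges.append((int(m["offset"]), int(m["length"])))
            continue
        m = MARKER_RE.match(line)
        if m:
            marker = int(m["transid"])
    return changed, marker


# Hypothetical sample output, for illustration only.
sample = """\
inode 257 file offset 0 len 4096 disk start 12582912 offset 0 gen 12 flags NONE docs/a.txt
inode 257 file offset 8192 len 4096 disk start 12587008 offset 0 gen 12 flags NONE docs/a.txt
transid marker was 20
"""
changed, marker = parse_find_new(sample)
```

With the sample above, `changed` holds two 4096-byte ranges for `docs/a.txt`, and `marker` is the trans-id that could be passed as the starting generation of the next run.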

>
> I would recommend using the generation fields, whenever applicable,
> but I believe these are private to each subvolume/snapshot.
>
>
> Anyway, I wonder if keeping a data structure (I would go with a tree)
> containing metadata regarding the changed files, within the file
> system, could be a plausible solution, but I'm in no condition
> (btrfs-knowledge-wise) to make such statement.
>
>
> Cheers.
>
> --- João Eduardo Luís gpg key: 477C26E5 from pool.keyserver.eu
>



* Fwd: Comparing snapshots?
       [not found]     ` <4977C273-8274-4C10-A19A-B766E224A20C@gmail.com>
@ 2011-02-25 23:26       ` Joao Luis
  0 siblings, 0 replies; 5+ messages in thread
From: Joao Luis @ 2011-02-25 23:26 UTC (permalink / raw)
  To: linux-btrfs

I had rich text enabled by default, and the ml bounced back the email.
Apparently, HTML equals spam and/or virus. :-)

Here goes the plain-text version.


---------- Forwarded message ----------
From: João Eduardo Luís <jecluis@gmail.com>
Date: 2011/2/25
Subject: Re: Comparing snapshots?
To: kreijack@inwind.it
Cc: linux-btrfs@vger.kernel.org


> On Feb 25, 2011, at 8:08 PM, Goffredo Baroncelli wrote:
>> On 02/25/2011 08:32 PM, João Eduardo Luís wrote:
>>
>> Hello,
>>
>> Please note that my experience with btrfs is both recent and, above
>> all, very small. However, I've been wondering about the same issue
>> for a different purpose and your question intrigues me.
>>
>> However, and I may be off-base here, I think that wouldn't be trivial
>> to achieve.
>>
>> Even if one would be able to differ the metadata changes between both
>> snapshots, the problem would still be present regarding finding the
>> changed data. It would be possible to check for changed extents, at
>> least by comparing extent checksums, but I don't think it would be
>> trivial to discover where (exactly) the extent was modified.
>
> Look at the find-new command. It returns also which part of the file is
> changed. I don't remember very well the details, but also the data is
> stored in a tree like the metadata. Using the same strategies of
> comparing the keys and revid leads to discover which part of the file is
> changed, with minimum effort (no checksums comparing is needed).


You are right. I just took a peek at the code, and it seems the
generation id (which IIRC is the same as the id of the last modifying
transaction) is shared file-system-wide, instead of being snapshot- or
subvolume-specific.
I should have checked the code before replying.

Cheers.
---
João Eduardo Luís
gpg key: 477C26E5 from pool.keyserver.eu

