linux-fsdevel.vger.kernel.org archive mirror
From: Mike Fleetwood <mike.fleetwood@googlemail.com>
To: Andrew Martin <amartin@xes-inc.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: Data Integrity Check on EXT Family of Filesystems
Date: Mon, 23 Sep 2013 23:00:09 +0100
Message-ID: <CAMU1PDis0F=BLBCjFYffd7-MaSLkmBPpWANi8yae4W8LeRF5gA@mail.gmail.com>
In-Reply-To: <956c398f-5500-433a-a423-ebb96a20c468@zimbra>

On 23 September 2013 22:08, Andrew Martin <amartin@xes-inc.com> wrote:
>
> Hello,
>
> I am considering writing a tool to perform data integrity checking on filesystems
> which do not support it internally (e.g. ext4). When storing long-term backups,
> I would like to be able to detect bit rot or other corruption to ensure that I
> have intact backups. The method I am considering is to recreate the directory
> structure of the backup directory in a "shadow" directory tree, and then hash
> each of the files in the backup directory and store the hash in the same filename
> in the shadow directory. Then, months later, I can traverse the backup directory,
> taking a hash of each file again and comparing it with the hash stored in the
> shadow directory tree. If the hashes match, then the file's integrity has been
> verified (or at least has not degraded since the shadow directory was created).
>
> Does this seem like a reasonable approach for checking data integrity, or is there
> an existing tool or different method which would be better?
>
> Thanks,
>
> Andrew Martin
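
For reference, the shadow-tree scheme described above could be sketched
roughly as follows. This is only a minimal illustration, not an existing
tool: the function names (hash_file, build_shadow, verify), the choice of
SHA-256, and the one-hex-digest-per-shadow-file layout are all assumptions
made for the example.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_shadow(backup_root, shadow_root):
    """Mirror backup_root's directory layout under shadow_root and store
    each file's digest in a shadow file of the same name (assumed layout)."""
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        rel = os.path.relpath(dirpath, backup_root)
        shadow_dir = os.path.normpath(os.path.join(shadow_root, rel))
        os.makedirs(shadow_dir, exist_ok=True)
        for name in filenames:
            digest = hash_file(os.path.join(dirpath, name))
            with open(os.path.join(shadow_dir, name), "w") as out:
                out.write(digest)


def verify(backup_root, shadow_root):
    """Re-hash every file under backup_root and return the sorted relative
    paths whose digest differs from, or is missing in, the shadow tree."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        rel = os.path.relpath(dirpath, backup_root)
        for name in filenames:
            file_rel = os.path.normpath(os.path.join(rel, name))
            shadow_path = os.path.join(shadow_root, file_rel)
            try:
                with open(shadow_path) as f:
                    expected = f.read().strip()
            except FileNotFoundError:
                problems.append(file_rel)
                continue
            if hash_file(os.path.join(dirpath, name)) != expected:
                problems.append(file_rel)
    return sorted(problems)
```

Note that this sketch only walks the backup tree, so a file that has
vanished from the backup but still has a shadow entry would go unnoticed,
and the shadow tree itself is not protected against corruption.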

Here are a couple of integrity checking tools to consider:
tripwire - http://sourceforge.net/projects/tripwire/
aide - http://aide.sourceforge.net/

I don't use them myself; just providing options.

Thanks,
Mike




Thread overview: 5+ messages
2013-09-23 21:08 ` Data Integrity Check on EXT Family of Filesystems Andrew Martin
2013-09-23 22:00   ` Mike Fleetwood [this message]
2013-09-24 14:57     ` Theodore Ts'o
2013-09-24 17:31       ` Zach Brown
2013-09-26 19:22       ` Andrew Martin
