From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mike Fleetwood
Subject: Re: Data Integrity Check on EXT Family of Filesystems
Date: Mon, 23 Sep 2013 23:00:09 +0100
Message-ID:
References: <5856b37a-3b66-4404-a6f7-3c120b14ae95@zimbra>
 <956c398f-5500-433a-a423-ebb96a20c468@zimbra>
In-Reply-To: <956c398f-5500-433a-a423-ebb96a20c468@zimbra>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
To: Andrew Martin
Cc: linux-fsdevel@vger.kernel.org
List-ID: linux-fsdevel.vger.kernel.org

On 23 September 2013 22:08, Andrew Martin wrote:
> Hello,
>
> I am considering writing a tool to perform data integrity checking on
> filesystems which do not support it internally (e.g. ext4). When storing
> long-term backups, I would like to be able to detect bit rot or other
> corruption to ensure that I have intact backups. The method I am
> considering is to recreate the directory structure of the backup
> directory in a "shadow" directory tree, and then hash each of the files
> in the backup directory and store the hash in the same filename in the
> shadow directory. Then, months later, I can traverse the backup
> directory, taking a hash of each file again and comparing it with the
> hash stored in the shadow directory tree. If the hashes match, then the
> file's integrity has been verified (or at least has not degraded since
> the shadow directory was created).
>
> Does this seem like a reasonable approach for checking data integrity,
> or is there an existing tool or different method which would be better?
> Thanks,
>
> Andrew Martin

Here are a couple of integrity-checking tools to consider:
tripwire - http://sourceforge.net/projects/tripwire/
aide - http://aide.sourceforge.net/

I don't use them myself; just providing options.

Thanks,
Mike
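For what it's worth, the shadow-tree scheme described above can be sketched in a few lines of shell. Everything here is an illustrative assumption, not anything from this thread: the paths, the tiny demo files, and the choice of sha256sum as the hash tool.

```shell
#!/bin/sh
# Sketch of the shadow-tree idea: mirror the backup tree and store one
# hash file per backed-up file. Paths and demo files are assumptions.
set -eu

BACKUP=/tmp/backup            # long-term backup tree (assumed path)
SHADOW=/tmp/backup.shadow     # parallel "shadow" hash tree (assumed path)

# Demo setup so the script is self-contained.
rm -rf "$BACKUP" "$SHADOW"
mkdir -p "$BACKUP/docs"
printf 'hello\n' > "$BACKUP/docs/a.txt"
printf 'world\n' > "$BACKUP/b.txt"

# Create phase: recreate the directory structure and hash every file.
# Hash files record paths relative to $BACKUP, so verification can use
# "sha256sum --check" from inside the backup tree.
( cd "$BACKUP"
  find . -type f | while IFS= read -r f; do
      mkdir -p "$SHADOW/$(dirname "$f")"
      sha256sum "$f" > "$SHADOW/$f.sha256"
  done )

# Verify phase, months later: re-hash and compare against the shadow
# tree. sha256sum --check exits non-zero and prints FAILED on mismatch.
( cd "$BACKUP"
  find "$SHADOW" -name '*.sha256' | while IFS= read -r h; do
      sha256sum --check --quiet "$h"
  done )
echo "all files verified"
```

Because of set -e, a single corrupted file aborts the verify phase with a FAILED line. Filenames containing newlines would need find -print0 handling, omitted here for brevity. Note this only detects silent corruption in place; it can't tell bit rot apart from a legitimate modification of the backup.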