From mboxrd@z Thu Jan 1 00:00:00 1970
From: "J."
Subject: Re: file deletion
Date: Mon, 27 Dec 2004 04:50:15 +0100 (CET)
Message-ID:
References: <003f01c4e9ba$7679d020$316c4ed5@j0s6l8>
Reply-To: linux-c-programming@vger.kernel.org
Mime-Version: 1.0
Return-path:
In-Reply-To: <003f01c4e9ba$7679d020$316c4ed5@j0s6l8>
Sender: linux-c-programming-owner@vger.kernel.org
List-Id:
Content-Type: TEXT/PLAIN; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-c-programming@vger.kernel.org

On Fri, 24 Dec 2004, Andy wrote:

> Does anybody know of any useful code or c commands that
> you can use to search for duplicate files within a linux/unix directory and
> its subdirectories and remove them?
> Thanks
> Andrew

Ehm.. Personally no.. However there is ftw, opendir, readdir, stat,
fstat... etc..

Once you are able to access all the directory entries, you have to decide
how you want to compare them, e.g. md5, crc32, only size or name etc...
Then there is the issue of choosing an optimal ADT and access/retrieval
algo.

If you don't have to code the program in C but are just looking for a
solution, I would most certainly go for a:

`find -type f -exec md5sum '{}' \; >> md5.log`

and parse the md5.log with a simple shell or awk script. That would save
many headaches.. Plus you don't have to reinvent `find`...

J.

-- 
http://www.rdrs.net
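
For anyone who does want the C route mentioned above, here is a minimal
sketch, not a finished tool. It assumes nftw() for the directory walk, a
plain growable array sorted by size as the ADT, and a byte-by-byte
comparison of same-sized files in place of md5. The names `struct entry`,
`collect`, `same_contents` and `report_duplicates` are made up for
illustration and do not come from the original mail.

```c
#define _XOPEN_SOURCE 700   /* for nftw() and strdup() */
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

struct entry { off_t size; char *path; };

/* nftw() has no user-data argument, so the list lives in file-scope variables. */
static struct entry *entries;
static size_t n_entries, cap_entries;

/* nftw() callback: remember size and path of every regular file. */
static int collect(const char *path, const struct stat *sb, int type, struct FTW *ftwbuf)
{
    (void)ftwbuf;
    if (type != FTW_F || !S_ISREG(sb->st_mode))
        return 0;                       /* skip directories, symlinks, devices */
    if (n_entries == cap_entries) {
        size_t ncap = cap_entries ? cap_entries * 2 : 64;
        struct entry *tmp = realloc(entries, ncap * sizeof *tmp);
        if (tmp == NULL)
            return -1;                  /* a non-zero return stops the walk */
        entries = tmp;
        cap_entries = ncap;
    }
    if ((entries[n_entries].path = strdup(path)) == NULL)
        return -1;
    entries[n_entries++].size = sb->st_size;
    return 0;
}

static int by_size(const void *a, const void *b)
{
    off_t x = ((const struct entry *)a)->size, y = ((const struct entry *)b)->size;
    return (x > y) - (x < y);
}

static void free_entries(void)
{
    for (size_t i = 0; i < n_entries; i++)
        free(entries[i].path);
    free(entries);
    entries = NULL;
    n_entries = cap_entries = 0;
}

/* Returns 1 if both files have identical bytes, 0 if they differ, -1 on error. */
int same_contents(const char *p, const char *q)
{
    FILE *f = fopen(p, "rb"), *g = fopen(q, "rb");
    int result = -1;
    if (f != NULL && g != NULL) {
        int a, b;
        do { a = getc(f); b = getc(g); } while (a == b && a != EOF);
        if (!ferror(f) && !ferror(g))
            result = (a == b);
    }
    if (f != NULL) fclose(f);
    if (g != NULL) fclose(g);
    return result;
}

/* Walks root, prints "X == Y" for every file X identical to an earlier file Y,
 * and returns the number of such files, or -1 if the walk failed.
 * Files that cannot be read are treated as non-duplicates. */
int report_duplicates(const char *root, FILE *out)
{
    int dups = 0;
    assert(root != NULL && out != NULL);
    if (nftw(root, collect, 16, FTW_PHYS) != 0) {
        free_entries();
        return -1;
    }
    if (n_entries > 1)
        qsort(entries, n_entries, sizeof *entries, by_size);
    for (size_t i = 1; i < n_entries; i++) {
        /* only files of equal size can match; they are adjacent after sorting */
        for (size_t j = i; j-- > 0 && entries[j].size == entries[i].size; ) {
            if (same_contents(entries[j].path, entries[i].path) == 1) {
                fprintf(out, "%s == %s\n", entries[i].path, entries[j].path);
                dups++;
                break;
            }
        }
    }
    free_entries();
    return dups;
}
```

Calling `report_duplicates(".", stdout)` from a small main() lists the
candidates. Note that two hard links to the same file are reported as
duplicates as well. Actually removing anything (e.g. with unlink()) is
deliberately left out, since it seems safer to look at the list first.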