From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Garzik Subject: Re: [Patch 4/4] chunkd: add self-checking Date: Tue, 12 Jan 2010 09:21:24 -0500 Message-ID: <4B4C8564.7030303@garzik.org> References: <20091227165955.3ed0845a@redhat.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=HERgQNFqnWw2tyjr0Z1zXozXUoSd/WunaQsrY5ogRQc=; b=M9KZ998zJKo8DV87id4Q0HIndiqyAkIW2w7QRRm3irQ780eWV3SO1JA/KDLly0AulP FKw9xGcKFJdoG1ckt7tMJPenXk13aIMnOzQlF3j9T8m3pP9CQRF4hizS3t0R3ynK0POE jvYBn/PAQxIFRFCbB2WnLhuSSoY6Efmw1Avbs= In-Reply-To: <20091227165955.3ed0845a@redhat.com> Sender: hail-devel-owner@vger.kernel.org List-ID: Content-Type: text/plain; charset="us-ascii"; format="flowed" To: Pete Zaitcev Cc: Project Hail List On 12/27/2009 06:59 PM, Pete Zaitcev wrote: > This patch adds a background process that periodically verifies the > integrity of all known objects. Objects that are found found faulty > are made invisible to applications. This way, only known good objects > remain visible and applications may implement integrity checking that > is run asynchronously with the self-checking. The intent is to let > the performance of checking to scale with the number of objects and > amount of data stored in the cell. > > Signed-off-by: Pete Zaitcev > > --- > server/Makefile.am | 2 > server/be-fs.c | 126 +++++++++++++++- > server/chunkd.h | 8 + > server/config.c | 28 +++ > server/selfcheck.c | 208 ++++++++++++++++++++++++++ > server/server.c | 2 > test/.gitignore | 1 > test/Makefile.am | 4 > test/selfcheck-unit.c | 313 ++++++++++++++++++++++++++++++++++++++++ > test/server-test.cfg | 2 > test/test.h | 2 > 11 files changed, 693 insertions(+), 3 deletions(-) OK. The objcache (patches 3 & 5) look good. I was thinking that the "table of open files" was better suited for server/be-fs.c, thus keeping the backend / frontend separation a bit more distinct. However, this is a matter of taste, and I can also see advantages in how you implemented it. So, no changes necessary, we'll how that works out. The reason why I think about backend / frontend separation, when we only have one storage backend, "the filesystem", is that I see a lot of merit in a second type of backend: single-file (aka raw block device mode). I don't want to stray too far from being able to implement a second type of storage backend. On the flip side, of course, I don't want imaginary future ideas to constraint us in the present. On to this patch, self-checking. Regarding this patch's implementation, the code appears to be correct. However, I wanted to discuss the design a little bit. I am definitely concerned that the current implementation does not really scale at all with the datastore. With this change, the admin would have to guess at the self-check period, with not much feedback from the system about how long a scan takes, when a scan begins, or when a scan ends. I also think a sleep(n) self_check() algorithm seems less useful to the average admin than a slightly more complex one that solves the problem defined as "guarantee an object is checked at least every N days." Because, as the dataset grows beyond a test database measures in megabytes, the dataset scan time will dwarf the self-check sleep period. The behavior becomes one of constant scanning, with a tiny period of rest in between. It seems like either (a) tracking the total dataset size and total dataset scan time, or (b) tracking and modifying per-object last-self-check times, would be needed. Also of relevance to admins is scan start time. This patch would have a scan begin at essentially random times throughout the day or night, in particular occurring during the most heavily-trafficked portion of the day. An algorithm that occurs on hour N every day (or hour N, day M of each week) is much more predictable, regardless of dataset size. That would be pretty easy: while (1) calc time until next desired interval (example: day 0, hour 5 (every sunday at 5am)) timer_init() timer_add(scan_timer) scan_timer() if (scan thread exists) log ("still scanning") return start scan thread scan_thread() dbscan() end thread (do not loop) Another option is to add an administrative command "START SCAN", and permit an external utility, scheduled by cron or executed by "make check", to be the entity that starts the scan thread in the background. That would permit maximum flexibility for both the admin and our testsuite. Comments on all this? I will happily apply the objcache stuff once all the self-checking details are settled. Jeff