From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff King Subject: Re: Keeping unreachable objects in a separate pack instead of loose? Date: Tue, 12 Jun 2012 13:50:46 -0400 Message-ID: <20120612175046.GA16522@sigill.intra.peff.net> References: <20120611183414.GD20134@sigill.intra.peff.net> <20120611211401.GA21775@thunk.org> <20120611213948.GB32061@sigill.intra.peff.net> <20120611221439.GE21775@thunk.org> <20120611222308.GA10476@sigill.intra.peff.net> <20120612171048.GB12706@sigill.intra.peff.net> <20120612173214.GA16014@sigill.intra.peff.net> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Cc: Nicolas Pitre , Ted Ts'o , Thomas Rast , Hallvard B Furuseth , git@vger.kernel.org To: Shawn Pearce X-From: git-owner@vger.kernel.org Tue Jun 12 19:50:58 2012 Return-path: Envelope-to: gcvg-git-2@plane.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1SeVEz-0004L4-JL for gcvg-git-2@plane.gmane.org; Tue, 12 Jun 2012 19:50:53 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752372Ab2FLRut (ORCPT ); Tue, 12 Jun 2012 13:50:49 -0400 Received: from 99-108-225-23.lightspeed.iplsin.sbcglobal.net ([99.108.225.23]:51209 "EHLO peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752335Ab2FLRus (ORCPT ); Tue, 12 Jun 2012 13:50:48 -0400 Received: (qmail 23158 invoked by uid 107); 12 Jun 2012 17:50:50 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) (smtp-auth username relayok, mechanism cram-md5) by peff.net (qpsmtpd/0.84) with ESMTPA; Tue, 12 Jun 2012 13:50:50 -0400 Received: by sigill.intra.peff.net (sSMTP sendmail emulation); Tue, 12 Jun 2012 13:50:46 -0400 Content-Disposition: inline In-Reply-To: Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: On Tue, Jun 12, 2012 at 10:45:22AM -0700, Shawn O. Pearce wrote: > > Then those objects will remain in the cruft pack. Which is why, as I > > said, it is not generally safe to just delete a cruft pack. However, > > when you do a full repack, those objects will be copied into the new > > pack (because they are referenced). Which is why I am claiming that it > > is safe to remove cruft packs at that point. > > But there is a race condition with a concurrent fetch and a concurrent > repack. If that fetch needs those cruft objects, and sees them in the > cruft pack, and the repack sees the references before the fetch, the > repacker might delete things the fetch is about to reference and that > will leave you with a corrupt repository. > > I think we already have this race condition with loose unreachable > objects whose mtimes are older than 2 weeks; they are removed by prune > but may have just become reachable by a concurrent fetch that doesn't > overwrite them because they already exist, and doesn't update the > mtime because they aren't writable. Correct. There is a race condition, but it is there already. I have discussed this with other GitHub folks, because we prune fairly aggressively (in our case it would be a push, not a fetch, of course). So far we have not had any record of it actually happening in practice. We could close it in both cases by tweaking the mtime of the file containing the object when we decide not to write because the object already exists. -Peff