From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Really slow 'git gc'
Date: Thu, 19 Feb 2009 13:25:30 -0800 (PST)
Message-ID: <alpine.LFD.2.00.0902191318310.21686@localhost.localdomain>
In-Reply-To: <7vr61uku2f.fsf@gitster.siamese.dyndns.org>



On Thu, 19 Feb 2009, Junio C Hamano wrote:
> 
> I think we can add a single bit to "struct packed_git" and in the middle
> of setup_revisions() perform the O(N**2) once, so that find_pack_entry()
> can check the bit without looping.

Yes. However, most users call find_pack_entry() with a NULL ignore_packed, 
so we'd still have to pass in that flag to say whether we should look at 
the bit or not.
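
As an illustration of the cached-bit idea and of why the extra flag is still needed, here is a minimal sketch. It is not code from git: `struct sketch_pack`, `mark_ignored_packs()` and `first_usable_pack()` are hypothetical, simplified stand-ins for "struct packed_git" and find_pack_entry().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for git's "struct packed_git". */
struct sketch_pack {
	struct sketch_pack *next;
	const char *name;
	unsigned ignored : 1;	/* cached result of the name comparison */
};

/*
 * Do the O(packs * names) string comparison once, up front, and
 * remember the answer in each pack's "ignored" bit.
 * "ignore_names" is a NULL-terminated array, or NULL for "no list".
 */
void mark_ignored_packs(struct sketch_pack *packs, const char **ignore_names)
{
	struct sketch_pack *p;
	const char **n;

	for (p = packs; p; p = p->next) {
		assert(p->name != NULL);
		p->ignored = 0;
		for (n = ignore_names; n && *n; n++) {
			if (!strcmp(p->name, *n)) {
				p->ignored = 1;
				break;
			}
		}
	}
}

/*
 * Return the first pack the caller is allowed to look at.
 * Callers without an ignore list pass honor_ignored = 0 and see
 * every pack, so the flag is needed even with the cached bit.
 */
struct sketch_pack *first_usable_pack(struct sketch_pack *packs,
				      int honor_ignored)
{
	struct sketch_pack *p;

	for (p = packs; p; p = p->next) {
		if (honor_ignored && p->ignored)
			continue;
		return p;
	}
	return NULL;
}
```

With this shape the per-lookup cost is a single bit test per pack instead of a walk over the ignore list.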

However, the real issue (?) I think is that the whole "ignore_packed" 
logic is crazy. It's the wrong way around. The whole thing is broken.

Rather than marking which ones are "unpacked" and should be ignored, it 
should just look at the ones to keep. That's how the filesystem layout 
works and that's what "git repack" does anyway.

So I think we should just remove the whole "--unpacked=" thing, and 
instead replace it with a "--keep" flag, so that we only find things in 
the keep packs if we have a list of them.

That would (a) make the logic a whole lot easier to follow and (b) get rid 
of the scalability issue, since you're not really supposed to have more 
than one or two .keep files anyway (if that).
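
A rough sketch of what "only look at the ones to keep" could mean for a lookup, under stated assumptions: `struct keepable_pack`, `pack_has_object()` and `find_object_pack()` are hypothetical names, object ids are plain ints, and the `keep` bit is assumed to have been set when the pack's .keep file was seen. It is not git's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified pack: a keep bit plus a list of object ids. */
struct keepable_pack {
	struct keepable_pack *next;
	unsigned keep : 1;	/* assumed set when a .keep file exists */
	const int *objects;
	size_t nr_objects;
};

/* Linear scan; a real pack would use its index instead. */
int pack_has_object(const struct keepable_pack *p, int id)
{
	size_t i;

	assert(p->objects != NULL || p->nr_objects == 0);
	for (i = 0; i < p->nr_objects; i++)
		if (p->objects[i] == id)
			return 1;
	return 0;
}

/*
 * Find the pack holding "id".  With kept_only set, packs without
 * the keep bit are skipped by a single bit test, so no per-lookup
 * list of pack names has to be walked.
 */
const struct keepable_pack *find_object_pack(const struct keepable_pack *packs,
					     int id, int kept_only)
{
	const struct keepable_pack *p;

	for (p = packs; p; p = p->next) {
		if (kept_only && !p->keep)
			continue;
		if (pack_has_object(p, id))
			return p;
	}
	return NULL;
}
```

The positive test ("is this a keep pack?") replaces the negative one ("is this pack on the ignore list?"), which is what removes the dependence on the number of packs being ignored.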

Nobody uses "--unpacked=xyzzy" by hand anyway. The only thing that 
generates those things is git-repack.sh, so this is not a compatibility 
issue, I suspect.

			Linus
