From: Linus Torvalds <torvalds@osdl.org>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: git packing leaves unpacked files
Date: Tue, 26 Sep 2006 11:28:39 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0609261116170.3952@g5.osdl.org>
In-Reply-To: <45196BC8.8060608@shadowen.org>



On Tue, 26 Sep 2006, Andy Whitcroft wrote:
>
> I was just looking at my kernel repository and noticed that even after a
> git repack -a -d I have some loose files.  A quick look at repack
> doesn't seem to explain why some are either not packed or are kept unpacked.
> 
> Is this something I should be expecting?

Depending on what you're doing, yes.

You can often get a hint of what is going on by just running 
"git-fsck-objects" and seeing the "dangling" objects - objects that exist, 
but are not reachable.
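
As a rough illustration of reading that output: the checker prints one line per dangling object, of the form "dangling <type> <sha1>" (in current git the command is spelled "git fsck"). The sketch below groups such lines by object type. The function name group_dangling and the sample ids are made up for the example, and it assumes you have already captured the command's output as text.

```python
from collections import defaultdict
from typing import Dict, List


def group_dangling(fsck_output: str) -> Dict[str, List[str]]:
    """Group object ids from 'dangling <type> <id>' lines by object type."""
    groups = defaultdict(list)
    for line in fsck_output.splitlines():
        parts = line.split()
        # Skip anything that is not exactly 'dangling <type> <id>'.
        if len(parts) == 3 and parts[0] == "dangling":
            _, obj_type, obj_id = parts
            groups[obj_type].append(obj_id)
    return dict(groups)


# Made-up sample text; the ids are placeholders, not real objects.
sample = """\
dangling blob 1111111111111111111111111111111111111111
dangling commit 2222222222222222222222222222222222222222
dangling blob 3333333333333333333333333333333333333333
"""
summary = group_dangling(sample)
```

Dangling blobs usually point at half-staged or merge left-over content, while dangling commits usually point at rewritten history, so splitting by type is a reasonable first step before inspecting individual objects.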

There are a few things that cause dangling objects quite normally:

 - If you use "git update-index" to update the index half-way, and then do 
   more work, and use "git update-index" again (or commit), then the 
   half-way work will be visible in the form of dangling blobs. You can 
   just do a "git cat-file -p <blobname>" and see it, and maybe you'll 
   recognize that it was something you were about to commit, but never 
   did, because you did further development.

 - if you ever rebase any branch in the project, or do "git reset" to set 
   it to some old point, or delete a branch, dangling commits are very 
   much to be expected.

 - Even if _you_ didn't rebase anything, if the project you track rebases 
   itself, you'll get dangling objects because you had commits that became 
   unreachable when they were replaced by new history.

   My kernel tree doesn't do that, but some other ones occasionally do, 
   and git itself (in the "pu" branch) obviously does all the time.

   This is often the most common reason, especially if you follow 
   Junio's git tree.

   The most common sign of this is that there's a few dangling commits, 
   and when you use gitk to examine them, you see old valid commits that 
   just aren't reachable any more.

 - if you do any merges at all, and they've conflicted or they have had 
   more than one parent and the recursive merger has generated an 
   intermediate version of the tree, you'll have the merge process leave 
   the objects of those intermediate merges around as dangling left-overs 
   that aren't actually reachable from the end result of the merge.

   The most common form of this is that you see a few pending "blob"s, and 
   when you do "git cat-file -p <sha1> | less -S" on the blob-file, you'll 
   generally find a conflict marker in it (ie the "<<<<" "====" ">>>>" 
   things that a three-way merge leaves behind). You might also have a 
   whole dangling tree due to this.

 - if you use the rsync:// protocol, you'll often end up getting objects 
   that aren't reachable from the heads _you_ have, because you got the 
   whole object database from somebody else that had other heads (or, you 
   might get the dangling objects that they had due to any of the reasons 
   above).

   The rsync:// protocol simply doesn't do any git-level reachability 
   analysis, so it just gets everything, regardless.
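
All of the cases above come down to the same reachability question, which can be sketched on a toy object graph. This is only a model under stated assumptions, not how git implements it: objects are plain strings, points_to is a hypothetical map from each object to the objects it references (a commit to its parents, say), and "dangling" is taken to mean unreachable from every head and not referenced by any other object.

```python
def reachable_from(heads, points_to):
    """Return every object id reachable from the given heads."""
    seen = set()
    stack = list(heads)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(points_to.get(obj, ()))
    return seen


def unreachable(heads, points_to):
    """Objects that exist in the database but cannot be reached from any head."""
    return set(points_to) - reachable_from(heads, points_to)


def dangling(heads, points_to):
    """Unreachable objects that no other object points to."""
    referenced = {target for targets in points_to.values() for target in targets}
    return unreachable(heads, points_to) - referenced


# Toy history after a rebase: c1 and c2 were rewritten as c1r and c2r on top
# of c0, and the branch head now points at c2r. The old commits still exist.
points_to = {
    "c0": [],
    "c1": ["c0"],
    "c2": ["c1"],
    "c1r": ["c0"],
    "c2r": ["c1r"],
}
heads = ["c2r"]

left_over = unreachable(heads, points_to)
tips = dangling(heads, points_to)
```

Here c1 and c2 are both left over, but only c2 is the dangling tip, since c1 is still referenced by c2. The rsync:// case is the same computation with somebody else's objects in the database and only your own heads as starting points.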

Hmm. Those are the main reasons I can think of. There may be other cases, 
but I think these are the main ones, and I think any other cases end up 
being just variations on the same kind of theme.

			Linus
