git.vger.kernel.org archive mirror
* getting list of objects for packing
@ 2008-10-31 19:32 Brandon Casey
  2008-10-31 20:40 ` Nicolas Pitre
From: Brandon Casey @ 2008-10-31 19:32 UTC (permalink / raw)
  To: Git Mailing List


I'm trying to write a script that will repack large binary or compressed
objects into their own non-compressed, non-delta'ed pack file.

To make the decision about whether an object should go into this special
pack file or not, I want the output from 'git cat-file --batch-check'.
I get it with something similar to:

   git rev-list --objects --all |
      sed -e 's/^\([0-9a-f]\{40\}\).*/\1/' |
      git cat-file --batch-check

First question: Is the rev-list call correct?
  -If I am understanding things right, then the list of objects produced
   by rev-list will be in the right order for piping to pack-objects. 
  -The sed statement is stripping off anything after the sha1. Any way to
   get rev-list to print out just the sha1 so that sed is not necessary?
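
On the second point, one possible simplification (not a rev-list option) is to
keep only the first space-separated field of each line, since each line of
'rev-list --objects' output is an object ID optionally followed by a space and
a path. A minimal sketch, where strip_paths is a hypothetical helper name:

```shell
# Sketch: keep only the object ID (first space-separated field) from each
# line of 'git rev-list --objects' output. Commits are printed as a bare ID;
# trees and blobs are followed by a space and a path.
# strip_paths is a hypothetical helper name, not a git command.
strip_paths() {
    cut -d ' ' -f 1
}

# Intended use (not run here):
#   git rev-list --objects --all | strip_paths | git cat-file --batch-check
```

Lines without a space pass through cut unchanged, so commit IDs are preserved.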

Then I want to parse the output from cat-file and use an external program
to detect the file format. Here is a simplified version:

  | while read sha1 type size; do

       # only blobs are candidates for the special pack
       if [ "$type" = "blob" ]; then
           if ! ( git cat-file blob "$sha1" | file -b - | grep -q text ) &&
              [ "$size" -ge "$threshold" ]; then
               : # pack into special pack
           else
               : # pack normally into normal pack
           fi
       fi
  done

All of this has actually been rewritten as a Perl script, so ignore any
syntax mistakes.
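
The decision logic described above can be sketched as a self-contained shell
function. This is only an illustration, not the Perl script mentioned:
classify_blobs and is_text_blob are hypothetical names, the threshold argument
is an assumed parameter, and the "special"/"normal" labels stand in for
whatever would feed the two pack-objects invocations.

```shell
# Hypothetical helper: succeeds if file(1) describes the blob as text.
is_text_blob() {
    git cat-file blob "$1" </dev/null | file -b - | grep -q text
}

# Reads "sha1 type size" lines (the 'git cat-file --batch-check' format)
# on stdin and prints "special <sha1>" or "normal <sha1>" for each blob.
# $1 is the size threshold in bytes (an assumed parameter).
classify_blobs() {
    threshold=$1
    while read -r sha1 type size; do
        # Non-blob objects are skipped here, as in the loop above.
        [ "$type" = "blob" ] || continue
        if [ "$size" -ge "$threshold" ] && ! is_text_blob "$sha1"; then
            echo "special $sha1"
        else
            echo "normal $sha1"
        fi
    done
}

# Intended use (not run here):
#   ... | git cat-file --batch-check | classify_blobs 1048576
```

The size test is placed first so that file(1) only runs on blobs that are
large enough to matter; the outcome is the same as testing in the other order.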

I have successfully created the two pack files that I have been trying to
make, where "successful" means that after removing the existing packs and
objects and putting in place the two pack files that I generated,
'git fsck --full' prints no errors and exits successfully.

These two packs will be placed into a central repository.

ISSUE TWO:

I have placed these two packs into my own personal repo, and I have unpacked all
of the other objects so that they are loose.

I thought I could use a similar sequence of commands to pack those loose objects
into a normal and a special pack. I added the --unpacked option to my rev-list
command, but it still lists many more objects than exist as loose objects in
the repository.

   git rev-list --objects --unpacked --all

The man page says:

   --objects
          Print  the  object  IDs  of any object referenced by the listed
          commits. --objects foo ^bar thus means "send me all object  IDs
          which  I  need to download if I have the commit object bar, but
          not foo".

   --unpacked
          Only useful with --objects; print the object IDs that  are  not
          in packs.

Is this the correct behavior for rev-list --unpacked?
Am I misreading the --unpacked text, or should it be changed?
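
For comparison with the rev-list output, the loose objects actually present
can be listed straight from the object directory, relying on the loose-object
layout objects/<2 hex digits>/<38 hex digits>. A rough sketch, assuming 40-digit
SHA-1 object names; list_loose_objects is a hypothetical helper name:

```shell
# Sketch: print the IDs of loose objects found under an object directory.
# Files under pack/ and info/ do not match the <2 hex>/<38 hex> pattern
# and are therefore ignored.
# list_loose_objects is a hypothetical helper; $1 defaults to .git/objects.
list_loose_objects() {
    objdir=${1:-.git/objects}
    find "$objdir" -type f |
        sed -n 's|^.*/\([0-9a-f]\{2\}\)/\([0-9a-f]\{38\}\)$|\1\2|p'
}

# Intended use (not run here), to compare the two counts:
#   list_loose_objects | wc -l
#   git rev-list --objects --unpacked --all | wc -l
```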

-brandon



Thread overview: 61+ messages
2008-10-31 19:32 getting list of objects for packing Brandon Casey
2008-10-31 20:40 ` Nicolas Pitre
2008-10-31 20:48   ` Brandon Casey
2008-10-31 21:30     ` Junio C Hamano
2008-10-31 21:40       ` Brandon Casey
2008-10-31 22:23         ` Jakub Narebski
2008-11-01  0:00         ` Brandon Casey
2008-11-02  3:35           ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file drafnel
2008-11-02 16:31             ` [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask drafnel
2008-11-03 16:12               ` Shawn O. Pearce
2008-11-03 18:24                 ` Brandon Casey
2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
2008-11-03 20:43                     ` [PATCH v2 3/3] pack-objects: honor '.keep' files Brandon Casey
2008-11-03 20:49                       ` Shawn O. Pearce
2008-11-05 22:37                       ` Brandon Casey
2008-11-06 23:22                         ` Brandon Casey
2008-11-07  0:30                           ` Junio C Hamano
2008-11-07  1:17                             ` Brandon Casey
2008-11-07  8:12                               ` Andreas Ericsson
2008-11-07 19:25                                 ` Shawn O. Pearce
2008-11-10  5:59                             ` recognize loose local objects during repack drafnel
2008-11-10 21:03                               ` Junio C Hamano
     [not found]                             ` <1226296798-31522-1-git-send-email-foo@foo.com>
2008-11-10  5:59                               ` [PATCH 1/3] t7700: demonstrate mishandling of loose objects in an alternate ODB drafnel
     [not found]                               ` <1226296798-31522-2-git-send-email-foo@foo.com>
2008-11-10  5:59                                 ` [PATCH 2/3] sha1_file.c: split has_loose_object() into local and non-local counterparts drafnel
     [not found]                                 ` <1226296798-31522-3-git-send-email-foo@foo.com>
2008-11-10  5:59                                   ` [PATCH 3/3] pack-objects: extend --local to mean ignore non-local loose objects too drafnel
2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
2008-11-07  1:55                               ` [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
2008-11-07  1:56                                 ` [PATCH 4/4] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
2008-11-07  8:14                               ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Andreas Ericsson
2008-11-07  8:13                             ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Andreas Ericsson
2008-11-03 22:14                   ` [PATCH v3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-04 19:17                   ` [PATCH v2 1/3] " Andreas Ericsson
2008-11-04 19:49                     ` Brandon Casey
2008-11-04 19:55                       ` Junio C Hamano
2008-11-04 20:01                         ` Brandon Casey
2008-11-04 20:21                       ` Andreas Ericsson
2008-11-04 23:55                   ` Junio C Hamano
2008-11-12  8:09                   ` Jeff King
2008-11-12 17:10                     ` Junio C Hamano
2008-11-12 19:17                       ` Jeff King
2008-11-12 17:30                     ` Brandon Casey
2008-11-12 17:59                       ` repack and .keep series Brandon Casey
2008-11-12 17:59                         ` [PATCH 1/6] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-12 17:59                           ` [PATCH 2/6] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
2008-11-12 17:59                             ` [PATCH 3/6] pack-objects: new option --honor-pack-keep Brandon Casey
2008-11-12 17:59                               ` [PATCH 4/6] repack: don't repack local objects in packs with .keep file Brandon Casey
2008-11-12 17:59                                 ` [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
2008-11-12 17:59                                   ` [PATCH 6/6] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
2008-11-13  0:50                                   ` [PATCH] t7700: test that 'repack -a' packs alternate packed objects Brandon Casey
2008-11-12 18:10                       ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Junio C Hamano
2008-11-12 18:19                         ` Junio C Hamano
     [not found]             ` <1225643477-32319-1-git-send-email-foo@foo.com>
2008-11-02 16:31               ` [PATCH 2/3] packed_git: add new PACK_KEEP flag and haspackkeep() access macro drafnel
     [not found]               ` <1225643477-32319-2-git-send-email-foo@foo.com>
2008-11-02 16:31                 ` [PATCH 3/3] pack-objects: honor '.keep' files drafnel
2008-11-03 16:17                   ` Shawn O. Pearce
2008-11-03 10:35             ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file Andreas Ericsson
2008-11-03 18:20               ` Brandon Casey
2008-11-03 20:25                 ` Andreas Ericsson
2008-11-03 22:02                   ` Brandon Casey
2008-11-04 19:25                     ` Andreas Ericsson
