From: Junio C Hamano <gitster@pobox.com>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 12/23] pack v4: creation code
Date: Tue, 27 Aug 2013 08:48:24 -0700
Message-ID: <xmqqppszdtiv.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1377577567-27655-13-git-send-email-nico@fluxnic.net> (Nicolas Pitre's message of "Tue, 27 Aug 2013 00:25:56 -0400")

Nicolas Pitre <nico@fluxnic.net> writes:

> Let's actually open the destination pack file and write the header and
> the tables.
>
> The header isn't much different from pack v3, except for the pack version
> number of course.
>
> The first table is the sorted SHA1 table normally found in the pack index
> file.  With pack v4 we write this table in the main pack file instead as
> it is index referenced by subsequent objects in the pack.  Doing so has
> many advantages:
>
> - The SHA1 references used to be duplicated on disk: once in the pack
>   index file, and then at least once or more within commit and tree
>   objects referencing them.  The only SHA1 which is not being listed more
>   than once this way is the one for a branch tip commit object and those
>   are normally very few.  Now all that SHA1 data is represented only once.
>

This tickles my curiosity. Why isn't this SHA-1 table sorted by
reference count the same way as the tree path and the people name
tables to keep the average length of varint references short?
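
To make the question concrete, here is a rough sketch. It assumes a
plain 7-bits-per-byte varint (the encoder used in this series may well
differ), and varint_len() and total_ref_bytes() are hypothetical helper
names, not functions from the patch. It only counts how many bytes the
references would occupy when entry i of a table is referenced
refcount[i] times.

```c
#include <stdint.h>

/*
 * Bytes needed to store "value" as a plain base-128 varint.
 * Assumption for illustration: 7 payload bits per byte, no offset trick.
 */
static unsigned varint_len(uint64_t value)
{
	unsigned len = 1;

	while (value >>= 7)
		len++;
	return len;
}

/*
 * Total bytes spent on references when table entry i is referenced
 * refcount[i] times and each reference is the varint encoding of i.
 */
static uint64_t total_ref_bytes(const unsigned *refcount, unsigned nr)
{
	uint64_t total = 0;
	unsigned i;

	for (i = 0; i < nr; i++)
		total += (uint64_t)refcount[i] * varint_len(i);
	return total;
}
```

Under that assumption, a reference to any index below 128 costs one
byte and a reference to index 128 or above costs at least two, so where
the frequently referenced entries land in the table decides the total.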

> - The SHA1 references found in commit and tree objects can be obtained
>   on disk directly without having to deflate those objects first.
>
> The SHA1 table size is obtained by multiplying the number of objects by 20.
>
> And then the commit and path dictionary tables are written right after
> the SHA1 table.

> Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
> ---
>  packv4-create.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 55 insertions(+), 5 deletions(-)
>
> diff --git a/packv4-create.c b/packv4-create.c
> index 2956fda..5211f9c 100644
> --- a/packv4-create.c
> +++ b/packv4-create.c
> @@ -605,6 +605,48 @@ static unsigned long write_dict_table(struct sha1file *f, struct dict_table *t)
>  	return hdrlen + datalen;
>  }
>  
> +static struct sha1file * packv4_open(char *path)
> +{
> +	int fd;
> +
> +	fd = open(path, O_CREAT|O_EXCL|O_WRONLY, 0600);
> +	if (fd < 0)
> +		die_errno("unable to create '%s'", path);
> +	return sha1fd(fd, path);
> +}
> +
> +static unsigned int packv4_write_header(struct sha1file *f, unsigned nr_objects)
> +{
> +	struct pack_header hdr;
> +
> +	hdr.hdr_signature = htonl(PACK_SIGNATURE);
> +	hdr.hdr_version = htonl(4);
> +	hdr.hdr_entries = htonl(nr_objects);
> +	sha1write(f, &hdr, sizeof(hdr));
> +
> +	return sizeof(hdr);
> +}
> +
> +static unsigned long packv4_write_tables(struct sha1file *f, unsigned nr_objects,
> +					 struct pack_idx_entry *objs)
> +{
> +	unsigned i;
> +	unsigned long written = 0;
> +
> +	/* The sorted list of object SHA1's is always first */
> +	for (i = 0; i < nr_objects; i++)
> +		sha1write(f, objs[i].sha1, 20);
> +	written = 20 * nr_objects;
> +
> +	/* Then the commit dictionary table */
> +	written += write_dict_table(f, commit_name_table);
> +
> +	/* Followed by the path component dictionary table */
> +	written += write_dict_table(f, tree_path_table);
> +
> +	return written;
> +}
> +
>  static struct packed_git *open_pack(const char *path)
>  {
>  	char arg[PATH_MAX];
> @@ -658,9 +700,10 @@ static struct packed_git *open_pack(const char *path)
>  	return p;
>  }
>  
> -static void process_one_pack(char *src_pack)
> +static void process_one_pack(char *src_pack, char *dst_pack)
>  {
>  	struct packed_git *p;
> +	struct sha1file *f;
>  	struct pack_idx_entry *objs, **p_objs;
>  	unsigned nr_objects;
>  
> @@ -673,15 +716,22 @@ static void process_one_pack(char *src_pack)
>  	p_objs = sort_objs_by_offset(objs, nr_objects);
>  
>  	create_pack_dictionaries(p, p_objs);
> +
> +	f = packv4_open(dst_pack);
> +	if (!f)
> +		die("unable to open destination pack");
> +	packv4_write_header(f, nr_objects);
> +	packv4_write_tables(f, nr_objects, objs);
>  }
>  
>  int main(int argc, char *argv[])
>  {
> -	if (argc != 2) {
> -		fprintf(stderr, "Usage: %s <packfile>\n", argv[0]);
> +	if (argc != 3) {
> +		fprintf(stderr, "Usage: %s <src_packfile> <dst_packfile>\n", argv[0]);
>  		exit(1);
>  	}
> -	process_one_pack(argv[1]);
> -	dict_dump();
> +	process_one_pack(argv[1], argv[2]);
> +	if (0)
> +		dict_dump();
>  	return 0;
>  }
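
For reference, the layout written by the quoted code (a header, then
the SHA1 table, then the two dictionary tables) can be sketched as
below. This is only an illustration under stated assumptions: the
header is taken to be three 32-bit big-endian words (12 bytes), as
implied by the three htonl() stores above, and every sketch_* name and
SKETCH_* constant is hypothetical. The 64-bit cast keeps the
20 * nr_objects multiplication from being done in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PACK_SIGNATURE 0x5041434bu	/* "PACK" */
#define SKETCH_HDR_SIZE 12			/* signature + version + entries */
#define SKETCH_SHA1_SIZE 20

/* Store a 32-bit value in network (big-endian) byte order. */
static void sketch_put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (v >> 24) & 0xff;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}

/* Fill a 12-byte buffer with a version 4 pack header; returns its size. */
static size_t sketch_write_header(unsigned char *buf, uint32_t nr_objects)
{
	sketch_put_be32(buf, SKETCH_PACK_SIGNATURE);
	sketch_put_be32(buf + 4, 4);
	sketch_put_be32(buf + 8, nr_objects);
	return SKETCH_HDR_SIZE;
}

/* Offset of the first dictionary table: header plus the SHA1 table. */
static uint64_t sketch_dict_offset(uint32_t nr_objects)
{
	return SKETCH_HDR_SIZE + (uint64_t)SKETCH_SHA1_SIZE * nr_objects;
}
```

With three objects, for example, the SHA1 table would span bytes 12
through 71 and the commit dictionary table would start at offset 72.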
