* [PATCH v4 0/3] Improve push performance with lots of refs
@ 2014-12-24 23:05 brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 1/3] Documentation: add missing article in rev-list-options.txt brian m. carlson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: brian m. carlson @ 2014-12-24 23:05 UTC
  To: git; +Cc: Duy Nguyen, Junio C Hamano

This series contains patches to address a significant push performance
regression in repositories with large numbers of refs.  It avoids
performing expensive edge marking unless the repository is shallow.

The first patch in the series is a fix for a minor typo I discovered
when editing the documentation.  The second patch implements git
rev-list --objects-edge-aggressive, and the third patch ensures it's
used only when pushing from or fetching into shallow repos.

The changes from v3 are to use --objects-edge-aggressive from the point
it's introduced (this preserves bisectability) and to make higher-level
commands pass --shallow for any shallow pushing and fetching instead of
trying to have pack-objects determine it.

The original fix was suggested by Duy Nguyen.

brian m. carlson (3):
  Documentation: add missing article in rev-list-options.txt
  rev-list: add an option to mark fewer edges as uninteresting
  pack-objects: use --objects-edge-aggressive for shallow repos

 Documentation/git-pack-objects.txt | 7 ++++++-
 Documentation/git-rev-list.txt     | 3 ++-
 Documentation/rev-list-options.txt | 7 ++++++-
 builtin/pack-objects.c             | 7 ++++++-
 list-objects.c                     | 4 ++--
 revision.c                         | 6 ++++++
 revision.h                         | 1 +
 send-pack.c                        | 3 +++
 upload-pack.c                      | 4 +++-
 9 files changed, 35 insertions(+), 7 deletions(-)

-- 
2.2.1.209.g41e5f3a


* [PATCH v4 1/3] Documentation: add missing article in rev-list-options.txt
  2014-12-24 23:05 [PATCH v4 0/3] Improve push performance with lots of refs brian m. carlson
@ 2014-12-24 23:05 ` brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 2/3] rev-list: add an option to mark fewer edges as uninteresting brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 3/3] pack-objects: use --objects-edge-aggressive for shallow repos brian m. carlson
  2 siblings, 0 replies; 4+ messages in thread
From: brian m. carlson @ 2014-12-24 23:05 UTC
  To: git; +Cc: Duy Nguyen, Junio C Hamano

Add the missing article "a".

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/rev-list-options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index afccfdc..2277fcb 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -653,7 +653,7 @@ These options are mostly targeted for packing of Git repositories.
 --objects-edge::
 	Similar to `--objects`, but also print the IDs of excluded
 	commits prefixed with a ``-'' character.  This is used by
-	linkgit:git-pack-objects[1] to build ``thin'' pack, which records
+	linkgit:git-pack-objects[1] to build a ``thin'' pack, which records
 	objects in deltified form based on objects contained in these
 	excluded commits to reduce network traffic.
 
-- 
2.2.1.209.g41e5f3a


* [PATCH v4 2/3] rev-list: add an option to mark fewer edges as uninteresting
  2014-12-24 23:05 [PATCH v4 0/3] Improve push performance with lots of refs brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 1/3] Documentation: add missing article in rev-list-options.txt brian m. carlson
@ 2014-12-24 23:05 ` brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 3/3] pack-objects: use --objects-edge-aggressive for shallow repos brian m. carlson
  2 siblings, 0 replies; 4+ messages in thread
From: brian m. carlson @ 2014-12-24 23:05 UTC
  To: git; +Cc: Duy Nguyen, Junio C Hamano

In commit fbd4a70 (list-objects: mark more commits as edges in
mark_edges_uninteresting - 2013-08-16), we marked an increasing number
of edges uninteresting.  This change, and the subsequent change to make
this conditional on --objects-edge, are used by --thin to make much
smaller packs for shallow clones.

Unfortunately, they cause a significant performance regression when
pushing non-shallow clones with lots of refs (23.322 seconds vs.
4.785 seconds with 22400 refs).  Add an option to git rev-list,
--objects-edge-aggressive, that preserves this more aggressive behavior,
while leaving --objects-edge to provide more performant behavior.
Preserve the current behavior for the moment by using the aggressive
option.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/git-rev-list.txt     | 3 ++-
 Documentation/rev-list-options.txt | 4 ++++
 builtin/pack-objects.c             | 2 +-
 list-objects.c                     | 4 ++--
 revision.c                         | 6 ++++++
 revision.h                         | 1 +
 6 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index fd7f8b5..5b11922 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -46,7 +46,8 @@ SYNOPSIS
 	     [ \--extended-regexp | -E ]
 	     [ \--fixed-strings | -F ]
 	     [ \--date=(local|relative|default|iso|iso-strict|rfc|short) ]
-	     [ [\--objects | \--objects-edge] [ \--unpacked ] ]
+	     [ [ \--objects | \--objects-edge | \--objects-edge-aggressive ]
+	       [ \--unpacked ] ]
 	     [ \--pretty | \--header ]
 	     [ \--bisect ]
 	     [ \--bisect-vars ]
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2277fcb..8cb6f92 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -657,6 +657,10 @@ These options are mostly targeted for packing of Git repositories.
 	objects in deltified form based on objects contained in these
 	excluded commits to reduce network traffic.
 
+--objects-edge-aggressive::
+	Similar to `--objects-edge`, but it tries harder to find excluded
+	commits at the cost of increased time.
+
 --unpacked::
 	Only useful with `--objects`; print the object IDs that are not
 	in packs.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3f9f5c7..f93a17c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2711,7 +2711,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	argv_array_push(&rp, "pack-objects");
 	if (thin) {
 		use_internal_rev_list = 1;
-		argv_array_push(&rp, "--objects-edge");
+		argv_array_push(&rp, "--objects-edge-aggressive");
 	} else
 		argv_array_push(&rp, "--objects");
 
diff --git a/list-objects.c b/list-objects.c
index 2910bec..2a139b6 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -157,7 +157,7 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
 
 		if (commit->object.flags & UNINTERESTING) {
 			mark_tree_uninteresting(commit->tree);
-			if (revs->edge_hint && !(commit->object.flags & SHOWN)) {
+			if (revs->edge_hint_aggressive && !(commit->object.flags & SHOWN)) {
 				commit->object.flags |= SHOWN;
 				show_edge(commit);
 			}
@@ -165,7 +165,7 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
 		}
 		mark_edge_parents_uninteresting(commit, revs, show_edge);
 	}
-	if (revs->edge_hint) {
+	if (revs->edge_hint_aggressive) {
 		for (i = 0; i < revs->cmdline.nr; i++) {
 			struct object *obj = revs->cmdline.rev[i].item;
 			struct commit *commit = (struct commit *)obj;
diff --git a/revision.c b/revision.c
index 75dda92..753dd2f 100644
--- a/revision.c
+++ b/revision.c
@@ -1853,6 +1853,12 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->tree_objects = 1;
 		revs->blob_objects = 1;
 		revs->edge_hint = 1;
+	} else if (!strcmp(arg, "--objects-edge-aggressive")) {
+		revs->tag_objects = 1;
+		revs->tree_objects = 1;
+		revs->blob_objects = 1;
+		revs->edge_hint = 1;
+		revs->edge_hint_aggressive = 1;
 	} else if (!strcmp(arg, "--verify-objects")) {
 		revs->tag_objects = 1;
 		revs->tree_objects = 1;
diff --git a/revision.h b/revision.h
index 9cb5adc..033a244 100644
--- a/revision.h
+++ b/revision.h
@@ -93,6 +93,7 @@ struct rev_info {
 			blob_objects:1,
 			verify_objects:1,
 			edge_hint:1,
+			edge_hint_aggressive:1,
 			limited:1,
 			unpacked:1,
 			boundary:2,
-- 
2.2.1.209.g41e5f3a
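
A minimal sketch of the flag relationship introduced by patch 2/3,
assuming a cut-down stand-in for struct rev_info.  The names struct
edge_flags, parse_edge_option() and scans_cmdline_revs() are hypothetical
and exist only for illustration; they are not part of Git.  The sketch
shows that --objects-edge-aggressive sets both bits, while plain
--objects-edge sets only edge_hint and therefore skips the extra pass
over the command-line revisions in mark_edges_uninteresting().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the two bits this patch touches in struct rev_info. */
struct edge_flags {
	unsigned edge_hint:1;            /* print excluded commits prefixed with "-" */
	unsigned edge_hint_aggressive:1; /* additionally do the costlier edge search */
};

/* Returns 1 if arg is one of the two edge options, 0 otherwise. */
static int parse_edge_option(struct edge_flags *f, const char *arg)
{
	assert(f && arg);
	if (!strcmp(arg, "--objects-edge")) {
		f->edge_hint = 1;
		return 1;
	}
	if (!strcmp(arg, "--objects-edge-aggressive")) {
		/* The aggressive option implies the plain edge hint as well. */
		f->edge_hint = 1;
		f->edge_hint_aggressive = 1;
		return 1;
	}
	return 0;
}

/* Mirrors the condition guarding the loop over revs->cmdline in list-objects.c. */
static int scans_cmdline_revs(const struct edge_flags *f)
{
	return f->edge_hint_aggressive;
}
```

With a zero-initialized struct edge_flags, parsing "--objects-edge"
leaves edge_hint_aggressive clear, so scans_cmdline_revs() returns 0.
That loop runs once per revision given on the command line, which would
explain why it is costly with many refs.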


* [PATCH v4 3/3] pack-objects: use --objects-edge-aggressive for shallow repos
  2014-12-24 23:05 [PATCH v4 0/3] Improve push performance with lots of refs brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 1/3] Documentation: add missing article in rev-list-options.txt brian m. carlson
  2014-12-24 23:05 ` [PATCH v4 2/3] rev-list: add an option to mark fewer edges as uninteresting brian m. carlson
@ 2014-12-24 23:05 ` brian m. carlson
  2 siblings, 0 replies; 4+ messages in thread
From: brian m. carlson @ 2014-12-24 23:05 UTC
  To: git; +Cc: Duy Nguyen, Junio C Hamano

When fetching into or pushing from a shallow repository, we want to
aggressively mark edges as uninteresting, since this decreases the pack
size.  However, aggressively marking edges can negatively affect
performance on large non-shallow repositories with lots of refs.

Teach pack-objects a --shallow option to indicate that we're pushing
from or fetching into a shallow repository.  Use
--objects-edge-aggressive only for shallow repositories and otherwise
use --objects-edge, which performs better in the general case.  Update
the callers to pass the --shallow option when they are dealing with a
shallow repository.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/git-pack-objects.txt | 7 ++++++-
 Documentation/rev-list-options.txt | 3 ++-
 builtin/pack-objects.c             | 7 ++++++-
 send-pack.c                        | 3 +++
 upload-pack.c                      | 4 +++-
 5 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index d2d8f47..c2f76fb 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 	[--no-reuse-delta] [--delta-base-offset] [--non-empty]
 	[--local] [--incremental] [--window=<n>] [--depth=<n>]
 	[--revs [--unpacked | --all]] [--stdout | base-name]
-	[--keep-true-parents] < object-list
+	[--shallow] [--keep-true-parents] < object-list
 
 
 DESCRIPTION
@@ -190,6 +190,11 @@ required objects and is thus unusable by Git without making it
 self-contained. Use `git index-pack --fix-thin`
 (see linkgit:git-index-pack[1]) to restore the self-contained property.
 
+--shallow::
+	Optimize a pack that will be provided to a client with a shallow
+	repository.  This option, combined with \--thin, can result in a
+	smaller pack at the cost of speed.
+
 --delta-base-offset::
 	A packed archive can express the base object of a delta as
 	either a 20-byte object name or as an offset in the
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 8cb6f92..2984f40 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -659,7 +659,8 @@ These options are mostly targeted for packing of Git repositories.
 
 --objects-edge-aggressive::
 	Similar to `--objects-edge`, but it tries harder to find excluded
-	commits at the cost of increased time.
+	commits at the cost of increased time.  This is used instead of
+	`--objects-edge` to build ``thin'' packs for shallow repositories.
 
 --unpacked::
 	Only useful with `--objects`; print the object IDs that are not
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f93a17c..d816587 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2613,6 +2613,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
 	int use_internal_rev_list = 0;
 	int thin = 0;
+	int shallow = 0;
 	int all_progress_implied = 0;
 	struct argv_array rp = ARGV_ARRAY_INIT;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
@@ -2677,6 +2678,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		  PARSE_OPT_OPTARG, option_parse_unpack_unreachable },
 		OPT_BOOL(0, "thin", &thin,
 			 N_("create thin packs")),
+		OPT_BOOL(0, "shallow", &shallow,
+			 N_("create packs suitable for shallow fetches")),
 		OPT_BOOL(0, "honor-pack-keep", &ignore_packed_keep,
 			 N_("ignore packs that have companion .keep file")),
 		OPT_INTEGER(0, "compression", &pack_compression_level,
@@ -2711,7 +2714,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	argv_array_push(&rp, "pack-objects");
 	if (thin) {
 		use_internal_rev_list = 1;
-		argv_array_push(&rp, "--objects-edge-aggressive");
+		argv_array_push(&rp, shallow
+				? "--objects-edge-aggressive"
+				: "--objects-edge");
 	} else
 		argv_array_push(&rp, "--objects");
 
diff --git a/send-pack.c b/send-pack.c
index 949cb61..25947d7 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -47,6 +47,7 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
 		NULL,
 		NULL,
 		NULL,
+		NULL,
 	};
 	struct child_process po = CHILD_PROCESS_INIT;
 	int i;
@@ -60,6 +61,8 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
 		argv[i++] = "-q";
 	if (args->progress)
 		argv[i++] = "--progress";
+	if (is_repository_shallow())
+		argv[i++] = "--shallow";
 	po.argv = argv;
 	po.in = -1;
 	po.out = args->stateless_rpc ? -1 : fd;
diff --git a/upload-pack.c b/upload-pack.c
index ac9ac15..b531a32 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -86,7 +86,7 @@ static void create_pack_file(void)
 		"corruption on the remote side.";
 	int buffered = -1;
 	ssize_t sz;
-	const char *argv[12];
+	const char *argv[13];
 	int i, arg = 0;
 	FILE *pipe_fd;
 
@@ -100,6 +100,8 @@ static void create_pack_file(void)
 		argv[arg++] = "--thin";
 
 	argv[arg++] = "--stdout";
+	if (shallow_nr)
+		argv[arg++] = "--shallow";
 	if (!no_progress)
 		argv[arg++] = "--progress";
 	if (use_ofs_delta)
-- 
2.2.1.209.g41e5f3a
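
A minimal sketch of the selection logic in patch 3/3, under stated
assumptions.  rev_list_mode() restates the ternary added to
cmd_pack_objects(); build_pack_objects_argv() is a hypothetical helper
that imitates how a caller such as upload-pack appends --shallow to a
fixed-size, NULL-terminated argv.  Neither function exists in Git under
these names, and the real callers pass more options than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which rev-list object mode pack-objects would request internally. */
static const char *rev_list_mode(int thin, int shallow)
{
	if (!thin)
		return "--objects";
	/* Only shallow repositories pay for the aggressive edge search. */
	return shallow ? "--objects-edge-aggressive" : "--objects-edge";
}

/*
 * Hypothetical helper: fill argv (capacity cap, including the NULL slot)
 * and return the number of arguments written, not counting the NULL.
 */
static size_t build_pack_objects_argv(const char **argv, size_t cap,
				      int thin, int shallow)
{
	size_t n = 0;

	/* Worst case is four arguments plus the terminating NULL. */
	assert(argv && cap >= 5);
	argv[n++] = "pack-objects";
	if (thin)
		argv[n++] = "--thin";
	argv[n++] = "--stdout";
	if (shallow)
		argv[n++] = "--shallow";
	argv[n] = NULL;
	return n;
}
```

The capacity check reflects the same concern as the argv[12] to argv[13]
change in upload-pack.c: every optional flag needs its own slot in
addition to the terminating NULL.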
