* [PATCH v4 0/3] Improve push performance with lots of refs
From: brian m. carlson @ 2014-12-24 23:05 UTC (permalink / raw)
To: git; +Cc: Duy Nguyen, Junio C Hamano
This series contains patches to address a significant push performance
regression in repositories with a large number of refs. It avoids
performing expensive edge marking unless the repository is shallow.
The first patch in the series is a fix for a minor typo I discovered
when editing the documentation. The second patch implements git
rev-list --objects-edge-aggressive, and the third patch ensures it's
used only when pushing from or fetching into shallow repos.
The changes from v3 are to use --objects-edge-aggressive from the point
it's introduced (this preserves bisectability) and to make higher-level
commands pass --shallow for any shallow pushing and fetching instead of
trying to have pack-objects determine it.
The original fix was suggested by Duy Nguyen.
brian m. carlson (3):
Documentation: add missing article in rev-list-options.txt
rev-list: add an option to mark fewer edges as uninteresting
pack-objects: use --objects-edge-aggressive for shallow repos
Documentation/git-pack-objects.txt | 7 ++++++-
Documentation/git-rev-list.txt | 3 ++-
Documentation/rev-list-options.txt | 7 ++++++-
builtin/pack-objects.c | 7 ++++++-
list-objects.c | 4 ++--
revision.c | 6 ++++++
revision.h | 1 +
send-pack.c | 3 +++
upload-pack.c | 4 +++-
9 files changed, 35 insertions(+), 7 deletions(-)
--
2.2.1.209.g41e5f3a
* [PATCH v4 1/3] Documentation: add missing article in rev-list-options.txt
From: brian m. carlson @ 2014-12-24 23:05 UTC (permalink / raw)
To: git; +Cc: Duy Nguyen, Junio C Hamano
Add the missing article "a".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
Documentation/rev-list-options.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index afccfdc..2277fcb 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -653,7 +653,7 @@ These options are mostly targeted for packing of Git repositories.
--objects-edge::
Similar to `--objects`, but also print the IDs of excluded
commits prefixed with a ``-'' character. This is used by
- linkgit:git-pack-objects[1] to build ``thin'' pack, which records
+ linkgit:git-pack-objects[1] to build a ``thin'' pack, which records
objects in deltified form based on objects contained in these
excluded commits to reduce network traffic.
--
2.2.1.209.g41e5f3a
* [PATCH v4 2/3] rev-list: add an option to mark fewer edges as uninteresting
From: brian m. carlson @ 2014-12-24 23:05 UTC (permalink / raw)
To: git; +Cc: Duy Nguyen, Junio C Hamano
In commit fbd4a70 (list-objects: mark more commits as edges in
mark_edges_uninteresting - 2013-08-16), we marked an increasing number
of edges uninteresting. This change, and the subsequent change to make
this conditional on --objects-edge, are used by --thin to make much
smaller packs for shallow clones.
Unfortunately, they cause a significant performance regression when
pushing non-shallow clones with lots of refs (23.322 seconds vs.
4.785 seconds with 22400 refs). Add an option to git rev-list,
--objects-edge-aggressive, that preserves this more aggressive behavior,
while leaving --objects-edge to provide the faster behavior.
Preserve the current behavior for the moment by using the aggressive
option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
Documentation/git-rev-list.txt | 3 ++-
Documentation/rev-list-options.txt | 4 ++++
builtin/pack-objects.c | 2 +-
list-objects.c | 4 ++--
revision.c | 6 ++++++
revision.h | 1 +
6 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index fd7f8b5..5b11922 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -46,7 +46,8 @@ SYNOPSIS
[ \--extended-regexp | -E ]
[ \--fixed-strings | -F ]
[ \--date=(local|relative|default|iso|iso-strict|rfc|short) ]
- [ [\--objects | \--objects-edge] [ \--unpacked ] ]
+ [ [ \--objects | \--objects-edge | \--objects-edge-aggressive ]
+ [ \--unpacked ] ]
[ \--pretty | \--header ]
[ \--bisect ]
[ \--bisect-vars ]
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2277fcb..8cb6f92 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -657,6 +657,10 @@ These options are mostly targeted for packing of Git repositories.
objects in deltified form based on objects contained in these
excluded commits to reduce network traffic.
+--objects-edge-aggressive::
+ Similar to `--objects-edge`, but it tries harder to find excluded
+ commits at the cost of increased time.
+
--unpacked::
Only useful with `--objects`; print the object IDs that are not
in packs.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3f9f5c7..f93a17c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2711,7 +2711,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
argv_array_push(&rp, "pack-objects");
if (thin) {
use_internal_rev_list = 1;
- argv_array_push(&rp, "--objects-edge");
+ argv_array_push(&rp, "--objects-edge-aggressive");
} else
argv_array_push(&rp, "--objects");
diff --git a/list-objects.c b/list-objects.c
index 2910bec..2a139b6 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -157,7 +157,7 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
if (commit->object.flags & UNINTERESTING) {
mark_tree_uninteresting(commit->tree);
- if (revs->edge_hint && !(commit->object.flags & SHOWN)) {
+ if (revs->edge_hint_aggressive && !(commit->object.flags & SHOWN)) {
commit->object.flags |= SHOWN;
show_edge(commit);
}
@@ -165,7 +165,7 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
}
mark_edge_parents_uninteresting(commit, revs, show_edge);
}
- if (revs->edge_hint) {
+ if (revs->edge_hint_aggressive) {
for (i = 0; i < revs->cmdline.nr; i++) {
struct object *obj = revs->cmdline.rev[i].item;
struct commit *commit = (struct commit *)obj;
diff --git a/revision.c b/revision.c
index 75dda92..753dd2f 100644
--- a/revision.c
+++ b/revision.c
@@ -1853,6 +1853,12 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
revs->tree_objects = 1;
revs->blob_objects = 1;
revs->edge_hint = 1;
+ } else if (!strcmp(arg, "--objects-edge-aggressive")) {
+ revs->tag_objects = 1;
+ revs->tree_objects = 1;
+ revs->blob_objects = 1;
+ revs->edge_hint = 1;
+ revs->edge_hint_aggressive = 1;
} else if (!strcmp(arg, "--verify-objects")) {
revs->tag_objects = 1;
revs->tree_objects = 1;
diff --git a/revision.h b/revision.h
index 9cb5adc..033a244 100644
--- a/revision.h
+++ b/revision.h
@@ -93,6 +93,7 @@ struct rev_info {
blob_objects:1,
verify_objects:1,
edge_hint:1,
+ edge_hint_aggressive:1,
limited:1,
unpacked:1,
boundary:2,
--
2.2.1.209.g41e5f3a
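
To illustrate the option handling in the revision.c hunk of patch 2/3, here
is a minimal standalone sketch; it is not the actual git code. The
struct edge_flags type and the parse_edge_opt() function are hypothetical
stand-ins for struct rev_info and handle_revision_opt(). Only the flag names
and option strings are taken from the patch, and setting tag_objects for
--objects-edge is assumed to mirror the new branch.

```c
#include <string.h>

/* Hypothetical stand-in for the relevant bitfields of struct rev_info. */
struct edge_flags {
	unsigned int tag_objects:1,
		     tree_objects:1,
		     blob_objects:1,
		     edge_hint:1,
		     edge_hint_aggressive:1;
};

/*
 * Hypothetical parser: returns 1 if arg is one of the two edge options,
 * 0 otherwise. The aggressive option sets everything the plain option
 * sets, plus edge_hint_aggressive.
 */
int parse_edge_opt(struct edge_flags *f, const char *arg)
{
	if (!strcmp(arg, "--objects-edge")) {
		f->tag_objects = 1;
		f->tree_objects = 1;
		f->blob_objects = 1;
		f->edge_hint = 1;
	} else if (!strcmp(arg, "--objects-edge-aggressive")) {
		f->tag_objects = 1;
		f->tree_objects = 1;
		f->blob_objects = 1;
		f->edge_hint = 1;
		f->edge_hint_aggressive = 1;
	} else {
		return 0;
	}
	return 1;
}
```

Because the aggressive option also sets edge_hint, code that only checks
edge_hint keeps working for both options, while the two guarded blocks in
mark_edges_uninteresting() run only when edge_hint_aggressive is set.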
* [PATCH v4 3/3] pack-objects: use --objects-edge-aggressive for shallow repos
From: brian m. carlson @ 2014-12-24 23:05 UTC (permalink / raw)
To: git; +Cc: Duy Nguyen, Junio C Hamano
When fetching into or pushing from a shallow repository, we want to
aggressively mark edges as uninteresting, since this decreases the pack
size. However, aggressively marking edges can negatively affect
performance on large non-shallow repositories with lots of refs.
Teach pack-objects a --shallow option to indicate that we're pushing
from or fetching into a shallow repository. Use
--objects-edge-aggressive only for shallow repositories and otherwise
use --objects-edge, which performs better in the general case. Update
the callers to pass the --shallow option when they are dealing with a
shallow repository.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
Documentation/git-pack-objects.txt | 7 ++++++-
Documentation/rev-list-options.txt | 3 ++-
builtin/pack-objects.c | 7 ++++++-
send-pack.c | 3 +++
upload-pack.c | 4 +++-
5 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index d2d8f47..c2f76fb 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -13,7 +13,7 @@ SYNOPSIS
[--no-reuse-delta] [--delta-base-offset] [--non-empty]
[--local] [--incremental] [--window=<n>] [--depth=<n>]
[--revs [--unpacked | --all]] [--stdout | base-name]
- [--keep-true-parents] < object-list
+ [--shallow] [--keep-true-parents] < object-list
DESCRIPTION
@@ -190,6 +190,11 @@ required objects and is thus unusable by Git without making it
self-contained. Use `git index-pack --fix-thin`
(see linkgit:git-index-pack[1]) to restore the self-contained property.
+--shallow::
+ Optimize a pack that will be provided to a client with a shallow
+ repository. This option, combined with \--thin, can result in a
+ smaller pack at the cost of speed.
+
--delta-base-offset::
A packed archive can express the base object of a delta as
either a 20-byte object name or as an offset in the
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 8cb6f92..2984f40 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -659,7 +659,8 @@ These options are mostly targeted for packing of Git repositories.
--objects-edge-aggressive::
Similar to `--objects-edge`, but it tries harder to find excluded
- commits at the cost of increased time.
+ commits at the cost of increased time. This is used instead of
+ `--objects-edge` to build ``thin'' packs for shallow repositories.
--unpacked::
Only useful with `--objects`; print the object IDs that are not
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f93a17c..d816587 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2613,6 +2613,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
{
int use_internal_rev_list = 0;
int thin = 0;
+ int shallow = 0;
int all_progress_implied = 0;
struct argv_array rp = ARGV_ARRAY_INIT;
int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
@@ -2677,6 +2678,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
PARSE_OPT_OPTARG, option_parse_unpack_unreachable },
OPT_BOOL(0, "thin", &thin,
N_("create thin packs")),
+ OPT_BOOL(0, "shallow", &shallow,
+ N_("create packs suitable for shallow fetches")),
OPT_BOOL(0, "honor-pack-keep", &ignore_packed_keep,
N_("ignore packs that have companion .keep file")),
OPT_INTEGER(0, "compression", &pack_compression_level,
@@ -2711,7 +2714,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
argv_array_push(&rp, "pack-objects");
if (thin) {
use_internal_rev_list = 1;
- argv_array_push(&rp, "--objects-edge-aggressive");
+ argv_array_push(&rp, shallow
+ ? "--objects-edge-aggressive"
+ : "--objects-edge");
} else
argv_array_push(&rp, "--objects");
diff --git a/send-pack.c b/send-pack.c
index 949cb61..25947d7 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -47,6 +47,7 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
NULL,
NULL,
NULL,
+ NULL,
};
struct child_process po = CHILD_PROCESS_INIT;
int i;
@@ -60,6 +61,8 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
argv[i++] = "-q";
if (args->progress)
argv[i++] = "--progress";
+ if (is_repository_shallow())
+ argv[i++] = "--shallow";
po.argv = argv;
po.in = -1;
po.out = args->stateless_rpc ? -1 : fd;
diff --git a/upload-pack.c b/upload-pack.c
index ac9ac15..b531a32 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -86,7 +86,7 @@ static void create_pack_file(void)
"corruption on the remote side.";
int buffered = -1;
ssize_t sz;
- const char *argv[12];
+ const char *argv[13];
int i, arg = 0;
FILE *pipe_fd;
@@ -100,6 +100,8 @@ static void create_pack_file(void)
argv[arg++] = "--thin";
argv[arg++] = "--stdout";
+ if (shallow_nr)
+ argv[arg++] = "--shallow";
if (!no_progress)
argv[arg++] = "--progress";
if (use_ofs_delta)
--
2.2.1.209.g41e5f3a
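
The selection logic in patch 3/3 can be sketched in isolation as follows.
This is a simplified illustration under stated assumptions, not the code
from pack-objects.c or upload-pack.c: rev_list_mode(),
build_pack_objects_argv() and MAX_ARGS are hypothetical names, and the
argument list is reduced to a few entries to show why the fixed-size argv
array has to grow by one slot when --shallow is added.

```c
#include <stddef.h>

/* Hypothetical capacity: 5 possible arguments plus the NULL terminator. */
#define MAX_ARGS 6

/*
 * Hypothetical helper mirroring the choice made in cmd_pack_objects():
 * thin packs use an edge mode, and only shallow repositories get the
 * slower, more aggressive one.
 */
const char *rev_list_mode(int thin, int shallow)
{
	if (!thin)
		return "--objects";
	return shallow ? "--objects-edge-aggressive" : "--objects-edge";
}

/*
 * Hypothetical, simplified caller: builds a NULL-terminated argument list
 * and returns the number of arguments written (excluding the NULL).
 */
size_t build_pack_objects_argv(const char *argv[MAX_ARGS],
			       int thin, int shallow_peer)
{
	size_t n = 0;

	argv[n++] = "pack-objects";
	argv[n++] = "--revs";
	if (thin)
		argv[n++] = "--thin";
	argv[n++] = "--stdout";
	if (shallow_peer)
		argv[n++] = "--shallow";
	argv[n] = NULL;
	return n;
}
```

Keeping the decision in the callers, as the series does, means pack-objects
only has to translate --shallow into a rev-list mode and does not need to
detect shallowness itself.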