From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Matthijs Kooijman" <matthijs@stdin.nl>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 6/6] list-objects: mark more commits as edges in mark_edges_uninteresting
Date: Fri, 16 Aug 2013 16:52:07 +0700
Message-ID: <1376646727-22318-6-git-send-email-pclouds@gmail.com>
In-Reply-To: <1376646727-22318-1-git-send-email-pclouds@gmail.com>
The purpose of edge commits is to let pack-objects know which objects
it can use as delta bases without including them in the thin pack,
because the other side is supposed to have them already. So far we
mark uninteresting parents of interesting commits as edges. But even
an unrelated uninteresting commit (that the other side has) may become
a good base for pack-objects and help produce more efficient packs.

This is especially true for shallow clones, when the client issues a
fetch with a depth smaller than or equal to the number of commits the
server is ahead of the client. For example, in this commit history the
client has up to "A" and the server has up to "B":

    -------A---B
     have--^   ^
              /
       want--+

If depth 1 is requested, the commit list to send to the client
includes only B. mark_edges_uninteresting() (m_e_u) checks whether the
parent commits of B are uninteresting and, if so, marks them as edges.
Because of the shallow graft, commit B is treated as having no parents
and the revision walker never sees A as the parent of B. In fact it
marks no edges at all in this simple case and sends everything B has
to the client, even though it could have excluded what A (and
therefore the client) already has. In a slightly different case where
A is not a direct parent of B (i.e. there are commits between A and
B), marking A as an edge can still save some transfer because B may
still share objects with its far ancestor A.
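
The first case can be reproduced in a toy model. This is only a sketch
of the idea, not git's code: struct toy_commit, the TOY_* flags and
both functions are hypothetical names, each commit has at most one
parent, and tree marking is left out.

```c
#include <stddef.h>

/* Toy stand-ins for git's object flags; the values are illustrative only. */
#define TOY_UNINTERESTING 1u
#define TOY_SHOWN         2u

struct toy_commit {
	const char *name;
	unsigned flags;
	struct toy_commit *parent;	/* one parent is enough for this sketch */
};

/*
 * Parent-only edge marking: for each interesting commit in the list,
 * count its uninteresting parent as an edge (once). Returns the number
 * of edges found.
 */
static int toy_mark_parent_edges(struct toy_commit **list, size_t n)
{
	int edges = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_commit *c = list[i];
		struct toy_commit *p = c->parent;

		if (c->flags & TOY_UNINTERESTING)
			continue;
		if (p && (p->flags & TOY_UNINTERESTING) &&
		    !(p->flags & TOY_SHOWN)) {
			p->flags |= TOY_SHOWN;
			edges++;
		}
	}
	return edges;
}

/*
 * History A---B: the client has A and wants B. When 'grafted' is
 * non-zero, B's parent link is cut, as a depth-1 shallow fetch does.
 */
static int toy_edges_for_fetch(int grafted)
{
	struct toy_commit a = { "A", TOY_UNINTERESTING, NULL };
	struct toy_commit b = { "B", 0, grafted ? NULL : &a };
	struct toy_commit *list[] = { &b };

	return toy_mark_parent_edges(list, 1);
}
```

In this model toy_edges_for_fetch(0) finds A as an edge, while
toy_edges_for_fetch(1) finds none, because the walk from B can no
longer reach A.
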

There is another case from the previous patch, when we deepen a ref
from C->E to A->E:

    ---A---B    C---D---E
     want--^    ^       ^
        shallow-+      /
           have-------+

In this case we need to send A and B to the client, and C (i.e. the
current shallow point that the client reports to the server) is a very
good base because it is the closest to A and B. Normal m_e_u won't
recognize C as an edge because it only looks back at parents (i.e.
A<-B), not in the opposite direction B->C, even though C is already
marked as an uninteresting commit by the previous patch.

This patch includes all uninteresting commits from the command line as
edges and lets pack-objects decide what's best to do. The upside is
that we have a better chance of producing better packs in certain
cases. The downside is that we may need to process some extra objects
on the server side.
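
The effect on the deepen case can be sketched with a second toy
model. Everything here (struct sketch_commit, the SKETCH_* flags, the
three functions) is a hypothetical illustration rather than the real
list-objects.c code, and the revs->edge_hint check and tree marking
are left out.

```c
#include <stddef.h>

/* Illustrative flag values, not git's real ones. */
#define SKETCH_UNINTERESTING 1u
#define SKETCH_SHOWN         2u

struct sketch_commit {
	const char *name;
	unsigned flags;
	struct sketch_commit *parent;
};

/* Pass 1: uninteresting parents of interesting commits become edges. */
static int sketch_parent_edges(struct sketch_commit **walk, size_t n)
{
	int edges = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_commit *p = walk[i]->parent;

		if (walk[i]->flags & SKETCH_UNINTERESTING)
			continue;
		if (p && (p->flags & SKETCH_UNINTERESTING) &&
		    !(p->flags & SKETCH_SHOWN)) {
			p->flags |= SKETCH_SHOWN;
			edges++;
		}
	}
	return edges;
}

/* Pass 2: uninteresting commits named on the command line become edges. */
static int sketch_cmdline_edges(struct sketch_commit **cmdline, size_t n)
{
	int edges = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_commit *c = cmdline[i];

		if (!(c->flags & SKETCH_UNINTERESTING) ||
		    (c->flags & SKETCH_SHOWN))
			continue;
		c->flags |= SKETCH_SHOWN;
		edges++;
	}
	return edges;
}

/*
 * Deepen case: A and B must be sent, and C (the old shallow point) is
 * uninteresting and was named on the command line. C is a child of B,
 * so the parent-only pass never reaches it.
 */
static int sketch_deepen_edges(int use_cmdline_pass)
{
	struct sketch_commit a = { "A", 0, NULL };
	struct sketch_commit b = { "B", 0, &a };
	struct sketch_commit c = { "C", SKETCH_UNINTERESTING, &b };
	struct sketch_commit *walk[] = { &b, &a };
	struct sketch_commit *cmdline[] = { &b, &c };
	int edges = sketch_parent_edges(walk, 2);

	if (use_cmdline_pass)
		edges += sketch_cmdline_edges(cmdline, 2);
	return edges;
}
```

With only the parent pass no edge is found; adding the command-line
pass reports C once, and the SHOWN-style flag keeps a commit from
being reported twice.
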

For the shallow case on git.git, when the client is 5 commits behind
and does "fetch --depth=3", the resulting pack is 99.26 KiB instead of
4.92 MiB.

Reported-and-analyzed-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
list-objects.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/list-objects.c b/list-objects.c
index db8ee4f..05c8c5c 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -148,15 +148,32 @@ static void mark_edge_parents_uninteresting(struct commit *commit,
 void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
 {
 	struct commit_list *list;
+	int i;
+
 	for (list = revs->commits; list; list = list->next) {
 		struct commit *commit = list->item;

 		if (commit->object.flags & UNINTERESTING) {
 			mark_tree_uninteresting(commit->tree);
+			if (revs->edge_hint && !(commit->object.flags & SHOWN)) {
+				commit->object.flags |= SHOWN;
+				show_edge(commit);
+			}
 			continue;
 		}
 		mark_edge_parents_uninteresting(commit, revs, show_edge);
 	}
+	for (i = 0; i < revs->cmdline.nr; i++) {
+		struct object *obj = revs->cmdline.rev[i].item;
+		struct commit *commit = (struct commit *)obj;
+		if (obj->type != OBJ_COMMIT || !(obj->flags & UNINTERESTING))
+			continue;
+		mark_tree_uninteresting(commit->tree);
+		if (revs->edge_hint && !(obj->flags & SHOWN)) {
+			obj->flags |= SHOWN;
+			show_edge(commit);
+		}
+	}
 }

 static void add_pending_tree(struct rev_info *revs, struct tree *tree)
--
1.8.2.82.gc24b958