* "HEAD -> branch" decoration doesn't work with "--decorate=full"
@ 2015-05-13 13:11 Michael Haggerty
2015-05-13 14:51 ` Junio C Hamano
2015-05-13 17:11 ` Junio C Hamano
0 siblings, 2 replies; 17+ messages in thread
From: Michael Haggerty @ 2015-05-13 13:11 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git discussion list
Hi,
The new-style "HEAD -> branch" style decoration doesn't work when
"--decorate=full" is used:
> $ bin-wrappers/git show --oneline --decorate
> c518059 (HEAD -> master, gitster/master) Merge branch 'maint'
>
> $ bin-wrappers/git show --oneline --decorate=full
> c518059 (HEAD, refs/remotes/gitster/master, refs/heads/master) Merge branch 'maint'
I would have expected the second invocation to show "HEAD ->
refs/heads/master".
Was that an oversight or a conscious decision?
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "HEAD -> branch" decoration doesn't work with "--decorate=full" 2015-05-13 13:11 "HEAD -> branch" decoration doesn't work with "--decorate=full" Michael Haggerty @ 2015-05-13 14:51 ` Junio C Hamano 2015-05-13 15:26 ` Michael J Gruber 2015-05-13 17:11 ` Junio C Hamano 1 sibling, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-13 14:51 UTC (permalink / raw) To: Michael Haggerty; +Cc: git discussion list, Michael J Gruber Michael Haggerty <mhagger@alum.mit.edu> writes: > The new-style "HEAD -> branch" style decoration doesn't work when > "--decorate=full" is used: > ... > Was that an oversight or a conscious decision? I do not recall making such a decision, and I doubt (the other) Michael wanted that way, either, so patches welcome, perhaps? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "HEAD -> branch" decoration doesn't work with "--decorate=full" 2015-05-13 14:51 ` Junio C Hamano @ 2015-05-13 15:26 ` Michael J Gruber 0 siblings, 0 replies; 17+ messages in thread From: Michael J Gruber @ 2015-05-13 15:26 UTC (permalink / raw) To: Junio C Hamano, Michael Haggerty; +Cc: git discussion list Junio C Hamano venit, vidit, dixit 13.05.2015 16:51: > Michael Haggerty <mhagger@alum.mit.edu> writes: > >> The new-style "HEAD -> branch" style decoration doesn't work when >> "--decorate=full" is used: >> ... >> Was that an oversight or a conscious decision? > > I do not recall making such a decision, and I doubt (the other) > Michael wanted that way, either, so patches welcome, perhaps? I was not aware of an alternate codepath for full decorations (and am surprised about it). I can't look into this before next week, but --decorate=full should produce "HEAD -> refs/heads/master" and such. I even vaguely remember testing it, but that may have been for v1 which turned different knobs than the end version. Michael ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: "HEAD -> branch" decoration doesn't work with "--decorate=full" 2015-05-13 13:11 "HEAD -> branch" decoration doesn't work with "--decorate=full" Michael Haggerty 2015-05-13 14:51 ` Junio C Hamano @ 2015-05-13 17:11 ` Junio C Hamano 2015-05-13 17:13 ` Junio C Hamano 1 sibling, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-13 17:11 UTC (permalink / raw) To: Michael Haggerty; +Cc: git discussion list Michael Haggerty <mhagger@alum.mit.edu> writes: > The new-style "HEAD -> branch" style decoration doesn't work when > "--decorate=full" is used: > >> $ bin-wrappers/git show --oneline --decorate >> c518059 (HEAD -> master, gitster/master) Merge branch 'maint' >> >> $ bin-wrappers/git show --oneline --decorate=full >> c518059 (HEAD, refs/remotes/gitster/master, refs/heads/master) Merge branch 'maint' > > I would have expected the second invocation to show "HEAD -> > refs/heads/master". > > Was that an oversight or a conscious decision? I actually think this ultimately comes from a poor design of the name-decorations infrastructure. The program is expected to call load_ref_decorations() only once and make the choice between the full/short at that point, which is passed to add_ref_decoration() to record either 'refs/heads/master' or 'master' in the singleton name_decoration decoration. But it does not store which one was chosen by the caller of load_ref_decorations() anywhere in the subsystem. When current_pointed_by_HEAD() wants to see if decorations on an object, e.g. 'master', matches what 'HEAD' resolves to, it cannot tell if the original set-up was done for the full decoration, and the current code just assumes (without even realizing that it is making that assumption) the decoration must have been set up for the short ones. Perhaps something like this, but I am not committing it without tests or a proper log messge. I moved "static loaded" outside as it is in the same category as the global name-decoration and decoration-flags, i.e. to be initialised once at the beginning to a fixed setting and then be used with that setting. log-tree.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/log-tree.c b/log-tree.c index 2c1ed0f..92259bc 100644 --- a/log-tree.c +++ b/log-tree.c @@ -13,6 +13,8 @@ #include "line-log.h" static struct decoration name_decoration = { "object names" }; +static int decoration_loaded; +static int decoration_flags; static char decoration_colors[][COLOR_MAXLEN] = { GIT_COLOR_RESET, @@ -146,9 +148,9 @@ static int add_graft_decoration(const struct commit_graft *graft, void *cb_data) void load_ref_decorations(int flags) { - static int loaded; - if (!loaded) { - loaded = 1; + if (!decoration_loaded) { + decoration_loaded = 1; + decoration_flags = flags; for_each_ref(add_ref_decoration, &flags); head_ref(add_ref_decoration, &flags); for_each_commit_graft(add_graft_decoration, NULL); @@ -196,8 +198,19 @@ static const struct name_decoration *current_pointed_by_HEAD(const struct name_d branch_name = resolve_ref_unsafe("HEAD", 0, unused, &rru_flags); if (!(rru_flags & REF_ISSYMREF)) return NULL; - if (!skip_prefix(branch_name, "refs/heads/", &branch_name)) - return NULL; + + if ((decoration_flags == DECORATE_SHORT_REFS)) { + if (!skip_prefix(branch_name, "refs/heads/", &branch_name)) + return NULL; + } else { + /* + * Each decoration has a refname in full; keep + * branch_name also in full, but still make sure + * HEAD is a reasonable ref. + */ + if (!starts_with(branch_name, "refs/")) + return NULL; + } /* OK, do we have that ref in the list? */ for (list = decoration; list; list = list->next) ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: "HEAD -> branch" decoration doesn't work with "--decorate=full" 2015-05-13 17:11 ` Junio C Hamano @ 2015-05-13 17:13 ` Junio C Hamano 2015-05-13 19:40 ` [PATCH 2/2] log: do not shorten decoration names too early Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-13 17:13 UTC (permalink / raw) To: Michael Haggerty; +Cc: git discussion list Junio C Hamano <gitster@pobox.com> writes: > Michael Haggerty <mhagger@alum.mit.edu> writes: > >> Was that an oversight or a conscious decision? > > I actually think this ultimately comes from a poor design of the > name-decorations infrastructure. I should have said "poor design of the way how the name-decorations infrastructure is used in log-tree (and commands in the log family)". The name-decorations infrastructure itself is just fine. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] log: do not shorten decoration names too early 2015-05-13 17:13 ` Junio C Hamano @ 2015-05-13 19:40 ` Junio C Hamano 2015-05-14 6:33 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-13 19:40 UTC (permalink / raw) To: Michael Haggerty; +Cc: git discussion list The DECORATE_SHORT_REFS option given to load_ref_decorations() affects the way a copy of the refname is stored for each decorated commit, and this forces later steps like current_pointed_by_HEAD() to adjust their behaviour based on this initial settings. Instead, we can always store the full refname and then shorten them when producing the output. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * [1/2] is just the earlier "this should fix it" patch, with adjustments to the existing tests. I suspect that it may be a good idea to lose the decoration_flags from load_ref_decorations() and instead make that a new parameter to format_decorations(). That way, the caller could decide which ones to use. It is not unconceivable to extend "log --format=%d" that shows the decoration in the style given by --decorate arg and let the callers specify two additional formats (i.e. decorate always short, decorate always in full), and for that kind of work, this patch will become a prerequisite. log-tree.c | 34 ++++++++++++++++------------------ 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/log-tree.c b/log-tree.c index 92259bc..c931615 100644 --- a/log-tree.c +++ b/log-tree.c @@ -94,6 +94,8 @@ static int add_ref_decoration(const char *refname, const unsigned char *sha1, in struct object *obj; enum decoration_type type = DECORATION_NONE; + assert(cb_data == NULL); + if (starts_with(refname, "refs/replace/")) { unsigned char original_sha1[20]; if (!check_replace_refs) @@ -123,8 +125,6 @@ static int add_ref_decoration(const char *refname, const unsigned char *sha1, in else if (!strcmp(refname, "HEAD")) type = DECORATION_REF_HEAD; - if (!cb_data || *(int *)cb_data == DECORATE_SHORT_REFS) - refname = prettify_refname(refname); add_name_decoration(type, refname, obj); while (obj->type == OBJ_TAG) { obj = ((struct tag *)obj)->tagged; @@ -151,8 +151,8 @@ void load_ref_decorations(int flags) if (!decoration_loaded) { decoration_loaded = 1; decoration_flags = flags; - for_each_ref(add_ref_decoration, &flags); - head_ref(add_ref_decoration, &flags); + for_each_ref(add_ref_decoration, NULL); + head_ref(add_ref_decoration, NULL); for_each_commit_graft(add_graft_decoration, NULL); } } @@ -199,18 +199,8 @@ static const struct name_decoration *current_pointed_by_HEAD(const struct name_d if (!(rru_flags & REF_ISSYMREF)) return NULL; - if ((decoration_flags == DECORATE_SHORT_REFS)) { - if (!skip_prefix(branch_name, "refs/heads/", &branch_name)) - return NULL; - } else { - /* - * Each decoration has a refname in full; keep - * branch_name also in full, but still make sure - * HEAD is a reasonable ref. - */ - if (!starts_with(branch_name, "refs/")) - return NULL; - } + if (!starts_with(branch_name, "refs/")) + return NULL; /* OK, do we have that ref in the list? */ for (list = decoration; list; list = list->next) @@ -222,6 +212,14 @@ static const struct name_decoration *current_pointed_by_HEAD(const struct name_d return NULL; } +static void show_name(struct strbuf *sb, const struct name_decoration *decoration) +{ + if (decoration_flags == DECORATE_SHORT_REFS) + strbuf_addstr(sb, prettify_refname(decoration->name)); + else + strbuf_addstr(sb, decoration->name); +} + /* * The caller makes sure there is no funny color before calling. * format_decorations_extended makes sure the same after return. @@ -259,7 +257,7 @@ void format_decorations_extended(struct strbuf *sb, if (decoration->type == DECORATION_REF_TAG) strbuf_addstr(sb, "tag: "); - strbuf_addstr(sb, decoration->name); + show_name(sb, decoration); if (current_and_HEAD && decoration->type == DECORATION_REF_HEAD) { @@ -268,7 +266,7 @@ void format_decorations_extended(struct strbuf *sb, strbuf_addstr(sb, " -> "); strbuf_addstr(sb, color_reset); strbuf_addstr(sb, decorate_get_color(use_color, current_and_HEAD->type)); - strbuf_addstr(sb, current_and_HEAD->name); + show_name(sb, current_and_HEAD); } strbuf_addstr(sb, color_reset); ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-13 19:40 ` [PATCH 2/2] log: do not shorten decoration names too early Junio C Hamano @ 2015-05-14 6:33 ` Jeff King 2015-05-14 17:37 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2015-05-14 6:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list On Wed, May 13, 2015 at 12:40:35PM -0700, Junio C Hamano wrote: > The DECORATE_SHORT_REFS option given to load_ref_decorations() > affects the way a copy of the refname is stored for each decorated > commit, and this forces later steps like current_pointed_by_HEAD() > to adjust their behaviour based on this initial settings. > > Instead, we can always store the full refname and then shorten them > when producing the output. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > > * [1/2] is just the earlier "this should fix it" patch, with > adjustments to the existing tests. Nice. After reading the first one, I was wondering why it did not look like this one. :) > I suspect that it may be a good idea to lose the decoration_flags > from load_ref_decorations() and instead make that a new parameter > to format_decorations(). That way, the caller could decide which > ones to use. It is not unconceivable to extend "log --format=%d" > that shows the decoration in the style given by --decorate arg > and let the callers specify two additional formats (i.e. decorate > always short, decorate always in full), and for that kind of > work, this patch will become a prerequisite. Yeah, agreed. While we are on the subject of the name_decoration code, I had considered at one point replacing the use of the decorate.[ch] hash table with a commit_slab (you can't do it in the general case, because decorate.[ch] handles arbitrary objects, but the name_decoration code only does commits). It would in theory be faster, though I don't know if the time we spend on the hash table is actually measurable (we make a lot of queries on it, but it doesn't actually get that big in the first place). In case you are looking for something to do with your copious free time. :) -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 6:33 ` Jeff King @ 2015-05-14 17:37 ` Junio C Hamano 2015-05-14 17:49 ` Jeff King 2015-05-14 21:49 ` Junio C Hamano 0 siblings, 2 replies; 17+ messages in thread From: Junio C Hamano @ 2015-05-14 17:37 UTC (permalink / raw) To: Jeff King; +Cc: Michael Haggerty, git discussion list Jeff King <peff@peff.net> writes: > While we are on the subject of the name_decoration code, I had > considered at one point replacing the use of the decorate.[ch] hash > table with a commit_slab (you can't do it in the general case, because > decorate.[ch] handles arbitrary objects, but the name_decoration code > only does commits). It would in theory be faster, though I don't know if > the time we spend on the hash table is actually measurable (we make a > lot of queries on it, but it doesn't actually get that big in the first > place). Hmmm, but I do not know if commit_slab is a good fit for the usage pattern. I expect commit objects to be fairly sparsely decorated (e.g. a tag or ref pointing at say 1-2% of commits or fewer). Wouldn't the hashtable used by decorate.[ch] with the max load factor capped to 66% a better economy? I notice that there is no API into commit_slab to ask "Does this commit have data in the slab?" *slabname##_at() may be the closest thing, but that would allocate the space and then says "This is the slot for that commit; go check if there is data there already." In the context of using commit_slab in log-tree.c for decoration, it would mean that we assign low slab indices to commits at the tips by first calling "for_each_ref(add_ref_decoration)" and populate the slab fairly densely at the beginning. But when we check if a commit that we encountered during a traversal is decorated or not, we would ask *slabname##_at() and that ends up enlarging the slab, even at that point the only thing we are interested in is if the commit is decorated and we are not adding a new decoration for it. For example, we have this in commit.c: const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) { struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); if (sizep) *sizep = v->size; return v->buffer; } But if we do not have the "buffer" data cached for that commit (via an earlier call to set_commit_buffer()), we don't have to enlarge the slab, as we are not adding anything to the slab system with this call. Perhaps we want a new function *slabname##_peek() with the same signature as *slabname##_at() that returns NULL when commit->index is larger than the last existing element in the slab? Then the above would become more like: const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) { struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); if (!v) return NULL; if (sizep) *sizep = v->size; return v->buffer; } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 17:37 ` Junio C Hamano @ 2015-05-14 17:49 ` Jeff King 2015-05-14 18:01 ` Junio C Hamano 2015-05-14 21:49 ` Junio C Hamano 1 sibling, 1 reply; 17+ messages in thread From: Jeff King @ 2015-05-14 17:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list On Thu, May 14, 2015 at 10:37:39AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > While we are on the subject of the name_decoration code, I had > > considered at one point replacing the use of the decorate.[ch] hash > > table with a commit_slab (you can't do it in the general case, because > > decorate.[ch] handles arbitrary objects, but the name_decoration code > > only does commits). It would in theory be faster, though I don't know if > > the time we spend on the hash table is actually measurable (we make a > > lot of queries on it, but it doesn't actually get that big in the first > > place). > > Hmmm, but I do not know if commit_slab is a good fit for the usage > pattern. I expect commit objects to be fairly sparsely decorated > (e.g. a tag or ref pointing at say 1-2% of commits or fewer). > Wouldn't the hashtable used by decorate.[ch] with the max load > factor capped to 66% a better economy? Good point. A slab would be less memory efficient, but cheaper to look up (it is a direct memory reference, with no probing and no hashcmp()). But cache effects matter, so it could even be slower. On the other hand, the slab makes it easy to actually store the real type (struct name_decoration), whereas the decorate hash stores only pointers. So we'd save an extra malloc/pointer in each case. So with your slab_peek() below, I'd guess that the slab would actually be faster. But I'd also be unsurprised if it makes no appreciable difference to the overall runtime of "git log --decorate". I think we'd have to build it and profile (and please feel free to say "eh, not worth the time to think about further"). > I notice that there is no API into commit_slab to ask "Does this > commit have data in the slab?" *slabname##_at() may be the closest > thing, but that would allocate the space and then says "This is the > slot for that commit; go check if there is data there already." Yes. I think it's not a big deal if your slab is reasonably full (you'll extend it to the full size of your commit space eventually either way). But for a sparse slab, it does make that query much more expensive than it needs to be. > Perhaps we want a new function *slabname##_peek() with the same > signature as *slabname##_at() that returns NULL when commit->index > is larger than the last existing element in the slab? Then the > above would become more like: > > const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) > { > struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); > if (!v) > return NULL; > if (sizep) > *sizep = v->size; > return v->buffer; > } Yeah, I agree that is a good interface. It is basically a read-only version of slab_at(). And it should be similarly agnostic as to the type we are storing, because the pointer is into the slab array. I'm not sure that the memory overhead is that big a deal (even in the kernel, we are only talking about a few megabytes). But it is wasteful, and the interface above should be trivial to write, so it might be worth doing anyway. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 17:49 ` Jeff King @ 2015-05-14 18:01 ` Junio C Hamano 2015-05-14 18:10 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-14 18:01 UTC (permalink / raw) To: Jeff King; +Cc: Michael Haggerty, git discussion list Jeff King <peff@peff.net> writes: > On Thu, May 14, 2015 at 10:37:39AM -0700, Junio C Hamano wrote: > ... >> Wouldn't the hashtable used by decorate.[ch] with the max load >> factor capped to 66% a better economy? > > Good point. A slab would be less memory efficient, but cheaper to look > up (it is a direct memory reference, with no probing and no hashcmp()). > But cache effects matter, so it could even be slower. Yes, that was what I meant by economy. I do not think memory footprint is free in that sense. > On the other hand, the slab makes it easy to actually store the real > type (struct name_decoration), whereas the decorate hash stores only > pointers. So we'd save an extra malloc/pointer in each case. > > So with your slab_peek() below, I'd guess that the slab would actually > be faster. But I'd also be unsurprised if it makes no appreciable > difference to the overall runtime of "git log --decorate". I think we'd > have to build it and profile (and please feel free to say "eh, not worth > the time to think about further"). While I think *slabname##_peek() would be worth doing regardless of this caller, I suspect that the major overhead of decorate code would come from the for_each_ref() that jumps deep into the history to parse old commits; it would trigger a lot of unpacking of objects deep in the delta chain, which would be expensive than table look-up in either scheme. >> I notice that there is no API into commit_slab to ask "Does this >> commit have data in the slab?" *slabname##_at() may be the closest >> thing, but that would allocate the space and then says "This is the >> slot for that commit; go check if there is data there already." > > Yes. I think it's not a big deal if your slab is reasonably full (you'll > extend it to the full size of your commit space eventually either way). > But for a sparse slab, it does make that query much more expensive than > it needs to be. Yes, and I think that commit decoration is such a use case. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 18:01 ` Junio C Hamano @ 2015-05-14 18:10 ` Jeff King 0 siblings, 0 replies; 17+ messages in thread From: Jeff King @ 2015-05-14 18:10 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list On Thu, May 14, 2015 at 11:01:03AM -0700, Junio C Hamano wrote: > > So with your slab_peek() below, I'd guess that the slab would actually > > be faster. But I'd also be unsurprised if it makes no appreciable > > difference to the overall runtime of "git log --decorate". I think we'd > > have to build it and profile (and please feel free to say "eh, not worth > > the time to think about further"). > > While I think *slabname##_peek() would be worth doing regardless of > this caller, I suspect that the major overhead of decorate code > would come from the for_each_ref() that jumps deep into the history > to parse old commits; it would trigger a lot of unpacking of objects > deep in the delta chain, which would be expensive than table look-up > in either scheme. That would depend on the particular repository and traversal. The expensive "load an old commit" work is done once per ref in the repo. The lookup work is done once per commit traversed. So even if the latter is much less work, we are typically doing it many more times, and there is probably a balance point. But I suspect all of it compares to the actual work of a "git log" which has to read all of the commits we are looking at anyway. IOW, we are probably talking about optimizing 1%, while the other 99% is spent on inflate(), etc. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 17:37 ` Junio C Hamano 2015-05-14 17:49 ` Jeff King @ 2015-05-14 21:49 ` Junio C Hamano 2015-05-14 21:54 ` Jeff King 1 sibling, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-14 21:49 UTC (permalink / raw) To: Jeff King; +Cc: Michael Haggerty, git discussion list Junio C Hamano <gitster@pobox.com> writes: > For example, we have this in commit.c: > > const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) > { > struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); > if (sizep) > *sizep = v->size; > return v->buffer; > } > > But if we do not have the "buffer" data cached for that commit (via > an earlier call to set_commit_buffer()), we don't have to enlarge > the slab, as we are not adding anything to the slab system with this > call. The change may look something like this. I do not think it would make a difference to the get_cached_commit_buffer() codepath (when we use the commit->buffer, we pretty much know we use that for all commits), though. commit-slab.h | 27 ++++++++++++++++++++++++--- commit.c | 7 ++++++- 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/commit-slab.h b/commit-slab.h index 375c9c7..3334ab2 100644 --- a/commit-slab.h +++ b/commit-slab.h @@ -15,7 +15,13 @@ * - int *indegree_at(struct indegree *, struct commit *); * * This function locates the data associated with the given commit in - * the indegree slab, and returns the pointer to it. + * the indegree slab, and returns the pointer to it. The location to + * store the data is allocated as necessary. + * + * - int *indegree_peek(struct indegree *, struct commit *); + * + * This function is similar to indegree_at(), but it will return NULL + * until a call to indegree_at() was made for the commit. * * - void init_indegree(struct indegree *); * void init_indegree_with_stride(struct indegree *, int); @@ -80,8 +86,9 @@ static MAYBE_UNUSED void clear_ ##slabname(struct slabname *s) \ s->slab = NULL; \ } \ \ -static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ - const struct commit *c) \ +static MAYBE_UNUSED elemtype *slabname## _at_peek(struct slabname *s, \ + const struct commit *c, \ + int add_if_missing) \ { \ int nth_slab, nth_slot; \ \ @@ -90,6 +97,8 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ \ if (s->slab_count <= nth_slab) { \ int i; \ + if (!add_if_missing) \ + return NULL; \ s->slab = xrealloc(s->slab, \ (nth_slab + 1) * sizeof(*s->slab)); \ stat_ ##slabname## realloc++; \ @@ -103,6 +112,18 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ return &s->slab[nth_slab][nth_slot * s->stride]; \ } \ \ +static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 1); \ +} \ + \ +static MAYBE_UNUSED elemtype *slabname## _peek(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 0); \ +} \ + \ static int stat_ ##slabname## realloc /* diff --git a/commit.c b/commit.c index 65179f9..2d1901f 100644 --- a/commit.c +++ b/commit.c @@ -244,7 +244,12 @@ void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (!v) { + if (sizep) + *sizep = 0; + return NULL; + } if (sizep) *sizep = v->size; return v->buffer; ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 21:49 ` Junio C Hamano @ 2015-05-14 21:54 ` Jeff King 2015-05-14 22:25 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2015-05-14 21:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list On Thu, May 14, 2015 at 02:49:10PM -0700, Junio C Hamano wrote: > > But if we do not have the "buffer" data cached for that commit (via > > an earlier call to set_commit_buffer()), we don't have to enlarge > > the slab, as we are not adding anything to the slab system with this > > call. > > The change may look something like this. I do not think it would > make a difference to the get_cached_commit_buffer() codepath (when > we use the commit->buffer, we pretty much know we use that for all > commits), though. I'm not sure that's true. Most of the _users_ of the commit buffer will try to look in the cache, and if it's not there, do a one-off read. But they don't attach the result to the commit; they throw it away. The reasoning is that we don't have a cached buffer because we are going to look at a lot of commits (i.e., save_commit_buffer is off). So basically anytime save_commit_buffer is off (e.g., in rev-list) we are expanding the slab unnecessarily, even though literally nobody will write to it. > diff --git a/commit.c b/commit.c > index 65179f9..2d1901f 100644 > --- a/commit.c > +++ b/commit.c > @@ -244,7 +244,12 @@ void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) > > const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) > { > - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); > + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); > + if (!v) { > + if (sizep) > + *sizep = 0; > + return NULL; > + } > if (sizep) > *sizep = v->size; > return v->buffer; I think you need matching changes in unused_commit_buffer and free_commit_buffer. And detach_commit_buffer, too, I guess. Basically everywhere except set_commit_buffer would want to use the peek version. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 21:54 ` Jeff King @ 2015-05-14 22:25 ` Junio C Hamano 2015-05-14 22:33 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-14 22:25 UTC (permalink / raw) To: Jeff King Cc: Michael Haggerty, git, Nguyễn Thái Ngọc Duy, Thomas Rast Jeff King <peff@peff.net> writes: >> The change may look something like this. I do not think it would >> make a difference to the get_cached_commit_buffer() codepath (when >> we use the commit->buffer, we pretty much know we use that for all >> commits), though. > > I'm not sure that's true. Most of the _users_ of the commit buffer will > try to look in the cache, and if it's not there, do a one-off read. But > they don't attach the result to the commit; they throw it away. The > reasoning is that we don't have a cached buffer because we are going to > look at a lot of commits (i.e., save_commit_buffer is off). > > So basically anytime save_commit_buffer is off (e.g., in rev-list) we > are expanding the slab unnecessarily, even though literally nobody will > write to it. > ... > I think you need matching changes in unused_commit_buffer and > free_commit_buffer. And detach_commit_buffer, too, I guess. Basically > everywhere except set_commit_buffer would want to use the peek version. Yeah, I didn't consider the "we are not using buffer but are still calling into the slab code" case at all. The other two slabs (indegree and author) in commit.c are used only when needed, and when they are used they are fully populated anyway, so I think this patch covers that file fully. There are saved_parents slab used in revision.c and ref_bitmap slab used in shallow.c; they may also need similar fixups to save memory. commit-slab.h | 27 ++++++++++++++++++++++++--- commit.c | 28 ++++++++++++++++++++-------- 2 files changed, 44 insertions(+), 11 deletions(-) diff --git a/commit-slab.h b/commit-slab.h index 375c9c7..3334ab2 100644 --- a/commit-slab.h +++ b/commit-slab.h @@ -15,7 +15,13 @@ * - int *indegree_at(struct indegree *, struct commit *); * * This function locates the data associated with the given commit in - * the indegree slab, and returns the pointer to it. + * the indegree slab, and returns the pointer to it. The location to + * store the data is allocated as necessary. + * + * - int *indegree_peek(struct indegree *, struct commit *); + * + * This function is similar to indegree_at(), but it will return NULL + * until a call to indegree_at() was made for the commit. * * - void init_indegree(struct indegree *); * void init_indegree_with_stride(struct indegree *, int); @@ -80,8 +86,9 @@ static MAYBE_UNUSED void clear_ ##slabname(struct slabname *s) \ s->slab = NULL; \ } \ \ -static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ - const struct commit *c) \ +static MAYBE_UNUSED elemtype *slabname## _at_peek(struct slabname *s, \ + const struct commit *c, \ + int add_if_missing) \ { \ int nth_slab, nth_slot; \ \ @@ -90,6 +97,8 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ \ if (s->slab_count <= nth_slab) { \ int i; \ + if (!add_if_missing) \ + return NULL; \ s->slab = xrealloc(s->slab, \ (nth_slab + 1) * sizeof(*s->slab)); \ stat_ ##slabname## realloc++; \ @@ -103,6 +112,18 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ return &s->slab[nth_slab][nth_slot * s->stride]; \ } \ \ +static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 1); \ +} \ + \ +static MAYBE_UNUSED elemtype *slabname## _peek(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 0); \ +} \ + \ static int stat_ ##slabname## realloc /* diff --git a/commit.c b/commit.c index 65179f9..4f2ce9f 100644 --- a/commit.c +++ b/commit.c @@ -244,7 +244,12 @@ void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (!v) { + if (sizep) + *sizep = 0; + return NULL; + } if (sizep) *sizep = v->size; return v->buffer; @@ -271,24 +276,31 @@ const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) void unuse_commit_buffer(const struct commit *commit, const void *buffer) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); - if (v->buffer != buffer) + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (v && v->buffer != buffer) free((void *)buffer); } void free_commit_buffer(struct commit *commit) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); - free(v->buffer); - v->buffer = NULL; - v->size = 0; + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (v) { + free(v->buffer); + v->buffer = NULL; + v->size = 0; + } } const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); void *ret; + if (!v) { + if (sizep) + *sizep = 0; + return NULL; + } ret = v->buffer; if (sizep) *sizep = v->size; ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 22:25 ` Junio C Hamano @ 2015-05-14 22:33 ` Jeff King 2015-05-22 21:21 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2015-05-14 22:33 UTC (permalink / raw) To: Junio C Hamano Cc: Michael Haggerty, git, Nguyễn Thái Ngọc Duy, Thomas Rast On Thu, May 14, 2015 at 03:25:33PM -0700, Junio C Hamano wrote: > @@ -90,6 +97,8 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ > \ > if (s->slab_count <= nth_slab) { \ > int i; \ > + if (!add_if_missing) \ > + return NULL; \ > s->slab = xrealloc(s->slab, \ > (nth_slab + 1) * sizeof(*s->slab)); \ > stat_ ##slabname## realloc++; \ This skips extending the list of slabs if we would go past the nth slab. But we don't fill in each slab in the first place. I.e., we may have 10 slabs, but only s->slab[10] is non-NULL. A few lines below this, we xcalloc() it if necessary. I think that needs to respect add_if_missing, as well. > void unuse_commit_buffer(const struct commit *commit, const void *buffer) > { > - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); > - if (v->buffer != buffer) > + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); > + if (v && v->buffer != buffer) > free((void *)buffer); > } I think you want: if (!v || v->buffer != buffer) here. IOW, we free it only if it is not our cached buffer, and it cannot be if we do not have a cached buffer. It may be easier to read by flipping the logic: if (v && v->buffer == buffer) return; /* it is saved in the cache */ free((void *)buffer); Or some variation on that. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-14 22:33 ` Jeff King @ 2015-05-22 21:21 ` Junio C Hamano 2015-05-22 21:38 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2015-05-22 21:21 UTC (permalink / raw) To: Jeff King Cc: Michael Haggerty, git, Nguyễn Thái Ngọc Duy, Thomas Rast Jeff King <peff@peff.net> writes: > On Thu, May 14, 2015 at 03:25:33PM -0700, Junio C Hamano wrote: > >> @@ -90,6 +97,8 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ >> \ >> if (s->slab_count <= nth_slab) { \ >> int i; \ >> + if (!add_if_missing) \ >> + return NULL; \ >> s->slab = xrealloc(s->slab, \ >> (nth_slab + 1) * sizeof(*s->slab)); \ >> stat_ ##slabname## realloc++; \ > > This skips extending the list of slabs if we would go past the nth slab. > But we don't fill in each slab in the first place. I.e., we may have 10 > slabs, but only s->slab[10] is non-NULL. > > A few lines below this, we xcalloc() it if necessary. I think that needs > to respect add_if_missing, as well. Yup, thanks. > >> void unuse_commit_buffer(const struct commit *commit, const void *buffer) >> { >> - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); >> - if (v->buffer != buffer) >> + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); >> + if (v && v->buffer != buffer) >> free((void *)buffer); >> } > > I think you want: > > if (!v || v->buffer != buffer) > > here. IOW, we free it only if it is not our cached buffer, and it cannot > be if we do not have a cached buffer. It may be easier to read by > flipping the logic: > > if (v && v->buffer == buffer) > return; /* it is saved in the cache */ > free((void *)buffer); > > Or some variation on that. I ended up doing it as a variant of the latter, "free unless we have v->buffer pointing at it". Sorry for a long delay. -- >8 -- Subject: [PATCH] commit-slab: introduce slabname##_peek() function There is no API to ask "Does this commit have associated data in slab?". If an application wants to (1) parse just a few commits at the beginning of a process, (2) store data for only these commits, and then (3) start processing many commits, taking into account the data stored (for a few of them) in the slab, the application would use slabname##_at() to allocate a space to store data in (2), but there is no API other than slabname##_at() to use in step (3). This allocates and wasts new space for these commits the caller is only interested in checking if they have data stored in step (2). Introduce slabname##_peek(), which is similar to slabname##_at() but returns NULL when there is no data already associated to it in such a use case. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- commit-slab.h | 34 +++++++++++++++++++++++++++++----- commit.c | 28 ++++++++++++++++++++-------- 2 files changed, 49 insertions(+), 13 deletions(-) diff --git a/commit-slab.h b/commit-slab.h index 375c9c7..9d12ce2 100644 --- a/commit-slab.h +++ b/commit-slab.h @@ -15,7 +15,13 @@ * - int *indegree_at(struct indegree *, struct commit *); * * This function locates the data associated with the given commit in - * the indegree slab, and returns the pointer to it. + * the indegree slab, and returns the pointer to it. The location to + * store the data is allocated as necessary. + * + * - int *indegree_peek(struct indegree *, struct commit *); + * + * This function is similar to indegree_at(), but it will return NULL + * until a call to indegree_at() was made for the commit. * * - void init_indegree(struct indegree *); * void init_indegree_with_stride(struct indegree *, int); @@ -80,8 +86,9 @@ static MAYBE_UNUSED void clear_ ##slabname(struct slabname *s) \ s->slab = NULL; \ } \ \ -static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ - const struct commit *c) \ +static MAYBE_UNUSED elemtype *slabname## _at_peek(struct slabname *s, \ + const struct commit *c, \ + int add_if_missing) \ { \ int nth_slab, nth_slot; \ \ @@ -90,6 +97,8 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ \ if (s->slab_count <= nth_slab) { \ int i; \ + if (!add_if_missing) \ + return NULL; \ s->slab = xrealloc(s->slab, \ (nth_slab + 1) * sizeof(*s->slab)); \ stat_ ##slabname## realloc++; \ @@ -97,10 +106,25 @@ static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ s->slab[i] = NULL; \ s->slab_count = nth_slab + 1; \ } \ - if (!s->slab[nth_slab]) \ + if (!s->slab[nth_slab]) { \ + if (!add_if_missing) \ + return NULL; \ s->slab[nth_slab] = xcalloc(s->slab_size, \ sizeof(**s->slab) * s->stride); \ - return &s->slab[nth_slab][nth_slot * s->stride]; \ + } \ + return &s->slab[nth_slab][nth_slot * s->stride]; \ +} \ + \ +static MAYBE_UNUSED elemtype *slabname## _at(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 1); \ +} \ + \ +static MAYBE_UNUSED elemtype *slabname## _peek(struct slabname *s, \ + const struct commit *c) \ +{ \ + return slabname##_at_peek(s, c, 0); \ } \ \ static int stat_ ##slabname## realloc diff --git a/commit.c b/commit.c index 65179f9..5fb9496 100644 --- a/commit.c +++ b/commit.c @@ -244,7 +244,12 @@ void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (!v) { + if (sizep) + *sizep = 0; + return NULL; + } if (sizep) *sizep = v->size; return v->buffer; @@ -271,24 +276,31 @@ const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) void unuse_commit_buffer(const struct commit *commit, const void *buffer) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); - if (v->buffer != buffer) + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (!(v && v->buffer == buffer)) free((void *)buffer); } void free_commit_buffer(struct commit *commit) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); - free(v->buffer); - v->buffer = NULL; - v->size = 0; + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); + if (v) { + free(v->buffer); + v->buffer = NULL; + v->size = 0; + } } const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) { - struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit); + struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit); void *ret; + if (!v) { + if (sizep) + *sizep = 0; + return NULL; + } ret = v->buffer; if (sizep) *sizep = v->size; -- 2.4.1-449-g1f6c7df ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] log: do not shorten decoration names too early 2015-05-22 21:21 ` Junio C Hamano @ 2015-05-22 21:38 ` Jeff King 0 siblings, 0 replies; 17+ messages in thread From: Jeff King @ 2015-05-22 21:38 UTC (permalink / raw) To: Junio C Hamano Cc: Michael Haggerty, git, Nguyễn Thái Ngọc Duy, Thomas Rast On Fri, May 22, 2015 at 02:21:16PM -0700, Junio C Hamano wrote: > I ended up doing it as a variant of the latter, "free unless we have > v->buffer pointing at it". Thanks, this version looks good to me minus one micro-nit below. > Sorry for a long delay. No problem. I'm sometimes amazed you find time to write any patches at all. :) > -- >8 -- > Subject: [PATCH] commit-slab: introduce slabname##_peek() function > > There is no API to ask "Does this commit have associated data in > slab?". If an application wants to (1) parse just a few commits at > the beginning of a process, (2) store data for only these commits, > and then (3) start processing many commits, taking into account the > data stored (for a few of them) in the slab, the application would > use slabname##_at() to allocate a space to store data in (2), but > there is no API other than slabname##_at() to use in step (3). This > allocates and wasts new space for these commits the caller is only > interested in checking if they have data stored in step (2). s/wasts/wastes/ -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2015-05-22 21:39 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-05-13 13:11 "HEAD -> branch" decoration doesn't work with "--decorate=full" Michael Haggerty 2015-05-13 14:51 ` Junio C Hamano 2015-05-13 15:26 ` Michael J Gruber 2015-05-13 17:11 ` Junio C Hamano 2015-05-13 17:13 ` Junio C Hamano 2015-05-13 19:40 ` [PATCH 2/2] log: do not shorten decoration names too early Junio C Hamano 2015-05-14 6:33 ` Jeff King 2015-05-14 17:37 ` Junio C Hamano 2015-05-14 17:49 ` Jeff King 2015-05-14 18:01 ` Junio C Hamano 2015-05-14 18:10 ` Jeff King 2015-05-14 21:49 ` Junio C Hamano 2015-05-14 21:54 ` Jeff King 2015-05-14 22:25 ` Junio C Hamano 2015-05-14 22:33 ` Jeff King 2015-05-22 21:21 ` Junio C Hamano 2015-05-22 21:38 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).