* [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization
@ 2021-08-17 7:14 ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu
Last time I submitted a very long patch series:
https://lore.kernel.org/git/pull.1016.git.1628842990.gitgitgadget@gmail.com/
My mentor Christian suggested to split the performance optimization part
out, so this patch series used to optimize performance optimization in
ref-filter.
Changes in this patch series:
1. Skip parse_object_buffer(), which can reduce object content parsing.
2. Use linked list to record parsing results, which can reduce second
format string parsing.
3. Reuse final buffer, which can reduce memcpy();
4. Add a need_get_object_info flag instead of compare two object_info,
which can reduce memcmp();
5. Use ALLOC_ARRAY() instead of CALLOC_ARRAY(), which can reduce memset();
Since the relevant part of git cat-file --batch has been deleted, the
execution time of git for-each-ref is very short, I haven’t added
performance tests yet for git for-each-ref.
ZheNing Hu (5):
[GSOC] ref-filter: skip parse_object_buffer in some cases
[GSOC] ref-filter: remove second parsing in format_ref_array_item
[GSOC] ref-filter: reuse final buffer
[GSOC] ref-filter: reduce unnecessary object_info comparisons
[GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY
builtin/branch.c | 2 +
builtin/for-each-ref.c | 3 +-
builtin/tag.c | 2 +
builtin/verify-tag.c | 2 +
ref-filter.c | 168 ++++++++++++++++++++++++++++++++---------
ref-filter.h | 17 ++++-
6 files changed, 154 insertions(+), 40 deletions(-)
base-commit: f000ecbed922c727382651490e75014f003c89ca
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1020%2Fadlternative%2Fref-filter-opt-perf-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1020/adlternative/ref-filter-opt-perf-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1020
--
gitgitgadget
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
When we are using some atoms such as %(raw), %(objectname),
%(objecttype), we don't need to parse the content of the object,
so we can skip parse_object_buffer() for performance optimization.
It is worth noting that in these cases, we still need to call
parse_object_buffer() for parsing:
1. The atom type is one of %(tag), %(type), %(object), %(tree),
%(numparent) or %(parent).
2. The type of the object is tag and the atom need to be
dereferenced e.g. %(*objecttype).
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 49 +++++++++++++++++++++++++++++++++++++------------
ref-filter.h | 5 +++--
2 files changed, 40 insertions(+), 14 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 93ce2a6ef2e..65ba00633dc 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1009,6 +1009,20 @@ static int reject_atom(enum atom_type atom_type)
return atom_type == ATOM_REST;
}
+static int need_parse_buffer(enum atom_type atom_type) {
+ switch (atom_type) {
+ case ATOM_TAG:
+ case ATOM_TYPE:
+ case ATOM_OBJECT:
+ case ATOM_TREE:
+ case ATOM_NUMPARENT:
+ case ATOM_PARENT:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
/*
* Make sure the format string is well formed, and parse out
* the used atoms.
@@ -1029,6 +1043,8 @@ int verify_ref_format(struct ref_format *format)
at = parse_ref_filter_atom(format, sp + 2, ep, &err);
if (at < 0)
die("%s", err.buf);
+ if (need_parse_buffer(used_atom[at].atom_type))
+ format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
@@ -1524,14 +1540,16 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
{
void *buf = data->content;
- switch (obj->type) {
+ switch (data->type) {
case OBJ_TAG:
- grab_tag_values(val, deref, obj);
+ if (obj)
+ grab_tag_values(val, deref, obj);
grab_sub_body_contents(val, deref, data);
grab_person("tagger", val, deref, buf);
break;
case OBJ_COMMIT:
- grab_commit_values(val, deref, obj);
+ if (obj)
+ grab_commit_values(val, deref, obj);
grab_sub_body_contents(val, deref, data);
grab_person("author", val, deref, buf);
grab_person("committer", val, deref, buf);
@@ -1757,14 +1775,21 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
BUG("Object size is less than zero.");
if (oi->info.contentp) {
- *obj = parse_object_buffer(the_repository, &oi->oid, oi->type, oi->size, oi->content, &eaten);
- if (!*obj) {
- if (!eaten)
- free(oi->content);
- return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
- oid_to_hex(&oi->oid), ref->refname);
+ if (ref->can_skip_parse_buffer &&
+ ((!deref &&
+ (!need_tagged || oi->type != OBJ_TAG)) ||
+ deref)) {
+ grab_values(ref->value, deref, NULL, oi);
+ } else {
+ *obj = parse_object_buffer(the_repository, &oi->oid, oi->type, oi->size, oi->content, &eaten);
+ if (!*obj) {
+ if (!eaten)
+ free(oi->content);
+ return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
+ oid_to_hex(&oi->oid), ref->refname);
+ }
+ grab_values(ref->value, deref, *obj, oi);
}
- grab_values(ref->value, deref, *obj, oi);
}
grab_common_values(ref->value, deref, oi);
@@ -1988,7 +2013,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
* If there is no atom that wants to know about tagged
* object, we are done.
*/
- if (!need_tagged || (obj->type != OBJ_TAG))
+ if (!need_tagged || (oi.type != OBJ_TAG))
return 0;
/*
@@ -2595,7 +2620,7 @@ int format_ref_array_item(struct ref_array_item *info,
state.quote_style = format->quote_style;
push_stack_element(&state.stack);
-
+ info->can_skip_parse_buffer = format->can_skip_parse_buffer;
for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
struct atom_value *atomv;
int pos;
diff --git a/ref-filter.h b/ref-filter.h
index c15dee8d6b9..5bceae1dac9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -40,6 +40,7 @@ struct ref_array_item {
struct object_id objectname;
const char *rest;
int flag;
+ int can_skip_parse_buffer;
unsigned int kind;
const char *symref;
struct commit *commit;
@@ -81,12 +82,12 @@ struct ref_format {
int quote_style;
int use_rest;
int use_color;
-
+ int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
};
-#define REF_FORMAT_INIT { .use_color = -1 }
+#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
/* Macros for checking --merged and --no-merged options */
#define _OPT_MERGED_NO_MERGED(option, filter, h) \
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
We parsed format string in verify_ref_format() and stored the parsed
atom in used_atom array. But in format_ref_array_item() we have
another round of parsing format string. This affects performance.
Introducing the struct parsed_atom_list which can save the current
atom's start and end address in format string and its index in
used_atom. All parsed_atom_list entry are linked together in the
form of linked list, and the head node of the linked list is stored
in struct ref_format. Create clear_parsed_atom_list() which can used
to clear the nodes on the linked list.
This can bring performance improvement.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
builtin/branch.c | 2 ++
builtin/for-each-ref.c | 3 ++-
builtin/tag.c | 2 ++
builtin/verify-tag.c | 2 ++
ref-filter.c | 45 +++++++++++++++++++++++++++++++++---------
ref-filter.h | 11 +++++++++++
6 files changed, 55 insertions(+), 10 deletions(-)
diff --git a/builtin/branch.c b/builtin/branch.c
index b23b1d1752a..e361f8cc661 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -459,6 +459,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
strbuf_release(&err);
strbuf_release(&out);
ref_array_clear(&array);
+ clear_parsed_atom_list(&format->parsed_atom_head);
free(to_free);
}
@@ -678,6 +679,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
memset(&filter, 0, sizeof(filter));
filter.kind = FILTER_REFS_BRANCHES;
filter.abbrev = -1;
+ INIT_LIST_HEAD(&format.parsed_atom_head);
if (argc == 2 && !strcmp(argv[1], "-h"))
usage_with_options(builtin_branch_usage, options);
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 89cb6307d46..6e22d80d5b5 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -53,8 +53,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
memset(&array, 0, sizeof(array));
memset(&filter, 0, sizeof(filter));
-
format.format = "%(objectname) %(objecttype)\t%(refname)";
+ INIT_LIST_HEAD(&format.parsed_atom_head);
git_config(git_default_config, NULL);
@@ -96,6 +96,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
ref_array_clear(&array);
free_commit_list(filter.with_commit);
free_commit_list(filter.no_commit);
+ clear_parsed_atom_list(&format.parsed_atom_head);
UNLEAK(sorting);
return 0;
}
diff --git a/builtin/tag.c b/builtin/tag.c
index 452558ec957..549339cbbe4 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -78,6 +78,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
strbuf_release(&output);
ref_array_clear(&array);
free(to_free);
+ clear_parsed_atom_list(&format->parsed_atom_head);
return 0;
}
@@ -493,6 +494,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
memset(&filter, 0, sizeof(filter));
filter.lines = -1;
opt.sign = -1;
+ INIT_LIST_HEAD(&format.parsed_atom_head);
argc = parse_options(argc, argv, prefix, options, git_tag_usage, 0);
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index f45136a06ba..8b0a2b2587c 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -39,6 +39,7 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
OPT_END()
};
+ INIT_LIST_HEAD(&format.parsed_atom_head);
git_config(git_verify_tag_config, NULL);
argc = parse_options(argc, argv, prefix, verify_tag_options,
@@ -73,5 +74,6 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
if (format.format)
pretty_print_ref(name, &oid, &format);
}
+ clear_parsed_atom_list(&format.parsed_atom_head);
return had_error;
}
diff --git a/ref-filter.c b/ref-filter.c
index 65ba00633dc..76a31fb79b1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1035,6 +1035,7 @@ int verify_ref_format(struct ref_format *format)
for (cp = format->format; *cp && (sp = find_next(cp)); ) {
struct strbuf err = STRBUF_INIT;
const char *color, *ep = strchr(sp, ')');
+ struct parsed_atom_list *e;
int at;
if (!ep)
@@ -1043,6 +1044,12 @@ int verify_ref_format(struct ref_format *format)
at = parse_ref_filter_atom(format, sp + 2, ep, &err);
if (at < 0)
die("%s", err.buf);
+ e = xmalloc(sizeof(*e));
+ e->beg = sp + 2;
+ e->end = ep;
+ e->at = at;
+ list_add_tail(&e->list, &format->parsed_atom_head);
+
if (need_parse_buffer(used_atom[at].atom_type))
format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
@@ -2615,25 +2622,31 @@ int format_ref_array_item(struct ref_array_item *info,
struct strbuf *final_buf,
struct strbuf *error_buf)
{
- const char *cp, *sp, *ep;
+ const char *cp, *sp;
+ struct list_head *item;
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
state.quote_style = format->quote_style;
push_stack_element(&state.stack);
info->can_skip_parse_buffer = format->can_skip_parse_buffer;
- for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
+
+ cp = format->format;
+
+ list_for_each(item, &format->parsed_atom_head) {
struct atom_value *atomv;
- int pos;
+ struct parsed_atom_list *e =
+ list_entry(item, struct parsed_atom_list, list);
- ep = strchr(sp, ')');
- if (cp < sp)
- append_literal(cp, sp, &state);
- pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
- if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
+ if (cp < e->beg - 2)
+ append_literal(cp, e->beg - 2, &state);
+ if (get_ref_atom_value(info, e->at, &atomv, error_buf) ||
atomv->handler(atomv, &state, error_buf)) {
pop_stack_element(&state.stack);
return -1;
}
+ cp = e->end + 1;
+ if (!*cp)
+ break;
}
if (*cp) {
sp = cp + strlen(cp);
@@ -2681,10 +2694,12 @@ static int parse_sorting_atom(const char *atom)
* This parses an atom using a dummy ref_format, since we don't
* actually care about the formatting details.
*/
+ int res;
struct ref_format dummy = REF_FORMAT_INIT;
const char *end = atom + strlen(atom);
struct strbuf err = STRBUF_INIT;
- int res = parse_ref_filter_atom(&dummy, atom, end, &err);
+
+ res = parse_ref_filter_atom(&dummy, atom, end, &err);
if (res < 0)
die("%s", err.buf);
strbuf_release(&err);
@@ -2757,3 +2772,15 @@ int parse_opt_merge_filter(const struct option *opt, const char *arg, int unset)
return 0;
}
+
+void clear_parsed_atom_list(struct list_head *parsed_atom_head)
+{
+ struct list_head *pos, *tmp;
+ struct parsed_atom_list *item;
+
+ list_for_each_safe(pos, tmp, parsed_atom_head) {
+ item = list_entry(pos, struct parsed_atom_list, list);
+ list_del(pos);
+ free(item);
+ }
+}
diff --git a/ref-filter.h b/ref-filter.h
index 5bceae1dac9..df54836a643 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -72,6 +72,13 @@ struct ref_filter {
verbose;
};
+struct parsed_atom_list {
+ const char *beg;
+ const char *end;
+ int at;
+ struct list_head list;
+};
+
struct ref_format {
/*
* Set these to define the format; make sure you call
@@ -85,6 +92,7 @@ struct ref_format {
int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
+ struct list_head parsed_atom_head;
};
#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
@@ -112,6 +120,9 @@ struct ref_format {
int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int type);
/* Clear all memory allocated to ref_array */
void ref_array_clear(struct ref_array *array);
+/* Clear the parsed_atom_list in ref_format*/
+void clear_parsed_atom_list(struct list_head *parsed_atom_head);
+
/* Used to verify if the given format is correct and to parse out the used atoms */
int verify_ref_format(struct ref_format *format);
/* Sort the given ref_array as per the ref_sorting provided */
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/5] [GSOC] ref-filter: reuse final buffer
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
In format_ref_array_item(), we add the object data
to ref_formatting_state, and copy the data from
ref_formatting_state to final_buf at the end. There
are huge copies of data.
Because final_buf will be cleared before every time
we call format_ref_array_item(). So we actually add
content to an empty strbuf. We can add the object's
data directly to this final_buffer instead of adding
objects' data to state.stack->output first, then
copy to final_buf.
Add a can_reuse_final_buffer flag to struct ref_format
and create can_reuse_final_buffer() to check if we are
use %(align), %(end), %(if), %(then), %(else). If not,
we can reuse the buf of finnal_buf.
Reuse the buffer address of final_buf in
format_ref_array_item(), we directly add the data to the
final buffer, return the content to final_buf at the end.
This will bring performance improvements.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 39 ++++++++++++++++++++++++++++++++++-----
ref-filter.h | 3 ++-
2 files changed, 36 insertions(+), 6 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 76a31fb79b1..7106d4c1c4c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1023,6 +1023,19 @@ static int need_parse_buffer(enum atom_type atom_type) {
}
}
+static int can_reuse_final_buffer(enum atom_type atom_type) {
+ switch (atom_type) {
+ case ATOM_ALIGN:
+ case ATOM_END:
+ case ATOM_IF:
+ case ATOM_THEN:
+ case ATOM_ELSE:
+ return 0;
+ default:
+ return 1;
+ }
+}
+
/*
* Make sure the format string is well formed, and parse out
* the used atoms.
@@ -1054,6 +1067,7 @@ int verify_ref_format(struct ref_format *format)
format->can_skip_parse_buffer = 0;
if (reject_atom(used_atom[at].atom_type))
die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
+ format->can_reuse_final_buffer = can_reuse_final_buffer(used_atom[at].atom_type);
if ((format->quote_style == QUOTE_PYTHON ||
format->quote_style == QUOTE_SHELL ||
@@ -2627,7 +2641,14 @@ int format_ref_array_item(struct ref_array_item *info,
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
state.quote_style = format->quote_style;
- push_stack_element(&state.stack);
+ if (format->can_reuse_final_buffer) {
+ struct ref_formatting_stack *s = xmalloc(sizeof(struct ref_formatting_stack));
+ s->output = *final_buf;
+ s->prev = state.stack;
+ state.stack = s;
+ } else {
+ push_stack_element(&state.stack);
+ }
info->can_skip_parse_buffer = format->can_skip_parse_buffer;
cp = format->format;
@@ -2641,7 +2662,8 @@ int format_ref_array_item(struct ref_array_item *info,
append_literal(cp, e->beg - 2, &state);
if (get_ref_atom_value(info, e->at, &atomv, error_buf) ||
atomv->handler(atomv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ if (!format->can_reuse_final_buffer)
+ pop_stack_element(&state.stack);
return -1;
}
cp = e->end + 1;
@@ -2656,16 +2678,23 @@ int format_ref_array_item(struct ref_array_item *info,
struct atom_value resetv = ATOM_VALUE_INIT;
resetv.s = GIT_COLOR_RESET;
if (append_atom(&resetv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ if (!format->can_reuse_final_buffer)
+ pop_stack_element(&state.stack);
return -1;
}
}
if (state.stack->prev) {
+ assert(!format->can_reuse_final_buffer);
pop_stack_element(&state.stack);
return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
}
- strbuf_addbuf(final_buf, &state.stack->output);
- pop_stack_element(&state.stack);
+ if(format->can_reuse_final_buffer) {
+ *final_buf = state.stack->output;
+ free(state.stack);
+ } else {
+ strbuf_addbuf(final_buf, &state.stack->output);
+ pop_stack_element(&state.stack);
+ }
return 0;
}
diff --git a/ref-filter.h b/ref-filter.h
index df54836a643..a62a14a2e43 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -92,10 +92,11 @@ struct ref_format {
int can_skip_parse_buffer;
/* Internal state to ref-filter */
int need_color_reset_at_eol;
+ int can_reuse_final_buffer;
struct list_head parsed_atom_head;
};
-#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1 }
+#define REF_FORMAT_INIT { .use_color = -1, .can_skip_parse_buffer = 1, .can_reuse_final_buffer = 1 }
/* Macros for checking --merged and --no-merged options */
#define _OPT_MERGED_NO_MERGED(option, filter, h) \
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
` (2 preceding siblings ...)
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
In populate_value() we use an empty struct object_info
compare with oi and oi_deref, which determine if we can
return early, this is actually very inefficient.
So add a need_get_object_info flag in global variables,
which used to indicate whether we need call get_object()
or we can return early.
So we only need to compare only one byte instead of comparing
the entire huge object_info struct.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 7106d4c1c4c..35e221e7ec8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -212,7 +212,7 @@ static struct used_atom {
char *head;
} u;
} *used_atom;
-static int used_atom_cnt, need_tagged, need_symref;
+static int used_atom_cnt, need_tagged, need_symref, need_get_object_info;
/*
* Expand string, append it to strbuf *sb, then return error code ret.
@@ -318,10 +318,13 @@ static int objecttype_atom_parser(struct ref_format *format, struct used_atom *a
{
if (arg)
return strbuf_addf_ret(err, -1, _("%%(objecttype) does not take arguments"));
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.typep = &oi_deref.type;
- else
+ need_get_object_info = 1;
+ } else {
oi.info.typep = &oi.type;
+ need_get_object_info = 1;
+ }
return 0;
}
@@ -330,16 +333,23 @@ static int objectsize_atom_parser(struct ref_format *format, struct used_atom *a
{
if (!arg) {
atom->u.objectsize.option = O_SIZE;
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.sizep = &oi_deref.size;
- else
+ need_get_object_info = 1;
+ } else {
oi.info.sizep = &oi.size;
+ need_get_object_info = 1;
+ }
} else if (!strcmp(arg, "disk")) {
atom->u.objectsize.option = O_SIZE_DISK;
- if (*atom->name == '*')
+ if (*atom->name == '*') {
oi_deref.info.disk_sizep = &oi_deref.disk_size;
- else
+ need_get_object_info = 1;
+ }
+ else {
oi.info.disk_sizep = &oi.disk_size;
+ need_get_object_info = 1;
+ }
} else
return strbuf_addf_ret(err, -1, _("unrecognized %%(objectsize) argument: %s"), arg);
return 0;
@@ -354,6 +364,7 @@ static int deltabase_atom_parser(struct ref_format *format, struct used_atom *at
oi_deref.info.delta_base_oid = &oi_deref.delta_base_oid;
else
oi.info.delta_base_oid = &oi.delta_base_oid;
+ need_get_object_info = 1;
return 0;
}
@@ -720,6 +731,7 @@ static int parse_ref_filter_atom(struct ref_format *format,
used_atom[at].type = valid_atom[i].cmp_type;
used_atom[at].source = valid_atom[i].source;
if (used_atom[at].source == SOURCE_OBJ) {
+ need_get_object_info = 1;
if (*atom == '*')
oi_deref.info.contentp = &oi_deref.content;
else
@@ -1871,7 +1883,6 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
{
struct object *obj;
int i;
- struct object_info empty = OBJECT_INFO_INIT;
CALLOC_ARRAY(ref->value, used_atom_cnt);
@@ -2019,10 +2030,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
oid_to_hex(&ref->objectname), ref->refname);
}
- if (need_tagged)
+ if (need_tagged) {
oi.info.contentp = &oi.content;
- if (!memcmp(&oi.info, &empty, sizeof(empty)) &&
- !memcmp(&oi_deref.info, &empty, sizeof(empty)))
+ need_get_object_info = 1;
+ }
+ if (!need_get_object_info)
return 0;
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
` (3 preceding siblings ...)
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
@ 2021-08-17 7:14 ` ZheNing Hu via GitGitGadget
4 siblings, 0 replies; 6+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-08-17 7:14 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
Philip Oakley, ZheNing Hu, ZheNing Hu
From: ZheNing Hu <adlternative@gmail.com>
populate_value() uses CALLOC_ARRAY() to allocate dynamic memory for
ref->value. But after that, we immediately assign values to its
members in the for loop. This shows that it is not necessary to
use CALLOC_ARRAY() for ref->value to clear the memory.
Therefore, use ALLOC_ARRAY() instead of CALLOC_ARRAY() to reduce the
overhead caused by clearing the memory. This can bring performance
optimizations.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
ref-filter.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/ref-filter.c b/ref-filter.c
index 35e221e7ec8..fe2df82067f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1884,7 +1884,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
struct object *obj;
int i;
- CALLOC_ARRAY(ref->value, used_atom_cnt);
+ ALLOC_ARRAY(ref->value, used_atom_cnt);
if (need_symref && (ref->flag & REF_ISSYMREF) && !ref->symref) {
ref->symref = resolve_refdup(ref->refname, RESOLVE_REF_READING,
@@ -1903,6 +1903,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
const char *refname;
struct branch *branch = NULL;
+ v->s = NULL;
v->s_size = ATOM_SIZE_UNSPECIFIED;
v->handler = append_atom;
v->atom = atom;
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-08-17 7:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-08-17 7:14 [PATCH 0/5] [GSOC] [RFC] ref-filter: performance optimization ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 1/5] [GSOC] ref-filter: skip parse_object_buffer in some cases ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 2/5] [GSOC] ref-filter: remove second parsing in format_ref_array_item ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 3/5] [GSOC] ref-filter: reuse final buffer ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 4/5] [GSOC] ref-filter: reduce unnecessary object_info comparisons ZheNing Hu via GitGitGadget
2021-08-17 7:14 ` [PATCH 5/5] [GSOC]: ref-filter: instead CALLOC_ARRAY to ALLOC_ARRAY ZheNing Hu via GitGitGadget
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.