* [PATCH 0/2] Add additional trace2 regions for fetch and push
@ 2024-08-15 18:51 Josh Steadmon
2024-08-15 18:51 ` [PATCH 1/2] fetch: add top-level trace2 regions Josh Steadmon
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-15 18:51 UTC (permalink / raw)
To: git
Last year at $DAYJOB we were having issues with slow
fetches/pulls/pushes. We added some additional trace2 regions which
helped us narrow down the issue to some server-side negotiation
problems. We've been carrying these patches downstream ever since, but
they might be useful to others as well.
Calvin Wan (1):
send-pack: add new tracing regions for push
Josh Steadmon (1):
fetch: add top-level trace2 regions
builtin/fetch.c | 27 +++++++++++++++++++++++----
send-pack.c | 16 +++++++++++++---
2 files changed, 36 insertions(+), 7 deletions(-)
base-commit: 25673b1c476756ec0587fb0596ab3c22b96dc52a
--
2.46.0.184.g6999bdac58-goog
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] fetch: add top-level trace2 regions
2024-08-15 18:51 [PATCH 0/2] Add additional trace2 regions for fetch and push Josh Steadmon
@ 2024-08-15 18:51 ` Josh Steadmon
2024-08-15 19:47 ` Junio C Hamano
2024-08-15 18:51 ` [PATCH 2/2] send-pack: add new tracing regions for push Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
2 siblings, 1 reply; 13+ messages in thread
From: Josh Steadmon @ 2024-08-15 18:51 UTC (permalink / raw)
To: git
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.
Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
builtin/fetch.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 693f02b958..950cd79baa 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2353,9 +2353,14 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
if (!max_jobs)
max_jobs = online_cpus();
- if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri) &&
- fetch_bundle_uri(the_repository, bundle_uri, NULL))
- warning(_("failed to fetch bundles from '%s'"), bundle_uri);
+ if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri)) {
+ int result = 0;
+ trace2_region_enter("fetch", "fetch-bundle-uri", the_repository);
+ result = fetch_bundle_uri(the_repository, bundle_uri, NULL);
+ trace2_region_leave("fetch", "fetch-bundle-uri", the_repository);
+ if (result)
+ warning(_("failed to fetch bundles from '%s'"), bundle_uri);
+ }
if (all < 0) {
/*
@@ -2407,6 +2412,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
struct oidset_iter iter;
const struct object_id *oid;
+ trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
gtransport = prepare_transport(remote, 1);
@@ -2415,6 +2421,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
} else {
warning(_("protocol does not support --negotiate-only, exiting"));
result = 1;
+ trace2_region_leave("fetch", "negotiate-only", the_repository);
goto cleanup;
}
if (server_options.nr)
@@ -2425,11 +2432,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
while ((oid = oidset_iter_next(&iter)))
printf("%s\n", oid_to_hex(oid));
oidset_clear(&acked_commits);
+ trace2_region_leave("fetch", "negotiate-only", the_repository);
} else if (remote) {
- if (filter_options.choice || repo_has_promisor_remote(the_repository))
+ if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
+ trace2_region_enter("fetch", "setup-partial", the_repository);
fetch_one_setup_partial(remote);
+ trace2_region_leave("fetch", "setup-partial", the_repository);
+ }
+ trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
&config);
+ trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2449,7 +2462,9 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
max_children = config.parallel;
/* TODO should this also die if we have a previous partial-clone? */
+ trace2_region_enter("fetch", "fetch-multiple", the_repository);
result = fetch_multiple(&list, max_children, &config);
+ trace2_region_leave("fetch", "fetch-multiple", the_repository);
}
/*
@@ -2471,6 +2486,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
max_children = config.parallel;
add_options_to_argv(&options, &config);
+ trace2_region_enter_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
result = fetch_submodules(the_repository,
&options,
submodule_prefix,
@@ -2478,6 +2494,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
recurse_submodules_default,
verbosity < 0,
max_children);
+ trace2_region_leave_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
strvec_clear(&options);
}
@@ -2501,9 +2518,11 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
if (progress)
commit_graph_flags |= COMMIT_GRAPH_WRITE_PROGRESS;
+ trace2_region_enter("fetch", "write-commit-graph", the_repository);
write_commit_graph_reachable(the_repository->objects->odb,
commit_graph_flags,
NULL);
+ trace2_region_leave("fetch", "write-commit-graph", the_repository);
}
if (enable_auto_gc) {
--
2.46.0.184.g6999bdac58-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] send-pack: add new tracing regions for push
2024-08-15 18:51 [PATCH 0/2] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-15 18:51 ` [PATCH 1/2] fetch: add top-level trace2 regions Josh Steadmon
@ 2024-08-15 18:51 ` Josh Steadmon
2024-08-15 20:06 ` Junio C Hamano
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
2 siblings, 1 reply; 13+ messages in thread
From: Josh Steadmon @ 2024-08-15 18:51 UTC (permalink / raw)
To: git
From: Calvin Wan <calvinwan@google.com>
At $DAYJOB we experienced some slow pushes and needed additional trace
data to diagnose them.
Add trace2 regions for various sections of send_pack().
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
send-pack.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/send-pack.c b/send-pack.c
index fa2f5eec17..de8ba46ad5 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -512,8 +512,11 @@ int send_pack(struct send_pack_args *args,
}
git_config_get_bool("push.negotiate", &push_negotiate);
- if (push_negotiate)
+ if (push_negotiate) {
+ trace2_region_enter("send_pack", "push_negotiate", the_repository);
get_commons_through_negotiation(args->url, remote_refs, &commons);
+ trace2_region_leave("send_pack", "push_negotiate", the_repository);
+ }
if (!git_config_get_bool("push.usebitmaps", &use_bitmaps))
args->disable_bitmaps = !use_bitmaps;
@@ -641,10 +644,12 @@ int send_pack(struct send_pack_args *args,
/*
* Finally, tell the other end!
*/
- if (!args->dry_run && push_cert_nonce)
+ if (!args->dry_run && push_cert_nonce) {
+ trace2_region_enter("send_pack", "push_cert", the_repository);
cmds_sent = generate_push_cert(&req_buf, remote_refs, args,
cap_buf.buf, push_cert_nonce);
- else if (!args->dry_run)
+ trace2_region_leave("send_pack", "push_cert", the_repository);
+ } else if (!args->dry_run) {
for (ref = remote_refs; ref; ref = ref->next) {
char *old_hex, *new_hex;
@@ -664,6 +669,7 @@ int send_pack(struct send_pack_args *args,
old_hex, new_hex, ref->name);
}
}
+ }
if (use_push_options) {
struct string_list_item *item;
@@ -686,6 +692,7 @@ int send_pack(struct send_pack_args *args,
strbuf_release(&cap_buf);
if (use_sideband && cmds_sent) {
+ trace2_region_enter("send_pack", "sideband_demux", the_repository);
memset(&demux, 0, sizeof(demux));
demux.proc = sideband_demux;
demux.data = fd;
@@ -719,6 +726,8 @@ int send_pack(struct send_pack_args *args,
if (use_sideband) {
close(demux.out);
finish_async(&demux);
+ if (cmds_sent)
+ trace2_region_leave("send_pack", "sideband_demux", the_repository);
}
fd[1] = -1;
return -1;
@@ -743,6 +752,7 @@ int send_pack(struct send_pack_args *args,
error("error in sideband demultiplexer");
ret = -1;
}
+ trace2_region_leave("send_pack", "sideband_demux", the_repository);
}
if (ret < 0)
--
2.46.0.184.g6999bdac58-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] fetch: add top-level trace2 regions
2024-08-15 18:51 ` [PATCH 1/2] fetch: add top-level trace2 regions Josh Steadmon
@ 2024-08-15 19:47 ` Junio C Hamano
2024-08-19 18:26 ` Josh Steadmon
0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2024-08-15 19:47 UTC (permalink / raw)
To: Josh Steadmon; +Cc: git
Josh Steadmon <steadmon@google.com> writes:
> - if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri) &&
> - fetch_bundle_uri(the_repository, bundle_uri, NULL))
> - warning(_("failed to fetch bundles from '%s'"), bundle_uri);
> + if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri)) {
> + int result = 0;
This needs no initialization.
> + trace2_region_enter("fetch", "fetch-bundle-uri", the_repository);
> + result = fetch_bundle_uri(the_repository, bundle_uri, NULL);
> + trace2_region_leave("fetch", "fetch-bundle-uri", the_repository);
> + if (result)
> + warning(_("failed to fetch bundles from '%s'"), bundle_uri);
> + }
It is a bit sad that the concise original with straight-forward
control flow had to be butchered like this to sprinkle tracing code
in it, but I guess that cannot be helped? I wonder if it becomes
much less invasive and more future proof to define the trace region
in the fetch_bundle_uri() function itself. Has it been considered?
> @@ -2407,6 +2412,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> struct oidset_iter iter;
> const struct object_id *oid;
>
> + trace2_region_enter("fetch", "negotiate-only", the_repository);
> if (!remote)
> die(_("must supply remote when using --negotiate-only"));
> gtransport = prepare_transport(remote, 1);
> @@ -2415,6 +2421,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> } else {
> warning(_("protocol does not support --negotiate-only, exiting"));
> result = 1;
> + trace2_region_leave("fetch", "negotiate-only", the_repository);
> goto cleanup;
> }
> if (server_options.nr)
> @@ -2425,11 +2432,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> while ((oid = oidset_iter_next(&iter)))
> printf("%s\n", oid_to_hex(oid));
> oidset_clear(&acked_commits);
> + trace2_region_leave("fetch", "negotiate-only", the_repository);
OK. Both error path and normal path we leave the region we entered.
A complete tangent, but do we have an automated test or code
analysis that catches us if we forget to leave an entered region
(i.e., imagine we didn't leave in the else clause after issuing the
warning---we remain in the region in such an error case, even though
normally we leave the region correctly)?
> } else if (remote) {
> - if (filter_options.choice || repo_has_promisor_remote(the_repository))
> + if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
> + trace2_region_enter("fetch", "setup-partial", the_repository);
> fetch_one_setup_partial(remote);
> + trace2_region_leave("fetch", "setup-partial", the_repository);
> + }
OK. That's nice and straight-forward.
> + trace2_region_enter("fetch", "fetch-one", the_repository);
> result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
> &config);
> + trace2_region_leave("fetch", "fetch-one", the_repository);
This one, too.
> @@ -2449,7 +2462,9 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> max_children = config.parallel;
>
> /* TODO should this also die if we have a previous partial-clone? */
> + trace2_region_enter("fetch", "fetch-multiple", the_repository);
> result = fetch_multiple(&list, max_children, &config);
> + trace2_region_leave("fetch", "fetch-multiple", the_repository);
So is this.
> @@ -2471,6 +2486,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> max_children = config.parallel;
>
> add_options_to_argv(&options, &config);
> + trace2_region_enter_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
> result = fetch_submodules(the_repository,
> &options,
> submodule_prefix,
> @@ -2478,6 +2494,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> recurse_submodules_default,
> verbosity < 0,
> max_children);
> + trace2_region_leave_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
> strvec_clear(&options);
> }
Ditto.
> @@ -2501,9 +2518,11 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> if (progress)
> commit_graph_flags |= COMMIT_GRAPH_WRITE_PROGRESS;
>
> + trace2_region_enter("fetch", "write-commit-graph", the_repository);
> write_commit_graph_reachable(the_repository->objects->odb,
> commit_graph_flags,
> NULL);
> + trace2_region_leave("fetch", "write-commit-graph", the_repository);
OK.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] send-pack: add new tracing regions for push
2024-08-15 18:51 ` [PATCH 2/2] send-pack: add new tracing regions for push Josh Steadmon
@ 2024-08-15 20:06 ` Junio C Hamano
2024-08-22 20:20 ` Josh Steadmon
0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2024-08-15 20:06 UTC (permalink / raw)
To: Josh Steadmon; +Cc: git
Josh Steadmon <steadmon@google.com> writes:
> From: Calvin Wan <calvinwan@google.com>
>
> At $DAYJOB we experienced some slow pushes and needed additional trace
> data to diagnose them.
>
> Add trace2 regions for various sections of send_pack().
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
> send-pack.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/send-pack.c b/send-pack.c
> index fa2f5eec17..de8ba46ad5 100644
> --- a/send-pack.c
> +++ b/send-pack.c
> @@ -512,8 +512,11 @@ int send_pack(struct send_pack_args *args,
> }
>
> git_config_get_bool("push.negotiate", &push_negotiate);
> - if (push_negotiate)
> + if (push_negotiate) {
> + trace2_region_enter("send_pack", "push_negotiate", the_repository);
> get_commons_through_negotiation(args->url, remote_refs, &commons);
> + trace2_region_leave("send_pack", "push_negotiate", the_repository);
> + }
> @@ -641,10 +644,12 @@ int send_pack(struct send_pack_args *args,
> /*
> * Finally, tell the other end!
> */
> - if (!args->dry_run && push_cert_nonce)
> + if (!args->dry_run && push_cert_nonce) {
> + trace2_region_enter("send_pack", "push_cert", the_repository);
> cmds_sent = generate_push_cert(&req_buf, remote_refs, args,
> cap_buf.buf, push_cert_nonce);
> - else if (!args->dry_run)
> + trace2_region_leave("send_pack", "push_cert", the_repository);
> + } else if (!args->dry_run) {
Misleading "diff" but this is correct.
But makes me wonder if we really want to express these (and other
events we saw in [PATCH 1/2]) as regions we enter and leave.
Presumably we would generate a certificate instantly, compared to
all the other things happening in this process, like talking over
the network, waiting for the other end, and packing the payload, and
I suspect that the single bit the debuggers would want to learn from
the trace is "did we get asked to give a certificate?".
Sandwitching a rather expensive operation inside a pair of
enter/leave would give us a way to measure how long that operation
took in exchanges for one extra trace log entry, and "ah, we need to
first fetch the bundle and process it" we saw in [PATCH 1/2] is
something that is worth timing, but I am finding a bit hard to
believe it is worth doing for push cert generation. It is
understandable if there weren't any suitable mechanism to simply log
"the control passed at this spot at this time" kind of event in the
trace2 subsystem, but I do not think it is the case.
> @@ -686,6 +692,7 @@ int send_pack(struct send_pack_args *args,
> strbuf_release(&cap_buf);
>
> if (use_sideband && cmds_sent) {
> + trace2_region_enter("send_pack", "sideband_demux", the_repository);
> memset(&demux, 0, sizeof(demux));
> demux.proc = sideband_demux;
> demux.data = fd;
> @@ -719,6 +726,8 @@ int send_pack(struct send_pack_args *args,
> if (use_sideband) {
> close(demux.out);
> finish_async(&demux);
> + if (cmds_sent)
> + trace2_region_leave("send_pack", "sideband_demux", the_repository);
> }
> fd[1] = -1;
> return -1;
> @@ -743,6 +752,7 @@ int send_pack(struct send_pack_args *args,
> error("error in sideband demultiplexer");
> ret = -1;
> }
> + trace2_region_leave("send_pack", "sideband_demux", the_repository);
> }
This is also dubious. When sideband is in effect, this records the
fact that we did ran pack-objects and allows as to measure how much
time it was spent. But on a connection without sideband enabled, it
does not record anything. But if we start the region a line sooner,
and finish the region a line later, we should be able to record the
same facts even for a connection without sideband enabled. I also
find the name given to this region ultra-iffy. Is it so important
that sideband_demux was used to communicate with the other end that
received the data our pack_objects() produced that the word
"sideband_demux" deserves to be in the name of the region, more than
the fact that this is the crux of sending pack data from us to them
(i.e. the main part of the "send-pack")?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] fetch: add top-level trace2 regions
2024-08-15 19:47 ` Junio C Hamano
@ 2024-08-19 18:26 ` Josh Steadmon
0 siblings, 0 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-19 18:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 2024.08.15 12:47, Junio C Hamano wrote:
> Josh Steadmon <steadmon@google.com> writes:
>
> > - if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri) &&
> > - fetch_bundle_uri(the_repository, bundle_uri, NULL))
> > - warning(_("failed to fetch bundles from '%s'"), bundle_uri);
> > + if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri)) {
> > + int result = 0;
>
> This needs no initialization.
>
> > + trace2_region_enter("fetch", "fetch-bundle-uri", the_repository);
> > + result = fetch_bundle_uri(the_repository, bundle_uri, NULL);
> > + trace2_region_leave("fetch", "fetch-bundle-uri", the_repository);
> > + if (result)
> > + warning(_("failed to fetch bundles from '%s'"), bundle_uri);
> > + }
>
> It is a bit sad that the concise original with straight-forward
> control flow had to be butchered like this to sprinkle tracing code
> in it, but I guess that cannot be helped? I wonder if it becomes
> much less invasive and more future proof to define the trace region
> in the fetch_bundle_uri() function itself. Has it been considered?
Moved to fetch_bundle_uri() in v2.
> > @@ -2407,6 +2412,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> > struct oidset_iter iter;
> > const struct object_id *oid;
> >
> > + trace2_region_enter("fetch", "negotiate-only", the_repository);
> > if (!remote)
> > die(_("must supply remote when using --negotiate-only"));
> > gtransport = prepare_transport(remote, 1);
> > @@ -2415,6 +2421,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> > } else {
> > warning(_("protocol does not support --negotiate-only, exiting"));
> > result = 1;
> > + trace2_region_leave("fetch", "negotiate-only", the_repository);
> > goto cleanup;
> > }
> > if (server_options.nr)
> > @@ -2425,11 +2432,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> > while ((oid = oidset_iter_next(&iter)))
> > printf("%s\n", oid_to_hex(oid));
> > oidset_clear(&acked_commits);
> > + trace2_region_leave("fetch", "negotiate-only", the_repository);
>
> OK. Both error path and normal path we leave the region we entered.
>
> A complete tangent, but do we have an automated test or code
> analysis that catches us if we forget to leave an entered region
> (i.e., imagine we didn't leave in the else clause after issuing the
> warning---we remain in the region in such an error case, even though
> normally we leave the region correctly)?
It's been discussed before [1], but the general feeling seems to be that
it's not worth the effort / test runtime.
[1] https://lore.kernel.org/git/xmqqbka27zu9.fsf@gitster.g/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] send-pack: add new tracing regions for push
2024-08-15 20:06 ` Junio C Hamano
@ 2024-08-22 20:20 ` Josh Steadmon
2024-08-22 20:30 ` Junio C Hamano
0 siblings, 1 reply; 13+ messages in thread
From: Josh Steadmon @ 2024-08-22 20:20 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 2024.08.15 13:06, Junio C Hamano wrote:
> Josh Steadmon <steadmon@google.com> writes:
>
> > From: Calvin Wan <calvinwan@google.com>
> >
> > At $DAYJOB we experienced some slow pushes and needed additional trace
> > data to diagnose them.
> >
> > Add trace2 regions for various sections of send_pack().
> >
> > Signed-off-by: Josh Steadmon <steadmon@google.com>
> > ---
> > send-pack.c | 16 +++++++++++++---
> > 1 file changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/send-pack.c b/send-pack.c
> > index fa2f5eec17..de8ba46ad5 100644
> > --- a/send-pack.c
> > +++ b/send-pack.c
> > @@ -512,8 +512,11 @@ int send_pack(struct send_pack_args *args,
> > }
> >
> > git_config_get_bool("push.negotiate", &push_negotiate);
> > - if (push_negotiate)
> > + if (push_negotiate) {
> > + trace2_region_enter("send_pack", "push_negotiate", the_repository);
> > get_commons_through_negotiation(args->url, remote_refs, &commons);
> > + trace2_region_leave("send_pack", "push_negotiate", the_repository);
> > + }
>
> > @@ -641,10 +644,12 @@ int send_pack(struct send_pack_args *args,
> > /*
> > * Finally, tell the other end!
> > */
> > - if (!args->dry_run && push_cert_nonce)
> > + if (!args->dry_run && push_cert_nonce) {
> > + trace2_region_enter("send_pack", "push_cert", the_repository);
> > cmds_sent = generate_push_cert(&req_buf, remote_refs, args,
> > cap_buf.buf, push_cert_nonce);
> > - else if (!args->dry_run)
> > + trace2_region_leave("send_pack", "push_cert", the_repository);
> > + } else if (!args->dry_run) {
>
> Misleading "diff" but this is correct.
>
> But makes me wonder if we really want to express these (and other
> events we saw in [PATCH 1/2]) as regions we enter and leave.
> Presumably we would generate a certificate instantly, compared to
> all the other things happening in this process, like talking over
> the network, waiting for the other end, and packing the payload, and
> I suspect that the single bit the debuggers would want to learn from
> the trace is "did we get asked to give a certificate?".
> Sandwitching a rather expensive operation inside a pair of
> enter/leave would give us a way to measure how long that operation
> took in exchanges for one extra trace log entry, and "ah, we need to
> first fetch the bundle and process it" we saw in [PATCH 1/2] is
> something that is worth timing, but I am finding a bit hard to
> believe it is worth doing for push cert generation. It is
> understandable if there weren't any suitable mechanism to simply log
> "the control passed at this spot at this time" kind of event in the
> trace2 subsystem, but I do not think it is the case.
Ack, changed this to a "trace2_printf()" instead. Annoyingly the JSON
Event trace2 target that we use at $DAYJOB doesn't log these events, but
I can add another patch to enable that.
> > @@ -686,6 +692,7 @@ int send_pack(struct send_pack_args *args,
> > strbuf_release(&cap_buf);
> >
> > if (use_sideband && cmds_sent) {
> > + trace2_region_enter("send_pack", "sideband_demux", the_repository);
> > memset(&demux, 0, sizeof(demux));
> > demux.proc = sideband_demux;
> > demux.data = fd;
> > @@ -719,6 +726,8 @@ int send_pack(struct send_pack_args *args,
> > if (use_sideband) {
> > close(demux.out);
> > finish_async(&demux);
> > + if (cmds_sent)
> > + trace2_region_leave("send_pack", "sideband_demux", the_repository);
> > }
> > fd[1] = -1;
> > return -1;
> > @@ -743,6 +752,7 @@ int send_pack(struct send_pack_args *args,
> > error("error in sideband demultiplexer");
> > ret = -1;
> > }
> > + trace2_region_leave("send_pack", "sideband_demux", the_repository);
> > }
>
> This is also dubious. When sideband is in effect, this records the
> fact that we did ran pack-objects and allows as to measure how much
> time it was spent. But on a connection without sideband enabled, it
> does not record anything. But if we start the region a line sooner,
> and finish the region a line later, we should be able to record the
> same facts even for a connection without sideband enabled. I also
> find the name given to this region ultra-iffy. Is it so important
> that sideband_demux was used to communicate with the other end that
> received the data our pack_objects() produced that the word
> "sideband_demux" deserves to be in the name of the region, more than
> the fact that this is the crux of sending pack data from us to them
> (i.e. the main part of the "send-pack")?
Yeah, thanks, this did need to be reworked. I pushed the regions down
into pack_objects() and receive_status(), which look like the only two
places we might spend much time.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] send-pack: add new tracing regions for push
2024-08-22 20:20 ` Josh Steadmon
@ 2024-08-22 20:30 ` Junio C Hamano
0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2024-08-22 20:30 UTC (permalink / raw)
To: Josh Steadmon; +Cc: git
Josh Steadmon <steadmon@google.com> writes:
>> ... understandable if there weren't any suitable mechanism to simply log
>> "the control passed at this spot at this time" kind of event in the
>> trace2 subsystem, but I do not think it is the case.
>
> Ack, changed this to a "trace2_printf()" instead. Annoyingly the JSON
> Event trace2 target that we use at $DAYJOB doesn't log these events, but
> I can add another patch to enable that.
Ahh, OK, I was concentrating solely on the producing side, and
forgot to consider that the consuming side may not be prepared for
non enter/leave pair of events. That's understandable, but if you
are updating the consuming side to be able to do so, that would be
even better.
> Yeah, thanks, this did need to be reworked. I pushed the regions down
> into pack_objects() and receive_status(), which look like the only two
> places we might spend much time.
Sounds good.
This is a tangent, but I doubt we have many users without sideband
support. In the longer term we may be able to drop the non-sideband
codepath, which would automatically simplify the flow quite a bit
around here. But that is totally outside of this topic.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 0/3] Add additional trace2 regions for fetch and push
2024-08-15 18:51 [PATCH 0/2] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-15 18:51 ` [PATCH 1/2] fetch: add top-level trace2 regions Josh Steadmon
2024-08-15 18:51 ` [PATCH 2/2] send-pack: add new tracing regions for push Josh Steadmon
@ 2024-08-22 21:57 ` Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 1/3] trace2: implement trace2_printf() for event target Josh Steadmon
` (3 more replies)
2 siblings, 4 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-22 21:57 UTC (permalink / raw)
To: git; +Cc: gitster
Last year at $DAYJOB we were having issues with slow
fetches/pulls/pushes. We added some additional trace2 regions which
helped us narrow down the issue to some server-side negotiation
problems. We've been carrying these patches downstream ever since, but
they might be useful to others as well.
Changes in V2:
* Added a new patch to implement trace2_printf() for event targets.
* Move some fetch trace regions deeper in the call stack to simplify
control flow.
* Move some push trace regions deeper in the call stack to target the
actual slow functions, and to not tie tracing to whether or not
sideband communication is used.
Calvin Wan (1):
send-pack: add new tracing regions for push
Josh Steadmon (2):
trace2: implement trace2_printf() for event target
fetch: add top-level trace2 regions
Documentation/technical/api-trace2.txt | 17 +++++++++++++++--
builtin/fetch.c | 16 +++++++++++++++-
bundle-uri.c | 4 ++++
send-pack.c | 16 +++++++++++++---
trace2/tr2_tgt_event.c | 22 ++++++++++++++++++++--
5 files changed, 67 insertions(+), 8 deletions(-)
base-commit: 25673b1c476756ec0587fb0596ab3c22b96dc52a
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/3] trace2: implement trace2_printf() for event target
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
@ 2024-08-22 21:57 ` Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 2/3] fetch: add top-level trace2 regions Josh Steadmon
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-22 21:57 UTC (permalink / raw)
To: git; +Cc: gitster
The trace2 event target does not have an implementation for
trace2_printf(). While the event target is for structured events, and
trace2_printf() is for unstructured, human-readable messages, it may
still be useful to wrap these unstructured messages in a structured JSON
object. Among other things, it may reduce confusion when manually
debugging using event trace data.
Add a simple implementation for the event target that wraps
trace2_printf() messages in a minimal JSON object. Document this in
Documentation/technical/api-trace2.txt, and bump the event format
version since we're adding a new event type.
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
Documentation/technical/api-trace2.txt | 17 +++++++++++++++--
trace2/tr2_tgt_event.c | 22 ++++++++++++++++++++--
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt
index de5fc25059..5817b18310 100644
--- a/Documentation/technical/api-trace2.txt
+++ b/Documentation/technical/api-trace2.txt
@@ -128,7 +128,7 @@ yields
------------
$ cat ~/log.event
-{"event":"version","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"3","exe":"2.20.1.155.g426c96fcdb"}
+{"event":"version","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"4","exe":"2.20.1.155.g426c96fcdb"}
{"event":"start","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621027Z","file":"common-main.c","line":39,"t_abs":0.001173,"argv":["git","version"]}
{"event":"cmd_name","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621122Z","file":"git.c","line":432,"name":"version","hierarchy":"version"}
{"event":"exit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621236Z","file":"git.c","line":662,"t_abs":0.001227,"code":0}
@@ -344,7 +344,7 @@ only present on the "start" and "atexit" events.
{
"event":"version",
...
- "evt":"3", # EVENT format version
+ "evt":"4", # EVENT format version
"exe":"2.20.1.155.g426c96fcdb" # git version
}
------------
@@ -835,6 +835,19 @@ The "value" field may be an integer or a string.
}
------------
+`"printf"`::
+ This event logs a human-readable message with no particular formatting
+ guidelines.
++
+------------
+{
+ "event":"printf",
+ ...
+ "t_abs":0.015905, # elapsed time in seconds
+ "msg":"Hello world" # optional
+}
+------------
+
== Example Trace2 API Usage
diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c
index 59910a1a4f..45b0850a5e 100644
--- a/trace2/tr2_tgt_event.c
+++ b/trace2/tr2_tgt_event.c
@@ -24,7 +24,7 @@ static struct tr2_dst tr2dst_event = {
* a new field to an existing event, do not require an increment to the EVENT
* format version.
*/
-#define TR2_EVENT_VERSION "3"
+#define TR2_EVENT_VERSION "4"
/*
* Region nesting limit for messages written to the event target.
@@ -622,6 +622,24 @@ static void fn_data_json_fl(const char *file, int line,
}
}
+static void fn_printf_va_fl(const char *file, int line,
+ uint64_t us_elapsed_absolute,
+ const char *fmt, va_list ap)
+{
+ const char *event_name = "printf";
+ struct json_writer jw = JSON_WRITER_INIT;
+ double t_abs = (double)us_elapsed_absolute / 1000000.0;
+
+ jw_object_begin(&jw, 0);
+ event_fmt_prepare(event_name, file, line, NULL, &jw);
+ jw_object_double(&jw, "t_abs", 6, t_abs);
+ maybe_add_string_va(&jw, "msg", fmt, ap);
+ jw_end(&jw);
+
+ tr2_dst_write_line(&tr2dst_event, &jw.json);
+ jw_release(&jw);
+}
+
static void fn_timer(const struct tr2_timer_metadata *meta,
const struct tr2_timer *timer,
int is_final_data)
@@ -694,7 +712,7 @@ struct tr2_tgt tr2_tgt_event = {
.pfn_region_leave_printf_va_fl = fn_region_leave_printf_va_fl,
.pfn_data_fl = fn_data_fl,
.pfn_data_json_fl = fn_data_json_fl,
- .pfn_printf_va_fl = NULL,
+ .pfn_printf_va_fl = fn_printf_va_fl,
.pfn_timer = fn_timer,
.pfn_counter = fn_counter,
};
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/3] fetch: add top-level trace2 regions
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 1/3] trace2: implement trace2_printf() for event target Josh Steadmon
@ 2024-08-22 21:57 ` Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 3/3] send-pack: add new tracing regions for push Josh Steadmon
2024-08-22 22:10 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Junio C Hamano
3 siblings, 0 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-22 21:57 UTC (permalink / raw)
To: git; +Cc: gitster
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.
Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
builtin/fetch.c | 16 +++++++++++++++-
bundle-uri.c | 4 ++++
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 693f02b958..9e20a41d2a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2407,6 +2407,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
struct oidset_iter iter;
const struct object_id *oid;
+ trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
gtransport = prepare_transport(remote, 1);
@@ -2415,6 +2416,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
} else {
warning(_("protocol does not support --negotiate-only, exiting"));
result = 1;
+ trace2_region_leave("fetch", "negotiate-only", the_repository);
goto cleanup;
}
if (server_options.nr)
@@ -2425,11 +2427,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
while ((oid = oidset_iter_next(&iter)))
printf("%s\n", oid_to_hex(oid));
oidset_clear(&acked_commits);
+ trace2_region_leave("fetch", "negotiate-only", the_repository);
} else if (remote) {
- if (filter_options.choice || repo_has_promisor_remote(the_repository))
+ if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
+ trace2_region_enter("fetch", "setup-partial", the_repository);
fetch_one_setup_partial(remote);
+ trace2_region_leave("fetch", "setup-partial", the_repository);
+ }
+ trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
&config);
+ trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2449,7 +2457,9 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
max_children = config.parallel;
/* TODO should this also die if we have a previous partial-clone? */
+ trace2_region_enter("fetch", "fetch-multiple", the_repository);
result = fetch_multiple(&list, max_children, &config);
+ trace2_region_leave("fetch", "fetch-multiple", the_repository);
}
/*
@@ -2471,6 +2481,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
max_children = config.parallel;
add_options_to_argv(&options, &config);
+ trace2_region_enter_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
result = fetch_submodules(the_repository,
&options,
submodule_prefix,
@@ -2478,6 +2489,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
recurse_submodules_default,
verbosity < 0,
max_children);
+ trace2_region_leave_printf("fetch", "recurse-submodule", the_repository, "%s", submodule_prefix);
strvec_clear(&options);
}
@@ -2501,9 +2513,11 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
if (progress)
commit_graph_flags |= COMMIT_GRAPH_WRITE_PROGRESS;
+ trace2_region_enter("fetch", "write-commit-graph", the_repository);
write_commit_graph_reachable(the_repository->objects->odb,
commit_graph_flags,
NULL);
+ trace2_region_leave("fetch", "write-commit-graph", the_repository);
}
if (enable_auto_gc) {
diff --git a/bundle-uri.c b/bundle-uri.c
index 1e0ee156ba..dc0c96955b 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -13,6 +13,7 @@
#include "config.h"
#include "fetch-pack.h"
#include "remote.h"
+#include "trace2.h"
static struct {
enum bundle_list_heuristic heuristic;
@@ -799,6 +800,8 @@ int fetch_bundle_uri(struct repository *r, const char *uri,
.id = xstrdup(""),
};
+ trace2_region_enter("fetch", "fetch-bundle-uri", the_repository);
+
init_bundle_list(&list);
/*
@@ -824,6 +827,7 @@ int fetch_bundle_uri(struct repository *r, const char *uri,
for_all_bundles_in_list(&list, unlink_bundle, NULL);
clear_bundle_list(&list);
clear_remote_bundle_info(&bundle, NULL);
+ trace2_region_leave("fetch", "fetch-bundle-uri", the_repository);
return result;
}
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 3/3] send-pack: add new tracing regions for push
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 1/3] trace2: implement trace2_printf() for event target Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 2/3] fetch: add top-level trace2 regions Josh Steadmon
@ 2024-08-22 21:57 ` Josh Steadmon
2024-08-22 22:10 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Junio C Hamano
3 siblings, 0 replies; 13+ messages in thread
From: Josh Steadmon @ 2024-08-22 21:57 UTC (permalink / raw)
To: git; +Cc: gitster
From: Calvin Wan <calvinwan@google.com>
At $DAYJOB we experienced some slow pushes and needed additional trace
data to diagnose them.
Add trace2 regions for various sections of send_pack().
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
send-pack.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/send-pack.c b/send-pack.c
index fa2f5eec17..9666b2c995 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -75,6 +75,7 @@ static int pack_objects(int fd, struct ref *refs, struct oid_array *advertised,
int i;
int rc;
+ trace2_region_enter("send_pack", "pack_objects", the_repository);
strvec_push(&po.args, "pack-objects");
strvec_push(&po.args, "--all-progress-implied");
strvec_push(&po.args, "--revs");
@@ -146,8 +147,10 @@ static int pack_objects(int fd, struct ref *refs, struct oid_array *advertised,
*/
if (rc > 128 && rc != 141)
error("pack-objects died of signal %d", rc - 128);
+ trace2_region_leave("send_pack", "pack_objects", the_repository);
return -1;
}
+ trace2_region_leave("send_pack", "pack_objects", the_repository);
return 0;
}
@@ -170,6 +173,7 @@ static int receive_status(struct packet_reader *reader, struct ref *refs)
int new_report = 0;
int once = 0;
+ trace2_region_enter("send_pack", "receive_status", the_repository);
hint = NULL;
ret = receive_unpack_status(reader);
while (1) {
@@ -268,6 +272,7 @@ static int receive_status(struct packet_reader *reader, struct ref *refs)
new_report = 1;
}
}
+ trace2_region_leave("send_pack", "receive_status", the_repository);
return ret;
}
@@ -512,8 +517,11 @@ int send_pack(struct send_pack_args *args,
}
git_config_get_bool("push.negotiate", &push_negotiate);
- if (push_negotiate)
+ if (push_negotiate) {
+ trace2_region_enter("send_pack", "push_negotiate", the_repository);
get_commons_through_negotiation(args->url, remote_refs, &commons);
+ trace2_region_leave("send_pack", "push_negotiate", the_repository);
+ }
if (!git_config_get_bool("push.usebitmaps", &use_bitmaps))
args->disable_bitmaps = !use_bitmaps;
@@ -641,10 +649,11 @@ int send_pack(struct send_pack_args *args,
/*
* Finally, tell the other end!
*/
- if (!args->dry_run && push_cert_nonce)
+ if (!args->dry_run && push_cert_nonce) {
cmds_sent = generate_push_cert(&req_buf, remote_refs, args,
cap_buf.buf, push_cert_nonce);
- else if (!args->dry_run)
+ trace2_printf("Generated push certificate");
+ } else if (!args->dry_run) {
for (ref = remote_refs; ref; ref = ref->next) {
char *old_hex, *new_hex;
@@ -664,6 +673,7 @@ int send_pack(struct send_pack_args *args,
old_hex, new_hex, ref->name);
}
}
+ }
if (use_push_options) {
struct string_list_item *item;
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 0/3] Add additional trace2 regions for fetch and push
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
` (2 preceding siblings ...)
2024-08-22 21:57 ` [PATCH v2 3/3] send-pack: add new tracing regions for push Josh Steadmon
@ 2024-08-22 22:10 ` Junio C Hamano
3 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2024-08-22 22:10 UTC (permalink / raw)
To: Josh Steadmon; +Cc: git
Josh Steadmon <steadmon@google.com> writes:
> Last year at $DAYJOB we were having issues with slow
> fetches/pulls/pushes. We added some additional trace2 regions which
> helped us narrow down the issue to some server-side negotiation
> problems. We've been carrying these patches downstream ever since, but
> they might be useful to others as well.
Thanks, will queue.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-08-22 22:10 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-15 18:51 [PATCH 0/2] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-15 18:51 ` [PATCH 1/2] fetch: add top-level trace2 regions Josh Steadmon
2024-08-15 19:47 ` Junio C Hamano
2024-08-19 18:26 ` Josh Steadmon
2024-08-15 18:51 ` [PATCH 2/2] send-pack: add new tracing regions for push Josh Steadmon
2024-08-15 20:06 ` Junio C Hamano
2024-08-22 20:20 ` Josh Steadmon
2024-08-22 20:30 ` Junio C Hamano
2024-08-22 21:57 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 1/3] trace2: implement trace2_printf() for event target Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 2/3] fetch: add top-level trace2 regions Josh Steadmon
2024-08-22 21:57 ` [PATCH v2 3/3] send-pack: add new tracing regions for push Josh Steadmon
2024-08-22 22:10 ` [PATCH v2 0/3] Add additional trace2 regions for fetch and push Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).