From: Junio C Hamano <gitster@pobox.com>
To: "Haritha via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Haritha <harithamma.d@ibm.com>
Subject: Re: [PATCH v2] Fix to avoid high memory footprint
Date: Wed, 24 Jul 2024 14:41:57 -0700 [thread overview]
Message-ID: <xmqqmsm6sc0q.fsf@gitster.g> (raw)
In-Reply-To: <pull.1744.v2.git.git.1721821503173.gitgitgadget@gmail.com> (Haritha via GitGitGadget's message of "Wed, 24 Jul 2024 11:45:03 +0000")
"Haritha via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: D Harithamma <harithamma.d@ibm.com>
>
> This fix avoids high memory footprint when adding files that require
> conversion. Git has a trace_encoding routine that prints trace output
> when GIT_TRACE_WORKING_TREE_ENCODING=1 is set. This environment
> variable is used to debug the encoding contents. When a 40MB file is
> added, it requests close to 1.8GB of storage from xrealloc which can
> lead to out of memory errors. However, the check for
> GIT_TRACE_WORKING_TREE_ENCODING is done after the string is allocated.
> This resolves high memory footprints even when
> GIT_TRACE_WORKING_TREE_ENCODING is not active. This fix adds an early
> exit to avoid the unnecessary memory allocation.
The sentences jump around and the logic flow is hard to follow. The
first sentence makes a claim of what it does (but the readers have
not bee told where that problem comes from). The second sentence
makes a statement of a fact, but the readers do not yet know at that
point what relevance the fact has to the issue at hand, etc.
The usual way to compose a log message of this project is to
- Give an observation on how the current system work in the present
tense (so no need to say "Currently X is Y", just "X is Y"), and
discuss what you perceive as a problem in it.
- Propose a solution (optional---often, problem description
trivially leads to an obvious solution in reader's minds).
- Give commands to the codebase to "become like so".
in this order.
When Git needs to add a file that require encoding conversion,
but tracing of encoding conversion is *not* requested via
setting GIT_TRACE_WORKING_TREE_ENCODING environment variable,
the trace_encoding() function still allocated and prepared
"human readable" copies of the file contents before and after
conversion to show in the trace. This wasted a lot of memory
footprint and runtime cycles without giving any user-visible
benefit.
Exit early from the function when we we are not tracing before
we spend all the effort, not after.
or something, perhaps?
I am wondering if we should be able to test this, but "git grep
GIT_TRACE_WORKING_TREE_ENCODING t/" is not finding any existing test
in the area.
> Signed-off-by: Harithamma D <harithamma.d@ibm.com>
This does not match the "From: " line above. Please pick one way to
spell your name and identify yourself to this project, and use it
consistently.
Thanks.
> diff --git a/convert.c b/convert.c
> index d8737fe0f2d..c4ddc4de81b 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -324,6 +324,9 @@ static void trace_encoding(const char *context, const char *path,
> struct strbuf trace = STRBUF_INIT;
> int i;
>
> + if (!trace_want(&coe))
> + return;
> +
The actual fix is so simple and nice ;-)
next prev parent reply other threads:[~2024-07-24 21:42 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-16 8:03 [PATCH] Fix to avoid high memory footprint Haritha via GitGitGadget
2024-07-17 6:16 ` Jeff King
2024-07-24 11:45 ` [PATCH v2] " Haritha via GitGitGadget
2024-07-24 21:41 ` Junio C Hamano [this message]
2024-07-24 22:16 ` Jeff King
2024-07-26 6:27 ` [PATCH v3] " Haritha via GitGitGadget
2024-07-26 9:55 ` Torsten Bögershausen
2024-07-26 14:00 ` [PATCH v4] convert: " Haritha via GitGitGadget
2024-07-30 3:42 ` [PATCH v5] convert: return early when not tracing Haritha via GitGitGadget
2024-07-31 2:42 ` Junio C Hamano
2024-07-31 9:32 ` Haritha D
2024-07-31 13:33 ` [PATCH v6] " Haritha via GitGitGadget
2024-07-26 15:06 ` [PATCH v3] Fix to avoid high memory footprint Junio C Hamano
2024-07-26 15:12 ` Junio C Hamano
2024-07-30 3:41 ` Haritha D
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqmsm6sc0q.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=harithamma.d@ibm.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.