From: "Haritha via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>,
"Torsten Bögershausen" <tboegi@web.de>,
Haritha <harithamma.d@ibm.com>,
"D Harithamma" <harithamma.d@ibm.com>
Subject: [PATCH v4] convert: avoid high memory footprint
Date: Fri, 26 Jul 2024 14:00:32 +0000
Message-ID: <pull.1744.v4.git.git.1722002432630.gitgitgadget@gmail.com>
In-Reply-To: <pull.1744.v3.git.git.1721975234873.gitgitgadget@gmail.com>
From: D Harithamma <harithamma.d@ibm.com>
When Git adds a file requiring encoding conversion and tracing of encoding
conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING
environment variable, the `trace_encoding()` function still allocates and
prepares "human readable" copies of the file contents before and after
conversion to show in the trace. This results in a high memory footprint
and increased runtime without providing any user-visible benefit.
This fix introduces an early exit from the `trace_encoding()` function
when tracing is not requested, preventing unnecessary memory allocation
and processing.
Signed-off-by: Harithamma D <harithamma.d@ibm.com>
---
Fix to avoid high memory footprint
This fix avoids a high memory footprint when adding files that require
conversion.
Git has a trace_encoding routine that prints trace output when
GIT_TRACE_WORKING_TREE_ENCODING=1 is set. This environment variable is
used to debug the file contents before and after encoding conversion.
When a 40MB file is added, the routine requests close to 1.8GB of
storage from xrealloc, which can lead to out-of-memory errors, because
the check for GIT_TRACE_WORKING_TREE_ENCODING is done only after the
string is allocated. This results in a high memory footprint even when
GIT_TRACE_WORKING_TREE_ENCODING is not active. This fix adds an early
exit to avoid the unnecessary memory allocation.
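The early-exit pattern described above can be sketched outside of Git
roughly as follows. This is only an illustration under stated
assumptions, not the code in convert.c: `tracing_enabled()` and
`format_trace()` are hypothetical names, the per-byte output format is
made up, and the environment check is a simplified stand-in for Git's
`trace_want()`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for Git's trace_want(): tracing
 * counts as enabled when the variable is set, non-empty and not "0".
 */
static int tracing_enabled(const char *env_name)
{
	const char *v = getenv(env_name);
	return v && *v && strcmp(v, "0");
}

/*
 * Build a "human readable" dump of buf, one line per input byte.
 * Returns a malloc'd string the caller must free, or NULL when
 * tracing is disabled or the allocation fails.
 */
static char *format_trace(const char *env_name, const char *buf, size_t len)
{
	const size_t per_byte = 16; /* upper bound on characters per line */
	char *out, *p;
	size_t i;

	/* Early exit: skip the allocation and the loop when not tracing. */
	if (!tracing_enabled(env_name))
		return NULL;

	out = malloc(len * per_byte + 1);
	if (!out)
		return NULL;

	p = out;
	for (i = 0; i < len; i++)
		p += snprintf(p, per_byte, "| \\x%02x |\n",
			      (unsigned char)buf[i]);
	*p = '\0';
	return out;
}
```

Without the guard, the output buffer grows by several bytes for every
input byte whether or not anyone reads it; with the guard, the cost in
the common untraced case is a single environment lookup.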
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1744%2FHarithaIBM%2FmemFootprintFix-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1744/HarithaIBM/memFootprintFix-v4
Pull-Request: https://github.com/git/git/pull/1744
Range-diff vs v3:
1: d864de64380 ! 1: 50758a4fb94 Fix to avoid high memory footprint
@@ Metadata
Author: D Harithamma <harithamma.d@ibm.com>
## Commit message ##
- Fix to avoid high memory footprint
+ convert: avoid high memory footprint
When Git adds a file requiring encoding conversion and tracing of encoding
conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING
convert.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/convert.c b/convert.c
index d8737fe0f2d..c4ddc4de81b 100644
--- a/convert.c
+++ b/convert.c
@@ -324,6 +324,9 @@ static void trace_encoding(const char *context, const char *path,
 	struct strbuf trace = STRBUF_INIT;
 	int i;
 
+	if (!trace_want(&coe))
+		return;
+
 	strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, encoding);
 	for (i = 0; i < len && buf; ++i) {
 		strbuf_addf(
base-commit: 557ae147e6cdc9db121269b058c757ac5092f9c9
--
gitgitgadget
Thread overview (15+ messages):
2024-07-16 8:03 [PATCH] Fix to avoid high memory footprint Haritha via GitGitGadget
2024-07-17 6:16 ` Jeff King
2024-07-24 11:45 ` [PATCH v2] " Haritha via GitGitGadget
2024-07-24 21:41 ` Junio C Hamano
2024-07-24 22:16 ` Jeff King
2024-07-26 6:27 ` [PATCH v3] " Haritha via GitGitGadget
2024-07-26 9:55 ` Torsten Bögershausen
2024-07-26 14:00 ` Haritha via GitGitGadget [this message]
2024-07-30 3:42 ` [PATCH v5] convert: return early when not tracing Haritha via GitGitGadget
2024-07-31 2:42 ` Junio C Hamano
2024-07-31 9:32 ` Haritha D
2024-07-31 13:33 ` [PATCH v6] " Haritha via GitGitGadget
2024-07-26 15:06 ` [PATCH v3] Fix to avoid high memory footprint Junio C Hamano
2024-07-26 15:12 ` Junio C Hamano
2024-07-30 3:41 ` Haritha D