From: "Haritha  via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Haritha  <harithamma.d@ibm.com>,
	D Harithamma <harithamma.d@ibm.com>
Subject: [PATCH v2] Fix to avoid high memory footprint
Date: Wed, 24 Jul 2024 11:45:03 +0000
Message-ID: <pull.1744.v2.git.git.1721821503173.gitgitgadget@gmail.com>
In-Reply-To: <pull.1744.git.git.1721117039874.gitgitgadget@gmail.com>

From: D Harithamma <harithamma.d@ibm.com>

This fix avoids a high memory footprint when adding files that require
conversion.  Git has a trace_encoding() routine that prints trace
output when GIT_TRACE_WORKING_TREE_ENCODING=1 is set; this environment
variable is used to debug the encoding contents.  When a 40MB file is
added, the routine requests close to 1.8GB of storage from xrealloc,
which can lead to out-of-memory errors.  The check for
GIT_TRACE_WORKING_TREE_ENCODING is done only after the string has been
allocated, so the high memory footprint occurs even when
GIT_TRACE_WORKING_TREE_ENCODING is not active.  This fix adds an early
exit to avoid the unnecessary memory allocation.

Signed-off-by: Harithamma D <harithamma.d@ibm.com>
---

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1744%2FHarithaIBM%2FmemFootprintFix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1744/HarithaIBM/memFootprintFix-v2
Pull-Request: https://github.com/git/git/pull/1744
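
The early-exit pattern described in the commit message can be sketched
in isolation as follows. This is a minimal illustration, not Git's
code: demo_trace_key, demo_trace_want(), demo_dump(),
demo_bytes_allocated and the per-byte output format are all
hypothetical stand-ins, and the "enabled" flag is set by the caller
rather than read from an environment variable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a trace key. In Git the enabled state is
 * derived from an environment variable; here the caller sets it. */
struct demo_trace_key {
	int enabled;
};

/* Running total of bytes requested by demo_dump(), for illustration. */
static size_t demo_bytes_allocated;

/* Hypothetical analogue of a "is tracing wanted?" check. */
static int demo_trace_want(const struct demo_trace_key *key)
{
	return key->enabled;
}

/* Upper bound on the formatted size of one entry (illustrative). */
enum { DEMO_BYTES_PER_ENTRY = 32 };

/*
 * Build a per-byte dump of buf. The output is many times larger than
 * the input, so the check comes first: when tracing is off, return
 * NULL before any allocation happens. The caller frees the result.
 */
static char *demo_dump(const struct demo_trace_key *key,
		       const unsigned char *buf, size_t len)
{
	char *out;
	size_t i, pos = 0;

	assert(key != NULL);

	if (!demo_trace_want(key))
		return NULL; /* early exit: nothing allocated */

	out = malloc(len * DEMO_BYTES_PER_ENTRY + 1);
	if (!out)
		return NULL;
	demo_bytes_allocated += len * DEMO_BYTES_PER_ENTRY + 1;

	/* Each entry is at most 25 characters, so it is never truncated. */
	for (i = 0; i < len; i++)
		pos += (size_t)snprintf(out + pos, DEMO_BYTES_PER_ENTRY,
					"%6zu: %02x\n", i, (unsigned)buf[i]);
	out[pos] = '\0';
	return out;
}
```

With the flag cleared, demo_dump() returns NULL and
demo_bytes_allocated stays unchanged; placing the check after the
formatting loop instead would pay the full allocation cost on every
call, which is the situation the patch addresses.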

Range-diff vs v1:

 1:  51c02f58fd6 ! 1:  500b7eacf2a Fix to avoid high memory footprint
     @@ Metadata
       ## Commit message ##
          Fix to avoid high memory footprint
      
     -    This fix avoids high memory footprint when
     -    adding files that require conversion.
     -    Git has a trace_encoding routine that prints trace
     -    output when GIT_TRACE_WORKING_TREE_ENCODING=1 is
     -    set. This environment variable is used to debug
     -    the encoding contents.
     -    When a 40MB file is added, it requests close to
     -    1.8GB of storage from xrealloc which can lead
     -    to out of memory errors.
     -    However, the check for
     -    GIT_TRACE_WORKING_TREE_ENCODING is done after
     -    the string is allocated. This resolves high
     -    memory footprints even when
     -    GIT_TRACE_WORKING_TREE_ENCODING is not active.
     -    This fix adds an early exit to avoid the
     -    unnecessary memory allocation.
     +    This fix avoids high memory footprint when adding files that require
     +    conversion.  Git has a trace_encoding routine that prints trace output
     +    when GIT_TRACE_WORKING_TREE_ENCODING=1 is set. This environment
     +    variable is used to debug the encoding contents.  When a 40MB file is
     +    added, it requests close to 1.8GB of storage from xrealloc which can
     +    lead to out of memory errors.  However, the check for
     +    GIT_TRACE_WORKING_TREE_ENCODING is done after the string is allocated.
     +    This resolves high memory footprints even when
     +    GIT_TRACE_WORKING_TREE_ENCODING is not active.  This fix adds an early
     +    exit to avoid the unnecessary memory allocation.
      
     -    Signed-off-by: Haritha D <harithamma.d@ibm.com>
     +    Signed-off-by: Harithamma D <harithamma.d@ibm.com>
      
       ## convert.c ##
      @@ convert.c: static void trace_encoding(const char *context, const char *path,
       	struct strbuf trace = STRBUF_INIT;
       	int i;
       
     -+	// If tracing is not on, exit early to avoid high memory footprint
     -+	if (!trace_pass_fl(&coe)) {
     ++	if (!trace_want(&coe))
      +		return;
     -+	}
      +
       	strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, encoding);
       	for (i = 0; i < len && buf; ++i) {


 convert.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/convert.c b/convert.c
index d8737fe0f2d..c4ddc4de81b 100644
--- a/convert.c
+++ b/convert.c
@@ -324,6 +324,9 @@ static void trace_encoding(const char *context, const char *path,
 	struct strbuf trace = STRBUF_INIT;
 	int i;
 
+	if (!trace_want(&coe))
+		return;
+
 	strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, encoding);
 	for (i = 0; i < len && buf; ++i) {
 		strbuf_addf(

base-commit: 557ae147e6cdc9db121269b058c757ac5092f9c9
-- 
gitgitgadget

Thread overview: 15+ messages
2024-07-16  8:03 [PATCH] Fix to avoid high memory footprint Haritha  via GitGitGadget
2024-07-17  6:16 ` Jeff King
2024-07-24 11:45 ` Haritha  via GitGitGadget [this message]
2024-07-24 21:41   ` [PATCH v2] " Junio C Hamano
2024-07-24 22:16   ` Jeff King
2024-07-26  6:27   ` [PATCH v3] " Haritha  via GitGitGadget
2024-07-26  9:55     ` Torsten Bögershausen
2024-07-26 14:00     ` [PATCH v4] convert: " Haritha  via GitGitGadget
2024-07-30  3:42       ` [PATCH v5] convert: return early when not tracing Haritha  via GitGitGadget
2024-07-31  2:42         ` Junio C Hamano
2024-07-31  9:32           ` Haritha D
2024-07-31 13:33         ` [PATCH v6] " Haritha  via GitGitGadget
2024-07-26 15:06     ` [PATCH v3] Fix to avoid high memory footprint Junio C Hamano
2024-07-26 15:12     ` Junio C Hamano
2024-07-30  3:41       ` Haritha D
