From: Taylor Blau <me@ttaylorr.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
man dog <dogman888888@gmail.com>
Subject: Re: [PATCH 1/3] line-log: free diff queue when processing non-merge commits
Date: Wed, 2 Nov 2022 20:20:21 -0400
Message-ID: <Y2MJRRfwG7rSp6Ra@nand.local>
In-Reply-To: <20221102220142.574890-2-szeder.dev@gmail.com>
On Wed, Nov 02, 2022 at 11:01:40PM +0100, SZEDER Gábor wrote:
> When processing a non-merge commit, the line-level log first asks the
> tree-diff machinery whether any of the files in the given line ranges
> were modified between the current commit and its parent, and if some
> of them were, then it loads the contents of those files from both
> commits to see whether their line ranges were modified and/or need to
> be adjusted. Alas, it doesn't free() the diff queue holding the
> results of that query and the contents of those files once it's done.
> This can add up to a substantial amount of leaked memory, especially
> when the file in question is big and is frequently modified: a user
> reported "Out of memory, malloc failed" errors with a 2MB text file
> that was modified ~2800 times [1] (I estimate the leak would use up
> almost 11GB memory in that case).
>
> Free that diff queue to plug this memory leak. However, instead of
> simply open-coding the necessary three lines, add them as a helper
> function to the diff API, because it will be useful elsewhere as well.
Nicely explained.
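
For what it's worth, the ~11GB estimate seems to check out if I assume
that each of the ~2800 modifying commits leaks both the pre- and
post-image of the 2MB file. A rough sketch of that arithmetic follows;
the helper name is made up, and the "two copies per commit" factor is my
reading of the description above, not something I measured:

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: estimate the bytes leaked
 * when every commit that touches the file leaks the file contents from
 * both the commit and its parent (assumed factor of 2).
 */
static uint64_t estimated_leak_bytes(uint64_t file_size, uint64_t n_commits)
{
	return 2 * file_size * n_commits;
}
```

With a 2 MiB file and 2800 commits that comes to roughly 10.9 GiB, which
lines up with "almost 11GB".
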
> ---
>  diff.c     | 7 +++++++
>  diffcore.h | 1 +
>  line-log.c | 1 +
>  3 files changed, 9 insertions(+)
And all looks reasonable here, good...
> diff --git a/diff.c b/diff.c
> index 35e46dd968..ef94175163 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5773,6 +5773,13 @@ void diff_free_filepair(struct diff_filepair *p)
>  	free(p);
>  }
>
> +void diff_free_queue(struct diff_queue_struct *q)
> +{
> +	for (int i = 0; i < q->nr; i++)
> +		diff_free_filepair(q->queue[i]);
> +	free(q->queue);
> +}
Though I wonder, should diff_free_queue() be a noop when q is NULL? The
caller in process_ranges_ordinary_commit() doesn't care, of course,
since q is always non-NULL there.

But if we're making it part of the diff API, we should probably err on
the side of flexibility.
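
To illustrate what I mean, here is a minimal sketch of a NULL-tolerant
variant. The struct definitions and free_filepair_sketch() are
simplified stand-ins so that the snippet compiles on its own (the real
ones live in diffcore.h and diff.c), and the "_sketch" names are
hypothetical:

```c
#include <stdlib.h>

/* Stand-in types for illustration only, not the real diffcore.h ones. */
struct diff_filepair {
	char *path; /* hypothetical payload owned by the filepair */
};

struct diff_queue_struct {
	struct diff_filepair **queue;
	int alloc;
	int nr;
};

/* Simplified stand-in for diff_free_filepair(). */
static void free_filepair_sketch(struct diff_filepair *p)
{
	free(p->path);
	free(p);
}

/* Same loop as in the patch, but a NULL queue is silently ignored. */
static void diff_free_queue_sketch(struct diff_queue_struct *q)
{
	if (!q)
		return;
	for (int i = 0; i < q->nr; i++)
		free_filepair_sketch(q->queue[i]);
	free(q->queue);
}
```

That would mirror free(), which also accepts NULL, so callers holding an
optional queue would not need a check of their own.
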
Thanks,
Taylor