git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Alban Gruin <alban.gruin@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC PATCH 0/4] name-rev: improve memory usage
Date: Fri, 1 Mar 2019 14:39:12 -0500
Message-ID: <20190301193912.GA28195@sigill.intra.peff.net>
In-Reply-To: <c496ca1a-d493-64cf-9e9d-d1aa189bd33d@gmail.com>

On Fri, Mar 01, 2019 at 08:14:26PM +0100, Alban Gruin wrote:

> > diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> > index f1cb45c227..7aaa86f1c0 100644
> > --- a/builtin/name-rev.c
> > +++ b/builtin/name-rev.c
> > @@ -431,6 +431,8 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
> >  		OPT_END(),
> >  	};
> >  
> > +	save_commit_buffer = 0;
> > +
> [...]
> 
> Unfortunately this does not work in all cases, apparently.  On my git
> copy, I have 3 origins.  If I run this:
> 
> 	git log --graph --oneline --abbrev=-1 -5 | git name-rev --stdin
> 
> With or without your change, it uses 3GB of RAM.  With this series, it
> uses 25MB of RAM.

Sorry if I was unclear. I meant to try that _in addition_ to your
changes. It helps by avoiding keeping the useless commit-object buffers
in RAM as we traverse. But the most it can save is the total
uncompressed bytes of all commit objects. I.e., in git.git:

  $ git cat-file --batch-check='%(objectsize) %(objecttype)' --batch-all-objects |
    grep commit |
    perl -alne '$total += $F[0]; END { print $total }'
  74678114

or around 70MB. In linux.git, it's more like 700MB.
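
As a rough cross-check of that figure, the same total can be computed with
awk in place of grep and perl. This is only a sketch: the helper name
sum_commit_bytes is made up for illustration and is not part of git. It
assumes input lines in the "%(objectsize) %(objecttype)" format used above.

```shell
# Hypothetical helper (not part of git): sum the first field of every
# "<size> <type>" line on stdin whose type is exactly "commit".
sum_commit_bytes() {
  # "total + 0" forces numeric output, so empty input prints 0.
  awk '$2 == "commit" { total += $1 } END { print total + 0 }'
}
```

It would be fed the same batch-check output as the pipeline above:

  $ git cat-file --batch-check='%(objectsize) %(objecttype)' --batch-all-objects |
    sum_commit_bytes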

But in your examples, the problem is the inefficiencies in name-rev's
algorithm, and you're not actually traversing that many commits. So I
think you'd want to turn off save_commit_buffer as an extra patch in
your series. It may or may not be a big win for any given case, but it's
quite easy to do.

-Peff

Thread overview: 18+ messages
2019-03-01 17:50 [RFC PATCH 0/4] name-rev: improve memory usage Alban Gruin
2019-03-01 17:50 ` [RFC PATCH 1/4] name-rev: improve name_rev() " Alban Gruin
2019-03-01 18:00   ` Eric Sunshine
2019-03-01 18:44   ` Jeff King
2019-03-02 21:28   ` Johannes Schindelin
2019-03-01 17:50 ` [RFC PATCH 2/4] commit-list: add a function to check if a commit is in a list Alban Gruin
2019-03-01 17:50 ` [RFC PATCH 3/4] name-rev: check if a commit should be named before naming it Alban Gruin
2019-03-01 18:05   ` Eric Sunshine
2019-03-01 18:22     ` Alban Gruin
2019-03-01 18:37       ` Jeff King
2019-03-01 17:50 ` [RFC PATCH 4/4] name-rev: avoid naming from a ref if it’s not a descendant of any commit Alban Gruin
2019-03-01 18:07   ` Eric Sunshine
2019-03-03 19:33   ` Christian Couder
2019-03-03 19:46     ` Christian Couder
2019-03-03 20:27     ` Alban Gruin
2019-03-01 18:42 ` [RFC PATCH 0/4] name-rev: improve memory usage Jeff King
2019-03-01 19:14   ` Alban Gruin
2019-03-01 19:39     ` Jeff King [this message]
