git.vger.kernel.org archive mirror
From: "Erik Elfström" <erik.elfstrom@gmail.com>
To: git@vger.kernel.org
Cc: "Erik Elfström" <erik.elfstrom@gmail.com>
Subject: [PATCH 0/3] Improving performance of git clean
Date: Mon,  6 Apr 2015 13:48:21 +0200
Message-ID: <1428320904-12366-1-git-send-email-erik.elfstrom@gmail.com>

This series addresses a performance issue of git clean previously
discussed here:
http://thread.gmane.org/gmane.comp.version-control.git/265560/focus=265560
and here:
http://thread.gmane.org/gmane.comp.version-control.git/266777/focus=266777

The issue manifests when cleaning a large number of untracked
directories. In my case the scenario was triggered by a test suite
that runs inside the repository and creates one directory per test,
resulting in a build directory with ~100000 subdirectories that need
to be cleaned. In some extreme cases, clean times of more than 1h
have been observed.

With this series, the time to clean an untracked directory containing
100000 subdirectories goes from 61s to 1.7s.

The main change is to switch the repository check in
clean.c:remove_dirs from using refs.c:resolve_gitlink_ref to using
setup.c:is_git_directory.

One potential issue is that is_git_directory contains the following check:

	if (getenv(DB_ENVIRONMENT)) {
		if (access(getenv(DB_ENVIRONMENT), X_OK))
			return 0;
	}

I'm not sure how this will affect this use case (checking whether some
other, nested directory is a git repo). Please give some thought to
this when reviewing.
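
To make the concern concrete, here is a simplified, standalone sketch
of this kind of check. It is not the code from setup.c: the names
has_entry and looks_like_git_dir are hypothetical, the objdir_override
parameter stands in for getenv(DB_ENVIRONMENT), and the HEAD test only
checks readability rather than validating the file's contents. It
assumes the check boils down to probing for an objects directory, a
refs directory and a HEAD file.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: 1 if "<dir>/<name>" passes access(2) with mode. */
static int has_entry(const char *dir, const char *name, int mode)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

	if (n < 0 || (size_t)n >= sizeof(path))
		return 0; /* path too long: treat as "not present" */
	return access(path, mode) == 0;
}

/*
 * Simplified stand-in for an "is this a git directory?" test.
 * objdir_override plays the role of getenv(DB_ENVIRONMENT): when it
 * is non-NULL, it is probed INSTEAD of "<suspect>/objects".
 */
static int looks_like_git_dir(const char *suspect, const char *objdir_override)
{
	assert(suspect != NULL);

	if (objdir_override) {
		if (access(objdir_override, X_OK))
			return 0;
	} else if (!has_entry(suspect, "objects", X_OK)) {
		return 0;
	}
	if (!has_entry(suspect, "refs", X_OK))
		return 0;
	return has_entry(suspect, "HEAD", R_OK);
}
```

In this sketch, passing a non-NULL objdir_override means the nested
candidate's own objects directory is never inspected: the answer then
depends on a path that belongs to the outer environment, which is the
behavior questioned above.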

Jeff King also expressed concerns that we may have similar performance
issues in other commands and that it could be good to unify these "is
this a repo?"-checks. This series only attempts to solve the git-clean
case.

Erik Elfström (3):
  t7300: add tests to document behavior of clean and nested git
  p7300: added performance tests for clean
  clean: improve performance when removing lots of directories

 builtin/clean.c       | 23 ++++++++++++---
 t/perf/p7300-clean.sh | 37 +++++++++++++++++++++++
 t/t7300-clean.sh      | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p7300-clean.sh

-- 
2.4.0.rc0.37.ga3b75b3


Thread overview: 14+ messages
2015-04-06 11:48 Erik Elfström [this message]
2015-04-06 11:48 ` [PATCH 1/3] t7300: add tests to document behavior of clean and nested git Erik Elfström
2015-04-06 22:06   ` Eric Sunshine
2015-04-07 19:27     ` erik elfström
2015-04-07 19:40   ` Eric Sunshine
2015-04-07 19:53     ` Torsten Bögershausen
2015-04-06 11:48 ` [PATCH 2/3] p7300: added performance tests for clean Erik Elfström
2015-04-06 20:40   ` Torsten Bögershausen
2015-04-06 22:09     ` Eric Sunshine
2015-04-07 19:35       ` erik elfström
2015-04-06 11:48 ` [PATCH 3/3] clean: improve performance when removing lots of directories Erik Elfström
2015-04-06 22:10   ` Eric Sunshine
2015-04-07 19:55     ` erik elfström
2015-04-08 21:29       ` Eric Sunshine
