From: pclouds@gmail.com
To: git@vger.kernel.org
Cc: weigelt@metux.de, spearce@spearce.org, jrnieder@gmail.com,
	Matthieu.Moy@grenoble-inp.fr, raa.lkml@gmail.com,
	"Junio C Hamano" <gitster@pobox.com>,
	judge.packham@gmail.com,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH] git.txt: document limitations on non-typical repos (and hints)
Date: Wed,  6 Oct 2010 21:23:11 +0700
Message-ID: <4cac8659.0541730a.0ef0.3ff9@mx.google.com>
In-Reply-To: <1286283653-22616-1-git-send-email-pclouds@gmail.com>

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

---
 Revised version. I dropped shallow clone because it does not really
 relate to performance.
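
 For anyone who wants to try the hints from the new section, here is a
 minimal sketch in a throwaway repository. The directory and file names
 (docs/, src/, manual.txt, main.c) are made up for illustration. The
 commands are the ones described in git-read-tree(1), section 'Sparse
 checkout', and git-update-index(1). Whether a pathspec really limits
 the lstat(2) calls depends on the command and the case, as the patch
 text says.

```shell
#!/bin/sh
# Sketch only: the repository layout below is hypothetical.
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

mkdir -p docs src
echo "manual" > docs/manual.txt
echo "int main(void) { return 0; }" > src/main.c
git add .
git commit -q -m "initial import"

# Hint 1: sparse checkout. Keep only src/ in the working tree;
# entries outside the pattern get the skip-worktree bit and their
# files are removed from the working tree.
git config core.sparseCheckout true
mkdir -p .git/info
echo "src/" > .git/info/sparse-checkout
git read-tree -m -u HEAD

# Hint 2: assume-unchanged. Tell Git not to check this file for
# modifications during index refresh.
git update-index --assume-unchanged src/main.c

# Inspect the per-entry flags: "S" marks skip-worktree entries and a
# lowercase letter marks assume-unchanged entries.
git ls-files -v

# Hint 3: name the directory you care about. In some cases this
# avoids lstat(2) calls outside of it.
git diff -- src
```

 To undo the second hint, run "git update-index --no-assume-unchanged
 src/main.c".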

 Documentation/git.txt |   41 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index dd57bdc..129947f 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -729,6 +729,47 @@ The index is also capable of storing multiple entries (called "stages")
 for a given pathname.  These stages are used to hold the various
 unmerged version of a file when a merge is in progress.
 
+Performance concerns
+--------------------
+
+Git is written with performance in mind and it works extremely well
+with its typical repositories (i.e. source code repositories, with
+a moderate number of small text files, possibly with long history).
+Non-typical repositories (a lot of files, or very large files...)
+may experience mild performance degradation. This section describes
+how Git behaves in such repositories and how to reduce the impact.
+
+If a repository has a large number of files (~50k files or more) but
+you only need a few of them present in the working tree, you can use
+sparse checkout (see linkgit:git-read-tree[1], section 'Sparse
+checkout'). If you need all of them present in the working tree, but
+you know in advance that only a few of them may be modified, consider
+using the assume-unchanged bit (see linkgit:git-update-index[1]). This
+helps reduce the number of lstat(2) calls.
+
+Git uses lstat(2) to detect changes in the working tree, one call for
+each tracked file, in what is called "index refresh". A large number of
+lstat(2) calls may create a small delay for many commands, especially
+on systems with slow lstat(2). In some cases you can reduce the number
+of lstat(2) calls by specifying which directories you are interested
+in, so that no lstat(2) call outside them is needed. The following
+commands are, however, known to do a full index refresh in some cases:
+linkgit:git-commit[1], linkgit:git-status[1], linkgit:git-diff[1],
+linkgit:git-reset[1], linkgit:git-checkout[1], linkgit:git-merge[1].
+
+Some commands need the entire file content in memory to process it.
+Files whose size is a significant portion of physical RAM may
+affect performance. You may want to avoid using the following
+commands on such large files if possible:
+
+* All checkout commands (linkgit:git-checkout[1],
+  linkgit:git-checkout-index[1], linkgit:git-read-tree[1],
+  linkgit:git-clone[1]...)
+* All diff-related commands (linkgit:git-diff[1],
+  linkgit:git-log[1] with diff, linkgit:git-show[1] on commits...)
+* All commands that need file conversion processing (see
+  linkgit:gitattributes[5])
+
 Authors
 -------
 * git's founding father is Linus Torvalds <torvalds@osdl.org>.
-- 
1.7.0.2.445.gcbdb3
