From: Thomas Gummerer <t.gummerer@gmail.com>
To: git@jeffhostetler.com
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v11 2/5] p0006-read-tree-checkout: perf test to time read-tree
Date: Tue, 18 Apr 2017 22:40:25 +0100
Message-ID: <20170418214025.GA4989@hank>
In-Reply-To: <20170417213734.55373-3-git@jeffhostetler.com>

On 04/17, git@jeffhostetler.com wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Created t/perf/repos/many-files.sh to generate large, but
> artificial repositories.
> 
> Created t/perf/p0006-read-tree-checkout.sh to measure
> performance on various read-tree, checkout, and update-index
> operations.  This test can run using either artificial repos
> described above or normal repos.
> 
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  t/perf/p0006-read-tree-checkout.sh |  67 ++++++++++++++++++++++
>  t/perf/repos/.gitignore            |   1 +
>  t/perf/repos/many-files.sh         | 110 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 178 insertions(+)
>  create mode 100755 t/perf/p0006-read-tree-checkout.sh
>  create mode 100644 t/perf/repos/.gitignore
>  create mode 100755 t/perf/repos/many-files.sh
> 
> diff --git a/t/perf/p0006-read-tree-checkout.sh b/t/perf/p0006-read-tree-checkout.sh
> new file mode 100755
> index 0000000..78cc23f
> --- /dev/null
> +++ b/t/perf/p0006-read-tree-checkout.sh
> @@ -0,0 +1,67 @@
> +#!/bin/sh
> +#
> +# This test measures the performance of various read-tree
> +# and checkout operations.  It is primarily interested in
> +# the algorithmic costs of index operations and recursive
> +# tree traversal -- and NOT disk I/O on thousands of files.
> +
> +test_description="Tests performance of read-tree"
> +
> +. ./perf-lib.sh
> +
> +test_perf_default_repo

I like that it's now possible to use a real-world repository instead
of being forced to use a synthetic one :)

Is there a reason for this being test_perf_default_repo instead of
test_perf_large_repo?  It seems like generating a large repo is what
you are doing with repos/many-files.sh.

> +
> +# If the test repo was generated by ./repos/many-files.sh
> +# then we know something about the data shape and branches,
> +# so we can isolate testing to the ballast-related commits
> +# and setup sparse-checkout so we don't have to populate
> +# the ballast files and directories.
> +#
> +# Otherwise, we make some general assumptions about the
> +# repo and consider the entire history of the current
> +# branch to be the ballast.
> +
> +test_expect_success "setup repo" '
> +	if git rev-parse --verify refs/heads/p0006-ballast^{commit}
> +	then
> +		echo Assuming synthetic repo from many-files.sh
> +		git branch br_base            master
> +		git branch br_ballast         p0006-ballast^
> +		git branch br_ballast_alias   p0006-ballast^
> +		git branch br_ballast_plus_1  p0006-ballast
> +		git config --local core.sparsecheckout 1
> +		cat >.git/info/sparse-checkout <<-EOF
> +		/*
> +		!ballast/*
> +		EOF
> +	else
> +		echo Assuming non-synthetic repo...
> +		git branch br_base            $(git rev-list HEAD | tail -n 1)
> +		git branch br_ballast         HEAD^ || error "no ancestor commit from current head"
> +		git branch br_ballast_alias   HEAD^
> +		git branch br_ballast_plus_1  HEAD
> +	fi &&
> +	git checkout -q br_ballast &&
> +	nr_files=$(git ls-files | wc -l)
> +'
> +
> +test_perf "read-tree br_base br_ballast ($nr_files)" '
> +	git read-tree -m br_base br_ballast -n
> +'
> +
> +test_perf "switch between br_base br_ballast ($nr_files)" '
> +	git checkout -q br_base &&
> +	git checkout -q br_ballast
> +'
> +
> +test_perf "switch between br_ballast br_ballast_plus_1 ($nr_files)" '
> +	git checkout -q br_ballast_plus_1 &&
> +	git checkout -q br_ballast
> +'
> +
> +test_perf "switch between aliases ($nr_files)" '
> +	git checkout -q br_ballast_alias &&
> +	git checkout -q br_ballast
> +'
> +
> +test_done
> diff --git a/t/perf/repos/.gitignore b/t/perf/repos/.gitignore
> new file mode 100644
> index 0000000..72e3dc3
> --- /dev/null
> +++ b/t/perf/repos/.gitignore
> @@ -0,0 +1 @@
> +gen-*/
> diff --git a/t/perf/repos/many-files.sh b/t/perf/repos/many-files.sh
> new file mode 100755
> index 0000000..5a1d25e
> --- /dev/null
> +++ b/t/perf/repos/many-files.sh
> @@ -0,0 +1,110 @@
> +#!/bin/sh
> +## Generate test data repository using the given parameters.
> +## When omitted, we create "gen-many-files-d-w-f.git".
> +##
> +## Usage: [-r repo] [-d depth] [-w width] [-f files]
> +##
> +## -r repo: path to the new repo to be generated
> +## -d depth: the depth of sub-directories
> +## -w width: the number of sub-directories at each level
> +## -f files: the number of files created in each directory
> +##
> +## Note that all files will have the same SHA-1 and each
> +## directory at a level will have the same SHA-1, so we
> +## will potentially have a large index, but not a large
> +## ODB.
> +##
> +## Ballast will be created under "ballast/".

I think comments in the git source should start with only a single
'#', as they already do in p0006.
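
As a rough illustration only (not part of the patch, and
normalize_comments is a made-up helper name), a one-off sed filter
along these lines could turn the leading '##' markers into a single
'#' while leaving the '#!/bin/sh' line and existing single-'#'
comments alone:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a leading "##" comment marker to "#".
# Lines that do not start with "##" (including "#!/bin/sh") pass
# through unchanged.
normalize_comments () {
	sed -e 's/^##/#/'
}

# Example input modelled on the header of many-files.sh.
printf '%s\n' \
	'#!/bin/sh' \
	'## Usage: [-r repo] [-d depth] [-w width] [-f files]' \
	'##' \
	'## Ballast will be created under "ballast/".' |
normalize_comments
```

This only touches markers at the very start of a line; indented
comments, if any, would need a slightly different pattern.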

[...]
