Git development
 help / color / mirror / Atom feed
* Re: [PATCH v4 0/4] Preliminary patches before git-std-lib
From: phillip.wood123 @ 2023-10-10 10:05 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano
In-Reply-To: <cover.1696021277.git.jonathantanmy@google.com>

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> Calvin will be away for a few weeks and I'll be handling the git-std-lib
> effort in the meantime. My goals will be:
> 
> - Get the preliminary patches in Calvin's patch set (patches 1-4) merged
> first.
> 
> - Updating patches 5-6 based on reviewer feedback (including my
> feedback). I have several aims including reducing or eliminating the
> need for the GIT_STD_LIB preprocessor variable, and making stubs a test-
> only concern (I think Phillip has some similar ideas [1] but I haven't
> looked at their repo on GitHub yet).

It sounds like we're thinking along similar lines, do feel free get in 
touch on or off the list if you want to ask anything about those patches 
I pushed to github.

> [1] https://lore.kernel.org/git/98f3edcf-7f37-45ff-abd2-c0038d4e0589@gmail.com/
> 
> This patch set is in service of the first goal. Because the libification
> patches are no longer included in this patch set, I have rewritten the
> commit messages to justify the patches in terms of code organization.
> There are no changes in the code itself. Also, I have retained Calvin's
> name as the author.

I agree it makes sense to get the preliminary patches merged on their 
own. I think the argument that they reduce the scope of includes is a 
reasonable justification on its own. I've left a couple of comments but 
they're looking pretty good.

Best Wishes

Phillip

^ permalink raw reply

* Re: [PATCH v4 4/4] parse: separate out parsing functions from config.h
From: phillip.wood123 @ 2023-10-10 10:00 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano
In-Reply-To: <5d9f0b3de08ab8541482b9b640db06b6d3000b86.1696021277.git.jonathantanmy@google.com>

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> diff --git a/parse.h b/parse.h
> new file mode 100644
> index 0000000000..07d2193d69
> --- /dev/null
> +++ b/parse.h
> @@ -0,0 +1,20 @@
> +#ifndef PARSE_H
> +#define PARSE_H
> +
> +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);

Previously this function was private to config.c, now it needs to be 
public because it is still called by 
git_config_get_expiry_date_in_days(). As this is essentially an internal 
helper for git_parse_int() and friends it is a bit unfortunate that it 
is now public. Perhaps we should change 
git_config_get_expiry_date_in_days() to call git_parse_int() instead.
Then we can keep git_parse_signed() and git_parse_unsigned() private to 
parse.c.

> +int git_parse_ssize_t(const char *, ssize_t *);
> +int git_parse_ulong(const char *, unsigned long *);
> +int git_parse_int(const char *value, int *ret);
> +int git_parse_int64(const char *value, int64_t *ret);

This was previously private but I think it makes sense for it to be 
publicly available.

> +/**
> + * Same as `git_config_bool`, except that it returns -1 on error rather
> + * than dying.
> + */
> +int git_parse_maybe_bool(const char *);
> +int git_parse_maybe_bool_text(const char *value);

This used to be private to config.c and now has callers in parse.c and 
config.c. We should make it clear that non-config code is likely to want 
git_parse_maybe_bool() rather than this function.

Best Wishes

Phillip

^ permalink raw reply

* Re: [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn()
From: phillip.wood123 @ 2023-10-10  9:59 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano
In-Reply-To: <c9e7cd78576527571fd70b953e340b5bdd196221.1696021277.git.jonathantanmy@google.com>

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> From: Calvin Wan <calvinwan@google.com>
> 
> remove_or_warn() is only used by entry.c and apply.c, but it is
> currently declared and defined in wrapper.{h,c}, so it has a scope much
> greater than it needs. This needlessly large scope also causes wrapper.c
> to need to include object.h, when this file is largely unconcerned with
> Git objects.
> 
> Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
> to it, since it already includes entry.h for another reason.

This looks good. On a related note wrapper.c includes repository.h but 
does use anything declared in that header.

Best Wishes

Phillip

^ permalink raw reply

* Re: [PATCH v8 1/3] unit tests: Add a project plan document
From: Oswald Buddenhagen @ 2023-10-10  8:57 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: git, phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <81c5148a1267b8f9ce432a950340f0fa16b4d773.1696889530.git.steadmon@google.com>

On Mon, Oct 09, 2023 at 03:21:20PM -0700, Josh Steadmon wrote:
>+=== Comparison
>+
>+[format="csv",options="header",width="33%"]
>+|=====
>+Framework,"<<license,License>>","<<vendorable-or-ubiquitous,Vendorable or ubiquitous>>","<<maintainable-extensible,Maintainable / extensible>>","<<major-platform-support,Major platform support>>","<<tap-support,TAP support>>","<<diagnostic-output,Diagnostic output>>","<<runtime--skippable-tests,Runtime- skippable tests>>","<<parallel-execution,Parallel execution>>","<<mock-support,Mock support>>","<<signal-error-handling,Signal & error handling>>","<<project-kloc,Project KLOC>>","<<adoption,Adoption>>"
>
the redundancy seems unnecessary; asciidoc should automatically use each 
target's section title as the xreflabel.

>+https://lore.kernel.org/git/c902a166-98ce-afba-93f2-ea6027557176@gmail.com/[Custom Git impl.],[lime-background]#GPL v2#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,1,0
>+https://github.com/silentbicycle/greatest[Greatest],[lime-background]#ISC#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,3,1400
>+https://github.com/Snaipe/Criterion[Criterion],[lime-background]#MIT#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,19,1800
>+https://github.com/rra/c-tap-harness/[C TAP],[lime-background]#Expat#,[lime-background]#True#,[yellow-background]#Partial#,[yellow-background]#Partial#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,4,33
>+https://libcheck.github.io/check/[Check],[lime-background]#LGPL v2.1#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,[lime-background]#True#,17,973
>+|=====
>+
i find this totally unreadable in its raw form.
consider user-defined document-attributes for specific cell contents.
externalizing the urls would probably help as well (i'm not sure how to 
do that best).

regards

^ permalink raw reply

* Re: [PATCH 0/4] Performance improvement & cleanup in loose ref iteration
From: Patrick Steinhardt @ 2023-10-10  7:21 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Victoria Dye via GitGitGadget, git
In-Reply-To: <28ae03f5-7091-d3f3-8a70-56aba6639640@github.com>

[-- Attachment #1: Type: text/plain, Size: 8706 bytes --]

On Mon, Oct 09, 2023 at 02:49:14PM -0700, Victoria Dye wrote:
> Patrick Steinhardt wrote:
> > On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> >> While investigating ref iteration performance in builtins like
> >> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> >>
> >> The first patch tweaks the logic around prefix matching in
> >> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> >> actually match a given prefix. The unnecessary iteration doesn't seem to be
> >> causing any bugs in the ref iteration commands that I've tested, but it
> >> doesn't hurt to be more precise (and it helps with some other patches I'm
> >> working on ;) ).
> >>
> >> The next three patches update how 'loose_fill_ref_dir' determines the type
> >> of ref cache entry to create (directory or regular). On platforms that
> >> include d_type information in 'struct dirent' (as far as I can tell, all
> >> except NonStop & certain versions of Cygwin), this allows us to skip calling
> >> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> >> by about 20%.
> > 
> > I've done a small set of benchmarks with my usual test repositories,
> > which is linux.git with a bunch of references added. The repository
> > comes in four sizes:
> > 
> > - small: 50k references
> > - medium: 500k references
> > - high:  1.1m references
> > - huge: 12m references
> > 
> > Unfortunately, I couldn't really reproduce the performance improvements.
> > In fact, the new version runs consistently a tiny bit slower than the
> > old version:
> > 
> >     # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> > 
> >     Benchmark 1: git for-each-ref (revision=old,refcount=small)
> >       Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
> >       Range (min … max):   134.8 ms … 136.9 ms    3 runs
> > 
> >     Benchmark 2: git for-each-ref (revision=old,refcount=medium)
> >       Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
> >       Range (min … max):   821.1 ms … 825.2 ms    3 runs
> > 
> >     Benchmark 3: git for-each-ref (revision=old,refcount=high)
> >       Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
> >       Range (min … max):    1.944 s …  1.973 s    3 runs
> > 
> >     # New version, which is your tip.
> > 
> >     Benchmark 4: git for-each-ref (revision=old,refcount=huge)
> >       Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
> >       Range (min … max):   16.760 s … 16.869 s    3 runs
> > 
> >     Benchmark 5: git for-each-ref (revision=new,refcount=small)
> >       Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
> >       Range (min … max):   135.8 ms … 136.2 ms    3 runs
> > 
> >     Benchmark 6: git for-each-ref (revision=new,refcount=medium)
> >       Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
> >       Range (min … max):   814.2 ms … 854.5 ms    3 runs
> > 
> >     Benchmark 7: git for-each-ref (revision=new,refcount=high)
> >       Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
> >       Range (min … max):    1.952 s …  1.978 s    3 runs
> > 
> >     Benchmark 8: git for-each-ref (revision=new,refcount=huge)
> >       Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
> >       Range (min … max):   16.910 s … 16.983 s    3 runs
> > 
> >     Summary
> >       git for-each-ref (revision=old,refcount=small) ran
> >         1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
> >         6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
> >         6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
> >        14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
> >        14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
> >       124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
> >       125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> > 
> > The performance regression isn't all that concerning, but it makes me
> > wonder why I see things becoming slower rather than faster. My guess is
> > that this is because all my test repositories are well-packed and don't
> > have a lot of loose references. But I just wanted to confirm how you
> > benchmarked your change and what the underlying shape of your test repo
> > was.
> 
> I ran my benchmark on my (Intel) Mac with a test repository (single commit,
> one file) containing:
> 
> - 10k refs/heads/ references
> - 10k refs/tags/ references
> - 10k refs/special/ references 
> 
> All refs in the repository are loose. My Mac has historically been somewhat
> slow and inconsistent when it comes to perf testing, though, so I re-ran the
> benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
> by at least 10 iterations per test):
> 
> ---
> 
> Benchmark 1: git for-each-ref (revision=old,refcount=3k)
>   Time (mean ± σ):      40.6 ms ±   3.9 ms    [User: 13.2 ms, System: 27.1 ms]
>   Range (min … max):    37.2 ms …  59.1 ms    76 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 2: git for-each-ref (revision=new,refcount=3k)
>   Time (mean ± σ):      38.7 ms ±   4.4 ms    [User: 13.8 ms, System: 24.5 ms]
>   Range (min … max):    35.1 ms …  57.2 ms    71 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 3: git for-each-ref (revision=old,refcount=30k)
>   Time (mean ± σ):     419.4 ms ±  43.9 ms    [User: 136.4 ms, System: 274.1 ms]
>   Range (min … max):   385.1 ms … 528.7 ms    10 runs
>  
> Benchmark 4: git for-each-ref (revision=new,refcount=30k)
>   Time (mean ± σ):     390.4 ms ±  27.2 ms    [User: 133.1 ms, System: 251.6 ms]
>   Range (min … max):   360.3 ms … 447.6 ms    10 runs
>  
> Benchmark 5: git for-each-ref (revision=old,refcount=300k)
>   Time (mean ± σ):      4.171 s ±  0.052 s    [User: 1.400 s, System: 2.715 s]
>   Range (min … max):    4.118 s …  4.283 s    10 runs
>  
> Benchmark 6: git for-each-ref (revision=new,refcount=300k)
>   Time (mean ± σ):      3.939 s ±  0.054 s    [User: 1.403 s, System: 2.466 s]
>   Range (min … max):    3.858 s …  4.026 s    10 runs
>  
> Summary
>   'git for-each-ref (revision=new,refcount=3k)' ran
>     1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
>    10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
>    10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
>   101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
>   107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'
> 
> ---
> 
> So it's not the 20% speedup I saw on my local test repo (it's more like
> 5-8%), but there does appear to be a consistent improvement.

Thanks a bunch for re-doing the benchmark with a documented setup.

> As for your results, the changes in this series shouldn't affect
> packed ref operations, and the difference between old & new doesn't
> seem to indicate a regression. 

Yeah, I've also been surprised to see the performance regression for
packed-refs files. The regression is persistent and reproducable on my
machine though, so even though I tend to agree that the patches
shouldn't negatively impact packed-refs performance they somehow do. It
could just as well be something like different optimization choices by
the compler due to the added patches, or hitting different cache lines.
I dunno.

Anyway, I agree with your assessment. The regression I see is less than
1% for packed-refs, while the improvements for loose refs are a lot more
significant and conceptually make a lot of sense. So I didn't intend to
say that we shouldn't do these optimizations because of the miniscule
peformance regression with packed-refs.

Or in other words: this series looks good to me.

Thanks!
Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/4] ref-cache.c: fix prefix matching in ref iteration
From: Patrick Steinhardt @ 2023-10-10  7:21 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, Victoria Dye
In-Reply-To: <402176246ea9d722a71a0ca4e970dfce8a4bf776.1696888736.git.gitgitgadget@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6056 bytes --]

On Mon, Oct 09, 2023 at 09:58:53PM +0000, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>
> 
> Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> by the given prefix.
> 
> Currently, a ref entry is considered "matched" if the entry name is fully
> contained within the prefix:
> 
> * prefix: "refs/heads/v1"
> * entry: "refs/heads/v1.0"
> 
> OR if the prefix is fully contained in the entry name:
> 
> * prefix: "refs/heads/v1.0"
> * entry: "refs/heads/v1"
> 
> The first case is always correct, but the second is only correct if the ref
> cache entry is a directory, for example:
> 
> * prefix: "refs/heads/example"
> * entry: "refs/heads/"
> 
> Modify the logic in 'cache_ref_iterator_advance' to reflect these
> expectations:
> 
> 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
>    ref cache entry do not overlap at all. Skip this entry.
> 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
>    inside this entry if it is a directory. Skip if the entry is not a
>    directory, otherwise iterate over it.
> 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
>    that the cache entry (directory or not) is fully contained by or equal to
>    the prefix. Iterate over this entry.
> 
> Note that condition 2 relies on the names of directory entries having the
> appropriate trailing slash. The existing function documentation of
> 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> this is a safe assumption to make.
> 
> This bug generally doesn't have any user-facing impact, since it requires:
> 
> 1. using a non-empty prefix without a trailing slash in an iteration like
>    'for_each_fullref_in',
> 2. the callback to said iteration not reapplying the original filter (as
>    for-each-ref does) to ensure unmatched refs are skipped, and
> 3. the repository having one or more refs that match part of, but not all
>    of, the prefix.
> 
> However, there are some niche scenarios that meet those criteria
> (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
> tests covering those cases to demonstrate the fix in this patch.
> 
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  refs/ref-cache.c              |  3 ++-
>  t/t1500-rev-parse.sh          | 23 +++++++++++++++++++++++
>  t/t4205-log-pretty-formats.sh | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index 2294c4564fb..6e3b725245c 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  
>  		if (level->prefix_state == PREFIX_WITHIN_DIR) {
>  			entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> -			if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> +			if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> +			    (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
>  				continue;
>  		} else {
>  			entry_prefix_state = level->prefix_state;
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 37ee5091b5c..3f9e7f62e45 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -264,4 +264,27 @@ test_expect_success 'rev-parse --since= unsqueezed ordering' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'rev-parse --bisect includes bad, excludes good' '
> +	test_commit_bulk 6 &&
> +
> +	git update-ref refs/bisect/bad-1 HEAD~1 &&
> +	git update-ref refs/bisect/b HEAD~2 &&
> +	git update-ref refs/bisect/bad-3 HEAD~3 &&
> +	git update-ref refs/bisect/good-3 HEAD~3 &&
> +	git update-ref refs/bisect/bad-4 HEAD~4 &&
> +	git update-ref refs/bisect/go HEAD~4 &&
> +
> +	# Note: refs/bisect/b and refs/bisect/go should be ignored because they
> +	# do not match the refs/bisect/bad or refs/bisect/good prefixes.
> +	cat >expect <<-EOF &&
> +	refs/bisect/bad-1
> +	refs/bisect/bad-3
> +	refs/bisect/bad-4
> +	^refs/bisect/good-3
> +	EOF
> +
> +	git rev-parse --symbolic-full-name --bisect >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done
> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 16626e4fe96..62c7bfed5d7 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> @@ -956,6 +956,36 @@ test_expect_success '%S in git log --format works with other placeholders (part
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup more commits for %S with --bisect' '
> +	test_commit four &&
> +	test_commit five &&
> +
> +	head1=$(git rev-parse --verify HEAD~0) &&
> +	head2=$(git rev-parse --verify HEAD~1) &&
> +	head3=$(git rev-parse --verify HEAD~2) &&
> +	head4=$(git rev-parse --verify HEAD~3)
> +'
> +
> +test_expect_success '%S with --bisect labels commits with refs/bisect/bad ref' '
> +	git update-ref refs/bisect/bad-$head1 $head1 &&
> +	git update-ref refs/bisect/go $head1 &&
> +	git update-ref refs/bisect/bad-$head2 $head2 &&
> +	git update-ref refs/bisect/b $head3 &&
> +	git update-ref refs/bisect/bad-$head4 $head4 &&
> +	git update-ref refs/bisect/good-$head4 $head4 &&
> +
> +	# We expect to see the range of commits betwee refs/bisect/good-$head4

Nit: s/betwee/between. Probably not worth rerolling this series only
because of this typo though.

Patrick

> +	# and refs/bisect/bad-$head1. The "source" ref is the nearest bisect ref
> +	# from which the commit is reachable.
> +	cat >expect <<-EOF &&
> +	$head1 refs/bisect/bad-$head1
> +	$head2 refs/bisect/bad-$head2
> +	$head3 refs/bisect/bad-$head2
> +	EOF
> +	git log --bisect --format="%H %S" >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'log --pretty=reference' '
>  	git log --pretty="tformat:%h (%s, %as)" >expect &&
>  	git log --pretty=reference >actual &&
> -- 
> gitgitgadget
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 7/7] builtin/merge-tree.c: implement support for `--write-pack`
From: Patrick Steinhardt @ 2023-10-10  6:36 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Junio C Hamano, git, Elijah Newren, Eric W. Biederman, Jeff King
In-Reply-To: <ZSQlgLDv+MrYmSp8@nand.local>

[-- Attachment #1: Type: text/plain, Size: 3748 bytes --]

On Mon, Oct 09, 2023 at 12:08:32PM -0400, Taylor Blau wrote:
> On Mon, Oct 09, 2023 at 12:54:12PM +0200, Patrick Steinhardt wrote:
> > In Gitaly, we usually set up quarantine directories for all operations
> > that create objects. This allows us to discard any newly written objects
> > in case either the RPC call gets cancelled or in case our access checks
> > determine that the change should not be allowed. The logic is rather
> > simple:
> >
> >     1. Create a new temporary directory.
> >
> >     2. Set up the new temporary directory as main object database via
> >        the `GIT_OBJECT_DIRECTORY` environment variable.
> >
> >     3. Set up the main repository's object database via the
> >        `GIT_ALTERNATE_OBJECT_DIRECTORIES` environment variable.
> 
> Is there a reason not to run Git in the quarantine environment and list
> the main repository as an alternate via $GIT_DIR/objects/info/alternates
> instead of the GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable?

The quarantine environment as we use it is really only a single object
database, so you cannot run Git inside of it directly.

> >     4. Execute Git commands that write objects with these environment
> >        variables set up. The new objects will end up neatly contained in
> >        the temporary directory.
> >
> >     5. Once done, either discard the temporary object database or
> >        migrate objects into the main object daatabase.
> 
> Interesting. I'm curious why you don't use the builtin tmp_objdir
> mechanism in Git itself. Do you need to run more than one command in the
> quarantine environment? If so, that makes sense that you'd want to have
> a scratch repository that lasts beyond the lifetime of a single process.

It's a mixture of things:

    - Many commands simply don't set up a temporary object directory.

    - We want to check the result after the objects have been generated.
      Many of the commands don't provide hooks to do so in a reasonable
      way. So we want to check the result _after_ the command has exited
      already, and objects should not yet have been migrated into the
      target object database at that point.

    - Sometimes we indeed want to run multiple Git commands. We use this
      e.g. for worktreeless rebases, where we run a succession of
      commands to rebase every single commit.

So ultimately, our own quarantine directory sits at a conceptually
higher level than what Git commands would be able to provide.

> > I wonder whether this would be a viable approach for you, as well.
> 
> I think that the main problem that we are trying to solve with this
> series is creating a potentially large number of loose objects. I think
> that you could do something like what you propose above, with a 'git
> repacks -adk' before moving its objects over back to the main repository.

Yeah, that's certainly possible. I had the feeling that there are two
different problems that we're trying to solve in this thread:

    - The problem that the objects are of temporary nature. Regardless
      of whether they are in a packfile or loose, it doesn't make much
      sense to put them into the main object database as they would be
      unreachable anyway and thus get pruned after some time.

    - The problem that writing many loose objects can be slow.

My proposal only addresses the first problem.

> But since we're working in a single process only when doing a merge-tree
> operation, I think it is probably more expedient to write the pack's
> bytes directly.

If it's about efficiency and not about how we discard those objects
after the command has ran to completion then yes, I agree.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 0/3] rev-list: add support for commits in `--missing`
From: Patrick Steinhardt @ 2023-10-10  6:19 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <20231009105528.17777-1-karthik.188@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2054 bytes --]

On Mon, Oct 09, 2023 at 12:55:25PM +0200, Karthik Nayak wrote:
> The `--missing` option in git-rev-list(1) was introduced intitally
> to deal with missing blobs in the context of promissory notes.
> Eventually the option was extended to also support tree objects in
> 7c0fe330d5 (rev-list: handle missing tree objects properly,2018-10-05).
> 
> This patch series extends the `--missing` option to also add support for
> commit objects. We do this by introducing a new flag `MISSING` which is
> added whenever we encounter a missing commit during traversal. Then in
> `builtin/rev-list` we check for this flag and take the appropriate
> action based on the `--missing=*` option used.
> 
> This series is an alternate to the patch series I had posted earlier:
> https://lore.kernel.org/git/20230908174208.249184-1-karthik.188@gmail.com/.
> In that patch, we introduced an option `--ignore-missing-links` which
> was added to expose the `ignore_missing_links` bit to the user. The
> issue in that patch was that, the option `--ignore-missing-links` didn't
> play well the pre-existing `--missing` option. This series avoids that
> route and just extends the `--missing` option for commits to solve the
> same problem.

I had already reviewed the patches internally at GitLab, so for what
it's worth please feel free to add my Reviewed-by.

Patrick

> Karthik Nayak (3):
>   revision: rename bit to `do_not_die_on_missing_objects`
>   rev-list: move `show_commit()` to the bottom
>   rev-list: add commit object support in `--missing` option
> 
>  builtin/reflog.c            |  2 +-
>  builtin/rev-list.c          | 93 +++++++++++++++++++------------------
>  list-objects.c              |  2 +-
>  object.h                    |  2 +-
>  revision.c                  |  9 +++-
>  revision.h                  | 20 ++++----
>  t/t6022-rev-list-missing.sh | 74 +++++++++++++++++++++++++++++
>  7 files changed, 145 insertions(+), 57 deletions(-)
>  create mode 100755 t/t6022-rev-list-missing.sh
> 
> -- 
> 2.41.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] git-diff: Introduce --index and deprecate --cached.
From: Russell Nelson @ 2023-10-10  3:46 UTC (permalink / raw)
  To: junkio; +Cc: ae, git



Sent from my iPhone

^ permalink raw reply

* [silly] worldview documents?
From: Junio C Hamano @ 2023-10-10  2:44 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git
In-Reply-To: <CABPp-BFsrt0zS3NHsVAyOSW6vGioe8Z-iN2M3_JNBpP2fWVq9g@mail.gmail.com>

Elijah Newren <newren@gmail.com> writes:

>> [Footnote]
>> ...
>
> Thanks for writing this up.  In the past, I didn't know how to put
> into words why I didn't particularly care for this mode.  You explain
> it rather well.

I am glad it helped non-zero number of people.

It probably owes big to my failing, but because we strongly view it
a virtue not to be opinionated, we have many discrete tools and
features that can be used in combination to support a workflow well,
even when some of these tools and features are not useful for a
different workflow.  It indeed is a good thing to be flexible and to
support different workflows well, and we tend not to single out a
workflow among many and advocate it, but because our documentation
lacks description of major possible workflows, what their underlying
philosophies and their strengths are, how some of our tools and
features support them, and why some others are not good fit.  Being
given a toolbox with too many tools without being taught how they
are to be used together and for what purpose may be a fun puzzle to
figure out for tinkerers, but when you have a problem to solve and
tinkering is not your main focus, which is true for most people, it
is not fun.  

In short, in pursuit of not to be opinionated, we fail to give the
readers best current practices.  The first place to start rectifying
it might be to have some write-ups for various major workflows and
the worldview behind them.  The importance given to first-parenthood
offers two quite different worldviews that affects the choice of
tools (e.g. "merge --no-ff", "checkout origin/master && merge mine
&& branch -f mine" aka "reverse merge").

I suspect that this also relates to your "would --cc be totally
unnecessary now we have --remerge-diff?" as well.  What kind of
conflicts are interesting highly depends on what you are looking
for, which in turn is influenced by the workflow employed by the
project and what role you are playing in it.

^ permalink raw reply

* Re: [PATCH 0/3] Add `-p' option to `git-mv', inspired by `mkdir'
From: Junio C Hamano @ 2023-10-10  0:59 UTC (permalink / raw)
  To: Hugo Sales; +Cc: git, Shaoxuan Yuan
In-Reply-To: <xmqq1qe3wbt1.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> Regardless of the "do we require, or is it sifficient to imply, the
> '-p' option when we lack leading directories?" question, once we
> start "auto-creating" the leading directory hierarchy, one worrysome
> thing such a new feature introduces is an ambiguity and unexpected
> behaviour. ...
> Both are plausible, and "mkdir -p" does not have such a nasty
> ambiguity.  That is what makes me unsure about the new feature
> (again, either with required "-p" or with implied "-p").

I checked what POSIX, who does give "-p" to mkdir(1), has with
mv(1), and they do not have "-p".  Neither GNU, which tends to add
such "usability" enhancements on top of what everybody else has.

And I think they are being very sensible.  By not adding such an
option to "mv", they can sidestep the unnecessary ambiguity, and it
isn't too bad to do a "mkdir -p" separately before doing such a "mv"
into a (potential) new directory.

So, let's not do this patch.

Thanks.


^ permalink raw reply

* Re: [PATCH 03/25] documentation: fix typos
From: Elijah Newren @ 2023-10-10  0:57 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Elijah Newren via GitGitGadget, git
In-Reply-To: <92beac82-f0fd-452b-858f-453cdf21b71f@ramsayjones.plus.com>

On Sun, Oct 8, 2023 at 9:32 AM Ramsay Jones <ramsay@ramsayjones.plus.com> wrote:
>
> On 08/10/2023 07:45, Elijah Newren via GitGitGadget wrote:
> [snip]
> > diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
> > index 870e00f2982..42afb953e8c 100644
> > --- a/Documentation/gitformat-pack.txt
> > +++ b/Documentation/gitformat-pack.txt
> > @@ -17,8 +17,8 @@ $GIT_DIR/objects/pack/multi-pack-index
> >  DESCRIPTION
> >  -----------
> >
> > -The Git pack format is now Git stores most of its primary repository
> > -data. Over the lietime af a repository loose objects (if any) and
> > +The Git pack format is how Git stores most of its primary repository
> > +data. Over the lietime of a repository loose objects (if any) and
>
> Hmm, this tyop jumped out at me while this patch
> floated past... (at least I assume it is a typo!):
>
>   s/lietime/lifetime/

Oh, that's interesting.  I wonder if the LLM did fix that, and I just
didn't notice it.  I suspect so.

Anyway, thanks for catching!  It looks like Junio manually squashed it
in, but I've also got it locally in case any other changes are pointed
out and I need to re-roll.

^ permalink raw reply

* Re: [PATCH 0/3] Add `-p' option to `git-mv', inspired by `mkdir'
From: Junio C Hamano @ 2023-10-10  0:39 UTC (permalink / raw)
  To: Hugo Sales; +Cc: git, Derrick Stolee, Shaoxuan Yuan
In-Reply-To: <20231009233458.1371351-1-hugo@hsal.es>

Hugo Sales <hugo@hsal.es> writes:

> Allow passing a new `-p'/`--parents' option to `git-mv' making it so
> missing directories in the given target path get created if they don't
> exist. This allows running `git mv -p foo bar/quux/' to first create
> the `bar/' and `bar/quux/' directories if they don't exist, and then
> move `foo' to `bar/quux/foo'.
>
> This is inspired by `mkdir -p foo/bar/quux/' which will create the
> `foo/', `foo/bar/', and `foo/bar/quux/' directories if they don't
> exist.

I'll step back and ask a related design question without reading the
patches (yet).

Is there a reason why somebody benefits by us retaining the current
behaviour, where

    $ git mv file no/such/dir/yet/file

fails with "No such file or directory", and they are forced to say
either

    $ mkdir -p no/such/dir/yet
    $ git mv file no/such/dir/yet/file

or

    $ git mv -p file no/such/dir/yet/file

Yes, what I am getting at is if it makes sense to just "fix" the
behaviour of "git mv" about missing leading directories and do what
the end-user wanted to do without requiring "-p".

Regardless of the "do we require, or is it sifficient to imply, the
'-p' option when we lack leading directories?" question, once we
start "auto-creating" the leading directory hierarchy, one worrysome
thing such a new feature introduces is an ambiguity and unexpected
behaviour.  Imagine there is no "no" directory (so nothing under it
exists), and you did this (you do have a regular file "file").

    $ git mv [-p] file no/such/dir/yet

What should happen?  Would we do the equivalent of

    $ mkdir -p no/such/dir && git mv file no/such/dir/yet

or are we doing

    $ mkdir -p no/such/dir/yet && git mv file no/such/dir/yet/file

Both are plausible, and "mkdir -p" does not have such a nasty
ambiguity.  That is what makes me unsure about the new feature
(again, either with required "-p" or with implied "-p").

Thanks.

^ permalink raw reply

* Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation
From: Junio C Hamano @ 2023-10-10  0:24 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Sergey Organov, git
In-Reply-To: <CABPp-BGL_QzRd3mRhSF7rHYNA4pFWfKPA+UuZDODFgEv-1BHhA@mail.gmail.com>

Elijah Newren <newren@gmail.com> writes:

>> >> In my opinion, --remerge-diff does this better; wouldn't we want a
>> >> ...
>> > Between -c and --cc, I do not think there is anything that makes us
>> > favor -c over --cc.  While the algorithm to decide which hunks out
>> > of -c's output to omit was being polished, comparison with -c served
>> > a good way to give baseline, but once --cc has become solid, I do
>> > not think I've used -c myself.
>
> Perhaps, then, the user manual should either omit -c, or recommend
> users use --cc instead?

I do not think I'd miss "-c", but I do not know about others.

>> > I personally find that a very trivial merge resolution is far easier
>> > to read with --cc than --remerge-diff, the latter being way too
>> > verbose.
>
> Ah, indeed, for those that know the --cc output format well (it takes
> a bit to figure out for newcomers), your example demonstrates this
> nicely.  Thanks.

Yup.  And newcomers would take a bit to figure out remerge-diff
output, too, so my answers were written from the "nobody will stay
newcomer forever.  now once they get proficient enough, which ones
are good for them" viewpoint.

^ permalink raw reply

* Re: [PATCH v8 0/3] Add unit test framework and project plan
From: Junio C Hamano @ 2023-10-09 23:50 UTC (permalink / raw)
  To: Josh Steadmon, Johannes Schindelin
  Cc: git, phillip.wood123, linusa, calvinwan, rsbecker
In-Reply-To: <cover.1696889529.git.steadmon@google.com>

Josh Steadmon <steadmon@google.com> writes:

> Changes in v8:
> - Flipped return values for TEST, TEST_TODO, and check_* macros &
>   functions. This makes it easier to reason about control flow for
>   patterns like:
>     if (check(some_condition)) { ... }
> - Moved unit test binaries to t/unit-tests/bin to simplify .gitignore
>   patterns.
> - Removed testing of some strbuf implementation details in t-strbuf.c
>
>
> Josh Steadmon (2):
>   unit tests: Add a project plan document
>   ci: run unit tests in CI
>
> Phillip Wood (1):
>   unit tests: add TAP unit test framework

Thank you, all.

The other topic to adjust for cmake by Dscho builds on this topic,
and it needs to be rebased on this updated round.  I think I did so
correctly, but because I use neither cmake or Windows, the result is
not even compile tested.  Sanity checking the result is very much
appreciated when I push out the result of today's integration cycle.

$ git log --oneline --first-parent --decorate master..js/doc-unit-tests-with-cmake
d0773c1331 (js/doc-unit-tests-with-cmake) cmake: handle also unit tests
192de6de57 cmake: use test names instead of full paths
e1d97bc4df cmake: fix typo in variable name
e07499b8a7 artifacts-tar: when including `.dll` files, don't forget the unit-tests
11264f8f42 unit-tests: do show relative file paths
6c76c5b32d unit-tests: do not mistake `.pdb` files for being executable
295af2ef26 cmake: also build unit tests
31c2361349 (js/doc-unit-tests) ci: run unit tests in CI
3a47942530 unit tests: add TAP unit test framework
eeea7d763a unit tests: add a project plan document


^ permalink raw reply

* [PATCH 1/3] mv: Add -p option to create parent directories
From: Hugo Sales @ 2023-10-09 23:34 UTC (permalink / raw)
  To: git; +Cc: Hugo Sales
In-Reply-To: <20231009233458.1371351-1-hugo@hsal.es>

Inspired by "mkdir -p", this patch allows specifying a "-p" or
"--parents" flag which will create all non-existent directories in the
destination path before renaming the file.

This allows the user to not have to run two commands to move files to a
new directory.

Signed-off-by: Hugo Sales <hugo@hsal.es>
---
 builtin/mv.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index c596515ad0..5d64d86179 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -168,7 +168,7 @@ static int empty_dir_has_sparse_contents(const char *name)
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
-	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
+	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0, create_parents = 0;
 	struct option builtin_mv_options[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT__DRY_RUN(&show_only, N_("dry run")),
@@ -176,6 +176,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_BOOL('k', NULL, &ignore_errors, N_("skip move/rename errors")),
 		OPT_BOOL(0, "sparse", &ignore_sparse, N_("allow updating entries outside of the sparse-checkout cone")),
+		OPT_BOOL('p', "parents", &create_parents, N_("create missing parent directories")),
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
@@ -220,8 +221,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	if (dest_path[0][0] == '\0')
 		/* special case: "." was normalized to "" */
 		destination = internal_prefix_pathspec(dest_path[0], argv, argc, DUP_BASENAME);
-	else if (!lstat(dest_path[0], &st) &&
-			S_ISDIR(st.st_mode)) {
+	else if (create_parents ||
+		 (!lstat(dest_path[0], &st) && S_ISDIR(st.st_mode))) {
 		destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);
 	} else {
 		if (!path_in_sparse_checkout(dst_w_slash, &the_index) &&
@@ -381,7 +382,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			bad = _("multiple sources for the same target");
 			goto act_on_entry;
 		}
-		if (is_dir_sep(dst[strlen(dst) - 1])) {
+
+		if (!create_parents && is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
 			goto act_on_entry;
 		}
@@ -459,11 +461,18 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (show_only)
 			continue;
 		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
-		    !(dst_mode & (SKIP_WORKTREE_DIR | SPARSE)) &&
-		    rename(src, dst) < 0) {
-			if (ignore_errors)
-				continue;
-			die_errno(_("renaming '%s' failed"), src);
+		    !(dst_mode & (SKIP_WORKTREE_DIR | SPARSE))) {
+			if (create_parents && safe_create_leading_directories_const(dst) < 0) {
+				if (ignore_errors)
+					continue;
+				die_errno(_("creating parent directories for '%s' failed"), dst);
+			}
+
+			if (rename(src, dst) < 0) {
+				if (ignore_errors)
+					continue;
+				die_errno(_("renaming '%s' failed"), src);
+			}
 		}
 		if (submodule_gitfile[i]) {
 			if (!update_path_in_gitmodules(src, dst))
-- 
2.42.0


^ permalink raw reply related

* [PATCH 3/3] mv: Add documentation for new `-p' flag
From: Hugo Sales @ 2023-10-09 23:34 UTC (permalink / raw)
  To: git; +Cc: Hugo Sales
In-Reply-To: <20231009233458.1371351-1-hugo@hsal.es>

Signed-off-by: Hugo Sales <hugo@hsal.es>
---
 Documentation/git-mv.txt | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-mv.txt b/Documentation/git-mv.txt
index fb0220fd18..70c1e9572c 100644
--- a/Documentation/git-mv.txt
+++ b/Documentation/git-mv.txt
@@ -15,13 +15,14 @@ DESCRIPTION
 -----------
 Move or rename a file, directory or symlink.
 
- git mv [-v] [-f] [-n] [-k] <source> <destination>
- git mv [-v] [-f] [-n] [-k] <source> ... <destination directory>
+ git mv [-v] [-f] [-n] [-k] [-p] <source> <destination>
+ git mv [-v] [-f] [-n] [-k] [-p] <source> ... <destination directory>
 
 In the first form, it renames <source>, which must exist and be either
 a file, symlink or directory, to <destination>.
-In the second form, the last argument has to be an existing
-directory; the given sources will be moved into this directory.
+In the second form, the last argument refers to a directory, which has
+to exist unless -p is specified, in which case it gets created; the
+given sources will be moved into this directory.
 
 The index is updated after successful completion, but the change must still be
 committed.
@@ -39,6 +40,9 @@ OPTIONS
 -n::
 --dry-run::
 	Do nothing; only show what would happen
+-p::
+--parents::
+	Create missing parent directories in the <destination> or <destination directory> path. Similar to `mkdir -p', all missing leading directories are created.
 
 -v::
 --verbose::
-- 
2.42.0


^ permalink raw reply related

* [PATCH 2/3] mv: Add tests for new -p flag
From: Hugo Sales @ 2023-10-09 23:34 UTC (permalink / raw)
  To: git; +Cc: Hugo Sales
In-Reply-To: <20231009233458.1371351-1-hugo@hsal.es>

Signed-off-by: Hugo Sales <hugo@hsal.es>
---
 t/t7009-mv-parents.sh | 79 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100755 t/t7009-mv-parents.sh

diff --git a/t/t7009-mv-parents.sh b/t/t7009-mv-parents.sh
new file mode 100755
index 0000000000..cc1d9dae08
--- /dev/null
+++ b/t/t7009-mv-parents.sh
@@ -0,0 +1,79 @@
+#!/bin/sh
+
+test_description='git mv -p'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'mv fails to move a file if the target directory does not exist' '
+	echo test >test1 &&
+	git add test1 &&
+	test_must_fail git mv test1 foo/
+'
+
+test_expect_success 'mv fails to move multiple files if the target directory does not exist' '
+	echo test >test2-1 &&
+	echo test >test2-2 &&
+	git add test2-1 test2-2 &&
+	test_must_fail git mv test2-1 test2-2 foo/
+'
+
+test_expect_success 'mv fails to move a file if the target refers to a file in a directory that does not exist' '
+	echo test >test3 &&
+	git add test3 &&
+	test_must_fail git mv test3 foo/test3.txt
+'
+
+test_expect_success 'mv succeeds to move a file even if the target directory does not exist' '
+	echo test >test4 &&
+	git add test4 &&
+	git commit -m test4-commit1 &&
+	git mv -p test4 dir4/ &&
+	git commit -m test4-commit2 &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >test4-actual &&
+	grep "^R100..*test4..*dir4/test4" test4-actual
+'
+
+test_expect_success 'mv succeeds to move multiple files even if the target directory does not exist' '
+	echo test >test5-1 &&
+	echo test >test5-2 &&
+	git add test5-1 test5-2 &&
+	git commit -m test5-commit1 &&
+	git mv -p test5-1 test5-2 dir5/ &&
+	git commit -m test5-commit2 &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >test5-actual &&
+	grep -e "^R100..*test5-1..*dir5/test5-1" -e "^R100..*test5-2..*dir5/test5-2" test5-actual
+'
+
+test_expect_success 'mv succeeds to move a file even if the target refers to a file in a directory that does not exist' '
+	echo test >test6 &&
+	git add test6 &&
+	git commit -m test6-commmit-1 &&
+	git mv -p test6 dir6/test6.txt &&
+	git commit -m test6-commit2 &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >test6-actual &&
+	grep "^R100..*test6..*dir6/test6.txt" test6-actual
+'
+
+test_expect_success 'mv succeeds to move a file even if the target refers to a file in a directory inside a directory that does not exist' '
+	echo test >test7 &&
+	git add test7 &&
+	git commit -m test7-commit1 &&
+	git mv -p test7 dir7/dir7/test7.txt &&
+	git commit -m test7-commit2 &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >test7-actual &&
+	grep "^R100..*test7..*dir7/dir7/test7.txt" test7-actual
+'
+
+test_expect_success 'mv succeeds to move a file even if the target refers to a file in a directory inside a directory inside a directory that does not exist' '
+	echo test >test8 &&
+	git add test8 &&
+	git commit -m test8-commit1 &&
+	git mv -p test8 dir8/dir8/dir8/test8.txt &&
+	git commit -m test8-commit2 &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >test8-actual &&
+	grep "^R100..*test8..*dir8/dir8/dir8/test8.txt" test8-actual
+'
+
+test_done
-- 
2.42.0


^ permalink raw reply related

* [PATCH 0/3] Add `-p' option to `git-mv', inspired by `mkdir'
From: Hugo Sales @ 2023-10-09 23:34 UTC (permalink / raw)
  To: git; +Cc: Hugo Sales, Derrick Stolee, Shaoxuan Yuan, Junio C Hamano

Allow passing a new `-p'/`--parents' option to `git-mv' making it so
missing directories in the given target path get created if they don't
exist. This allows running `git mv -p foo bar/quux/' to first create
the `bar/' and `bar/quux/' directories if they don't exist, and then
move `foo' to `bar/quux/foo'.

This is inspired by `mkdir -p foo/bar/quux/' which will create the
`foo/', `foo/bar/', and `foo/bar/quux/' directories if they don't
exist.

Hugo Sales (3):
  mv: Add -p option to create parent directories
  mv: Add tests for new -p flag
  mv: Add documentation for new `-p' flag

 Documentation/git-mv.txt | 12 ++++--
 builtin/mv.c             | 27 +++++++++-----
 t/t7009-mv-parents.sh    | 79 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 105 insertions(+), 13 deletions(-)
 create mode 100755 t/t7009-mv-parents.sh


base-commit: 3a06386e314565108ad56a9bdb8f7b80ac52fb69
-- 
2.42.0


^ permalink raw reply

* [PATCH v8 3/3] ci: run unit tests in CI
From: Josh Steadmon @ 2023-10-09 22:21 UTC (permalink / raw)
  To: git; +Cc: phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <cover.1696889529.git.steadmon@google.com>

Run unit tests in both Cirrus and GitHub CI. For sharded CI instances
(currently just Windows on GitHub), run only on the first shard. This is
OK while we have only a single unit test executable, but we may wish to
distribute tests more evenly when we add new unit tests in the future.

We may also want to add more status output in our unit test framework,
so that we can do similar post-processing as in
ci/lib.sh:handle_failed_tests().

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 .cirrus.yml               | 2 +-
 ci/run-build-and-tests.sh | 2 ++
 ci/run-test-slice.sh      | 5 +++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 4860bebd32..b6280692d2 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -19,4 +19,4 @@ freebsd_12_task:
   build_script:
     - su git -c gmake
   test_script:
-    - su git -c 'gmake test'
+    - su git -c 'gmake DEFAULT_UNIT_TEST_TARGET=unit-tests-prove test unit-tests'
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 2528f25e31..7a1466b868 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -50,6 +50,8 @@ if test -n "$run_tests"
 then
 	group "Run tests" make test ||
 	handle_failed_tests
+	group "Run unit tests" \
+		make DEFAULT_UNIT_TEST_TARGET=unit-tests-prove unit-tests
 fi
 check_unignored_build_artifacts
 
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index a3c67956a8..ae8094382f 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -15,4 +15,9 @@ group "Run tests" make --quiet -C t T="$(cd t &&
 	tr '\n' ' ')" ||
 handle_failed_tests
 
+# We only have one unit test at the moment, so run it in the first slice
+if [ "$1" == "0" ] ; then
+	group "Run unit tests" make --quiet -C t unit-tests-prove
+fi
+
 check_unignored_build_artifacts
-- 
2.42.0.609.gbb76f46606-goog


^ permalink raw reply related

* [PATCH v8 2/3] unit tests: add TAP unit test framework
From: Josh Steadmon @ 2023-10-09 22:21 UTC (permalink / raw)
  To: git; +Cc: phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <cover.1696889529.git.steadmon@google.com>

From: Phillip Wood <phillip.wood@dunelm.org.uk>

This patch contains an implementation for writing unit tests with TAP
output. Each test is a function that contains one or more checks. The
test is run with the TEST() macro and if any of the checks fail then the
test will fail. A complete program that tests STRBUF_INIT would look
like

     #include "test-lib.h"
     #include "strbuf.h"

     static void t_static_init(void)
     {
             struct strbuf buf = STRBUF_INIT;

             check_uint(buf.len, ==, 0);
             check_uint(buf.alloc, ==, 0);
             check_char(buf.buf[0], ==, '\0');
     }

     int main(void)
     {
             TEST(t_static_init(), "static initialization works);

             return test_done();
     }

The output of this program would be

     ok 1 - static initialization works
     1..1

If any of the checks in a test fail then they print a diagnostic message
to aid debugging and the test will be reported as failing. For example a
failing integer check would look like

     # check "x >= 3" failed at my-test.c:102
     #    left: 2
     #   right: 3
     not ok 1 - x is greater than or equal to three

There are a number of check functions implemented so far. check() checks
a boolean condition, check_int(), check_uint() and check_char() take two
values to compare and a comparison operator. check_str() will check if
two strings are equal. Custom checks are simple to implement as shown in
the comments above test_assert() in test-lib.h.

Tests can be skipped with test_skip() which can be supplied with a
reason for skipping which it will print. Tests can print diagnostic
messages with test_msg().  Checks that are known to fail can be wrapped
in TEST_TODO().

There are a couple of example test programs included in this
patch. t-basic.c implements some self-tests and demonstrates the
diagnostic output for failing test. The output of this program is
checked by t0080-unit-test-output.sh. t-strbuf.c shows some example
unit tests for strbuf.c

The unit tests will be built as part of the default "make all" target,
to avoid bitrot. If you wish to build just the unit tests, you can run
"make build-unit-tests". To run the tests, you can use "make unit-tests"
or run the test binaries directly, as in "./t/unit-tests/bin/t-strbuf".

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 Makefile                    |  28 ++-
 t/Makefile                  |  15 +-
 t/t0080-unit-test-output.sh |  58 +++++++
 t/unit-tests/.gitignore     |   1 +
 t/unit-tests/t-basic.c      |  95 +++++++++++
 t/unit-tests/t-strbuf.c     | 120 +++++++++++++
 t/unit-tests/test-lib.c     | 329 ++++++++++++++++++++++++++++++++++++
 t/unit-tests/test-lib.h     | 143 ++++++++++++++++
 8 files changed, 785 insertions(+), 4 deletions(-)
 create mode 100755 t/t0080-unit-test-output.sh
 create mode 100644 t/unit-tests/.gitignore
 create mode 100644 t/unit-tests/t-basic.c
 create mode 100644 t/unit-tests/t-strbuf.c
 create mode 100644 t/unit-tests/test-lib.c
 create mode 100644 t/unit-tests/test-lib.h

diff --git a/Makefile b/Makefile
index e440728c24..18c13f06c0 100644
--- a/Makefile
+++ b/Makefile
@@ -682,6 +682,9 @@ TEST_BUILTINS_OBJS =
 TEST_OBJS =
 TEST_PROGRAMS_NEED_X =
 THIRD_PARTY_SOURCES =
+UNIT_TEST_PROGRAMS =
+UNIT_TEST_DIR = t/unit-tests
+UNIT_TEST_BIN = $(UNIT_TEST_DIR)/bin
 
 # Having this variable in your environment would break pipelines because
 # you cause "cd" to echo its destination to stdout.  It can also take
@@ -1331,6 +1334,12 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
+UNIT_TEST_PROGRAMS += t-basic
+UNIT_TEST_PROGRAMS += t-strbuf
+UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
+UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
+UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
+
 # xdiff and reftable libs may in turn depend on what is in libgit.a
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 EXTLIBS =
@@ -2672,6 +2681,7 @@ OBJECTS += $(TEST_OBJS)
 OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
 OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
+OBJECTS += $(UNIT_TEST_OBJS)
 
 ifndef NO_CURL
 	OBJECTS += http.o http-walker.o remote-curl.o
@@ -3167,7 +3177,7 @@ endif
 
 test_bindir_programs := $(patsubst %,bin-wrappers/%,$(BINDIR_PROGRAMS_NEED_X) $(BINDIR_PROGRAMS_NO_X) $(TEST_PROGRAMS_NEED_X))
 
-all:: $(TEST_PROGRAMS) $(test_bindir_programs)
+all:: $(TEST_PROGRAMS) $(test_bindir_programs) $(UNIT_TEST_PROGS)
 
 bin-wrappers/%: wrap-for-bin.sh
 	$(call mkdir_p_parent_template)
@@ -3592,7 +3602,7 @@ endif
 
 artifacts-tar:: $(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB) $(OTHER_PROGRAMS) \
 		GIT-BUILD-OPTIONS $(TEST_PROGRAMS) $(test_bindir_programs) \
-		$(MOFILES)
+		$(UNIT_TEST_PROGS) $(MOFILES)
 	$(QUIET_SUBDIR0)templates $(QUIET_SUBDIR1) \
 		SHELL_PATH='$(SHELL_PATH_SQ)' PERL_PATH='$(PERL_PATH_SQ)'
 	test -n "$(ARTIFACTS_DIRECTORY)"
@@ -3653,7 +3663,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) $(OBJECTS)
 	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
-	$(RM) $(TEST_PROGRAMS)
+	$(RM) $(TEST_PROGRAMS) $(UNIT_TEST_PROGS)
 	$(RM) $(FUZZ_PROGRAMS)
 	$(RM) $(SP_OBJ)
 	$(RM) $(HCC)
@@ -3831,3 +3841,15 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+$(UNIT_TEST_BIN):
+	@mkdir -p $(UNIT_TEST_BIN)
+
+$(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_DIR)/test-lib.o $(GITLIBS) GIT-LDFLAGS $(UNIT_TEST_BIN)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
+		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
+
+.PHONY: build-unit-tests unit-tests
+build-unit-tests: $(UNIT_TEST_PROGS)
+unit-tests: $(UNIT_TEST_PROGS)
+	$(MAKE) -C t/ unit-tests
diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..75d9330437 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -17,6 +17,7 @@ TAR ?= $(TAR)
 RM ?= rm -f
 PROVE ?= prove
 DEFAULT_TEST_TARGET ?= test
+DEFAULT_UNIT_TEST_TARGET ?= unit-tests-raw
 TEST_LINT ?= test-lint
 
 ifdef TEST_OUTPUT_DIRECTORY
@@ -41,6 +42,7 @@ TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
 TINTEROP = $(sort $(wildcard interop/i[0-9][0-9][0-9][0-9]-*.sh))
 CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
 CHAINLINT = '$(PERL_PATH_SQ)' chainlint.pl
+UNIT_TESTS = $(sort $(filter-out unit-tests/bin/t-basic%,$(wildcard unit-tests/bin/t-*)))
 
 # `test-chainlint` (which is a dependency of `test-lint`, `test` and `prove`)
 # checks all tests in all scripts via a single invocation, so tell individual
@@ -65,6 +67,17 @@ prove: pre-clean check-chainlint $(TEST_LINT)
 $(T):
 	@echo "*** $@ ***"; '$(TEST_SHELL_PATH_SQ)' $@ $(GIT_TEST_OPTS)
 
+$(UNIT_TESTS):
+	@echo "*** $@ ***"; $@
+
+.PHONY: unit-tests unit-tests-raw unit-tests-prove
+unit-tests: $(DEFAULT_UNIT_TEST_TARGET)
+
+unit-tests-raw: $(UNIT_TESTS)
+
+unit-tests-prove:
+	@echo "*** prove - unit tests ***"; $(PROVE) $(GIT_PROVE_OPTS) $(UNIT_TESTS)
+
 pre-clean:
 	$(RM) -r '$(TEST_RESULTS_DIRECTORY_SQ)'
 
@@ -149,4 +162,4 @@ perf:
 	$(MAKE) -C perf/ all
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
-	check-chainlint clean-chainlint test-chainlint
+	check-chainlint clean-chainlint test-chainlint $(UNIT_TESTS)
diff --git a/t/t0080-unit-test-output.sh b/t/t0080-unit-test-output.sh
new file mode 100755
index 0000000000..961b54b06c
--- /dev/null
+++ b/t/t0080-unit-test-output.sh
@@ -0,0 +1,58 @@
+#!/bin/sh
+
+test_description='Test the output of the unit test framework'
+
+. ./test-lib.sh
+
+test_expect_success 'TAP output from unit tests' '
+	cat >expect <<-EOF &&
+	ok 1 - passing test
+	ok 2 - passing test and assertion return 1
+	# check "1 == 2" failed at t/unit-tests/t-basic.c:76
+	#    left: 1
+	#   right: 2
+	not ok 3 - failing test
+	ok 4 - failing test and assertion return 0
+	not ok 5 - passing TEST_TODO() # TODO
+	ok 6 - passing TEST_TODO() returns 1
+	# todo check ${SQ}check(x)${SQ} succeeded at t/unit-tests/t-basic.c:25
+	not ok 7 - failing TEST_TODO()
+	ok 8 - failing TEST_TODO() returns 0
+	# check "0" failed at t/unit-tests/t-basic.c:30
+	# skipping test - missing prerequisite
+	# skipping check ${SQ}1${SQ} at t/unit-tests/t-basic.c:32
+	ok 9 - test_skip() # SKIP
+	ok 10 - skipped test returns 1
+	# skipping test - missing prerequisite
+	ok 11 - test_skip() inside TEST_TODO() # SKIP
+	ok 12 - test_skip() inside TEST_TODO() returns 1
+	# check "0" failed at t/unit-tests/t-basic.c:48
+	not ok 13 - TEST_TODO() after failing check
+	ok 14 - TEST_TODO() after failing check returns 0
+	# check "0" failed at t/unit-tests/t-basic.c:56
+	not ok 15 - failing check after TEST_TODO()
+	ok 16 - failing check after TEST_TODO() returns 0
+	# check "!strcmp("\thello\\\\", "there\"\n")" failed at t/unit-tests/t-basic.c:61
+	#    left: "\011hello\\\\"
+	#   right: "there\"\012"
+	# check "!strcmp("NULL", NULL)" failed at t/unit-tests/t-basic.c:62
+	#    left: "NULL"
+	#   right: NULL
+	# check "${SQ}a${SQ} == ${SQ}\n${SQ}" failed at t/unit-tests/t-basic.c:63
+	#    left: ${SQ}a${SQ}
+	#   right: ${SQ}\012${SQ}
+	# check "${SQ}\\\\${SQ} == ${SQ}\\${SQ}${SQ}" failed at t/unit-tests/t-basic.c:64
+	#    left: ${SQ}\\\\${SQ}
+	#   right: ${SQ}\\${SQ}${SQ}
+	not ok 17 - messages from failing string and char comparison
+	# BUG: test has no checks at t/unit-tests/t-basic.c:91
+	not ok 18 - test with no checks
+	ok 19 - test with no checks returns 0
+	1..19
+	EOF
+
+	! "$GIT_BUILD_DIR"/t/unit-tests/bin/t-basic >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/unit-tests/.gitignore b/t/unit-tests/.gitignore
new file mode 100644
index 0000000000..5e56e040ec
--- /dev/null
+++ b/t/unit-tests/.gitignore
@@ -0,0 +1 @@
+/bin
diff --git a/t/unit-tests/t-basic.c b/t/unit-tests/t-basic.c
new file mode 100644
index 0000000000..fda1ae59a6
--- /dev/null
+++ b/t/unit-tests/t-basic.c
@@ -0,0 +1,95 @@
+#include "test-lib.h"
+
+/*
+ * The purpose of this "unit test" is to verify a few invariants of the unit
+ * test framework itself, as well as to provide examples of output from actually
+ * failing tests. As such, it is intended that this test fails, and thus it
+ * should not be run as part of `make unit-tests`. Instead, we verify it behaves
+ * as expected in the integration test t0080-unit-test-output.sh
+ */
+
+/* Used to store the return value of check_int(). */
+static int check_res;
+
+/* Used to store the return value of TEST(). */
+static int test_res;
+
+static void t_res(int expect)
+{
+	check_int(check_res, ==, expect);
+	check_int(test_res, ==, expect);
+}
+
+static void t_todo(int x)
+{
+	check_res = TEST_TODO(check(x));
+}
+
+static void t_skip(void)
+{
+	check(0);
+	test_skip("missing prerequisite");
+	check(1);
+}
+
+static int do_skip(void)
+{
+	test_skip("missing prerequisite");
+	return 1;
+}
+
+static void t_skip_todo(void)
+{
+	check_res = TEST_TODO(do_skip());
+}
+
+static void t_todo_after_fail(void)
+{
+	check(0);
+	TEST_TODO(check(0));
+}
+
+static void t_fail_after_todo(void)
+{
+	check(1);
+	TEST_TODO(check(0));
+	check(0);
+}
+
+static void t_messages(void)
+{
+	check_str("\thello\\", "there\"\n");
+	check_str("NULL", NULL);
+	check_char('a', ==, '\n');
+	check_char('\\', ==, '\'');
+}
+
+static void t_empty(void)
+{
+	; /* empty */
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	test_res = TEST(check_res = check_int(1, ==, 1), "passing test");
+	TEST(t_res(1), "passing test and assertion return 1");
+	test_res = TEST(check_res = check_int(1, ==, 2), "failing test");
+	TEST(t_res(0), "failing test and assertion return 0");
+	test_res = TEST(t_todo(0), "passing TEST_TODO()");
+	TEST(t_res(1), "passing TEST_TODO() returns 1");
+	test_res = TEST(t_todo(1), "failing TEST_TODO()");
+	TEST(t_res(0), "failing TEST_TODO() returns 0");
+	test_res = TEST(t_skip(), "test_skip()");
+	TEST(check_int(test_res, ==, 1), "skipped test returns 1");
+	test_res = TEST(t_skip_todo(), "test_skip() inside TEST_TODO()");
+	TEST(t_res(1), "test_skip() inside TEST_TODO() returns 1");
+	test_res = TEST(t_todo_after_fail(), "TEST_TODO() after failing check");
+	TEST(check_int(test_res, ==, 0), "TEST_TODO() after failing check returns 0");
+	test_res = TEST(t_fail_after_todo(), "failing check after TEST_TODO()");
+	TEST(check_int(test_res, ==, 0), "failing check after TEST_TODO() returns 0");
+	TEST(t_messages(), "messages from failing string and char comparison");
+	test_res = TEST(t_empty(), "test with no checks");
+	TEST(check_int(test_res, ==, 0), "test with no checks returns 0");
+
+	return test_done();
+}
diff --git a/t/unit-tests/t-strbuf.c b/t/unit-tests/t-strbuf.c
new file mode 100644
index 0000000000..c2fcb0cbd6
--- /dev/null
+++ b/t/unit-tests/t-strbuf.c
@@ -0,0 +1,120 @@
+#include "test-lib.h"
+#include "strbuf.h"
+
+/* wrapper that supplies tests with an empty, initialized strbuf */
+static void setup(void (*f)(struct strbuf*, void*), void *data)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	f(&buf, data);
+	strbuf_release(&buf);
+	check_uint(buf.len, ==, 0);
+	check_uint(buf.alloc, ==, 0);
+}
+
+/* wrapper that supplies tests with a populated, initialized strbuf */
+static void setup_populated(void (*f)(struct strbuf*, void*), char *init_str, void *data)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addstr(&buf, init_str);
+	check_uint(buf.len, ==, strlen(init_str));
+	f(&buf, data);
+	strbuf_release(&buf);
+	check_uint(buf.len, ==, 0);
+	check_uint(buf.alloc, ==, 0);
+}
+
+static int assert_sane_strbuf(struct strbuf *buf)
+{
+	/* Initialized strbufs should always have a non-NULL buffer */
+	if (buf->buf == NULL)
+		return 0;
+	/* Buffers should always be NUL-terminated */
+	if (buf->buf[buf->len] != '\0')
+		return 0;
+	/*
+	 * Freshly-initialized strbufs may not have a dynamically allocated
+	 * buffer
+	 */
+	if (buf->len == 0 && buf->alloc == 0)
+		return 1;
+	/* alloc must be at least one byte larger than len */
+	return buf->len + 1 <= buf->alloc;
+}
+
+static void t_static_init(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	check_uint(buf.len, ==, 0);
+	check_uint(buf.alloc, ==, 0);
+	check_char(buf.buf[0], ==, '\0');
+}
+
+static void t_dynamic_init(void)
+{
+	struct strbuf buf;
+
+	strbuf_init(&buf, 1024);
+	check(assert_sane_strbuf(&buf));
+	check_uint(buf.len, ==, 0);
+	check_uint(buf.alloc, >=, 1024);
+	check_char(buf.buf[0], ==, '\0');
+	strbuf_release(&buf);
+}
+
+static void t_addch(struct strbuf *buf, void *data)
+{
+	const char *p_ch = data;
+	const char ch = *p_ch;
+	size_t orig_alloc = buf->alloc;
+	size_t orig_len = buf->len;
+
+	if (!check(assert_sane_strbuf(buf)))
+		return;
+	strbuf_addch(buf, ch);
+	if (!check(assert_sane_strbuf(buf)))
+		return;
+	if (!(check_uint(buf->len, ==, orig_len + 1) &&
+	      check_uint(buf->alloc, >=, orig_alloc)))
+		return; /* avoid de-referencing buf->buf */
+	check_char(buf->buf[buf->len - 1], ==, ch);
+	check_char(buf->buf[buf->len], ==, '\0');
+}
+
+static void t_addstr(struct strbuf *buf, void *data)
+{
+	const char *text = data;
+	size_t len = strlen(text);
+	size_t orig_alloc = buf->alloc;
+	size_t orig_len = buf->len;
+
+	if (!check(assert_sane_strbuf(buf)))
+		return;
+	strbuf_addstr(buf, text);
+	if (!check(assert_sane_strbuf(buf)))
+		return;
+	if (!(check_uint(buf->len, ==, orig_len + len) &&
+	      check_uint(buf->alloc, >=, orig_alloc) &&
+	      check_uint(buf->alloc, >, orig_len + len) &&
+	      check_char(buf->buf[orig_len + len], ==, '\0')))
+	    return;
+	check_str(buf->buf + orig_len, text);
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	if (!TEST(t_static_init(), "static initialization works"))
+		test_skip_all("STRBUF_INIT is broken");
+	TEST(t_dynamic_init(), "dynamic initialization works");
+	TEST(setup(t_addch, "a"), "strbuf_addch adds char");
+	TEST(setup(t_addch, ""), "strbuf_addch adds NUL char");
+	TEST(setup_populated(t_addch, "initial value", "a"),
+	     "strbuf_addch appends to initial value");
+	TEST(setup(t_addstr, "hello there"), "strbuf_addstr adds string");
+	TEST(setup_populated(t_addstr, "initial value", "hello there"),
+	     "strbuf_addstr appends string to initial value");
+
+	return test_done();
+}
diff --git a/t/unit-tests/test-lib.c b/t/unit-tests/test-lib.c
new file mode 100644
index 0000000000..b20f543121
--- /dev/null
+++ b/t/unit-tests/test-lib.c
@@ -0,0 +1,329 @@
+#include "test-lib.h"
+
+enum result {
+	RESULT_NONE,
+	RESULT_FAILURE,
+	RESULT_SKIP,
+	RESULT_SUCCESS,
+	RESULT_TODO
+};
+
+static struct {
+	enum result result;
+	int count;
+	unsigned failed :1;
+	unsigned lazy_plan :1;
+	unsigned running :1;
+	unsigned skip_all :1;
+	unsigned todo :1;
+} ctx = {
+	.lazy_plan = 1,
+	.result = RESULT_NONE,
+};
+
+static void msg_with_prefix(const char *prefix, const char *format, va_list ap)
+{
+	fflush(stderr);
+	if (prefix)
+		fprintf(stdout, "%s", prefix);
+	vprintf(format, ap); /* TODO: handle newlines */
+	putc('\n', stdout);
+	fflush(stdout);
+}
+
+void test_msg(const char *format, ...)
+{
+	va_list ap;
+
+	va_start(ap, format);
+	msg_with_prefix("# ", format, ap);
+	va_end(ap);
+}
+
+void test_plan(int count)
+{
+	assert(!ctx.running);
+
+	fflush(stderr);
+	printf("1..%d\n", count);
+	fflush(stdout);
+	ctx.lazy_plan = 0;
+}
+
+int test_done(void)
+{
+	assert(!ctx.running);
+
+	if (ctx.lazy_plan)
+		test_plan(ctx.count);
+
+	return ctx.failed;
+}
+
+void test_skip(const char *format, ...)
+{
+	va_list ap;
+
+	assert(ctx.running);
+
+	ctx.result = RESULT_SKIP;
+	va_start(ap, format);
+	if (format)
+		msg_with_prefix("# skipping test - ", format, ap);
+	va_end(ap);
+}
+
+void test_skip_all(const char *format, ...)
+{
+	va_list ap;
+	const char *prefix;
+
+	if (!ctx.count && ctx.lazy_plan) {
+		/* We have not printed a test plan yet */
+		prefix = "1..0 # SKIP ";
+		ctx.lazy_plan = 0;
+	} else {
+		/* We have already printed a test plan */
+		prefix = "Bail out! # ";
+		ctx.failed = 1;
+	}
+	ctx.skip_all = 1;
+	ctx.result = RESULT_SKIP;
+	va_start(ap, format);
+	msg_with_prefix(prefix, format, ap);
+	va_end(ap);
+}
+
+int test__run_begin(void)
+{
+	assert(!ctx.running);
+
+	ctx.count++;
+	ctx.result = RESULT_NONE;
+	ctx.running = 1;
+
+	return ctx.skip_all;
+}
+
+static void print_description(const char *format, va_list ap)
+{
+	if (format) {
+		fputs(" - ", stdout);
+		vprintf(format, ap);
+	}
+}
+
+int test__run_end(int was_run UNUSED, const char *location, const char *format, ...)
+{
+	va_list ap;
+
+	assert(ctx.running);
+	assert(!ctx.todo);
+
+	fflush(stderr);
+	va_start(ap, format);
+	if (!ctx.skip_all) {
+		switch (ctx.result) {
+		case RESULT_SUCCESS:
+			printf("ok %d", ctx.count);
+			print_description(format, ap);
+			break;
+
+		case RESULT_FAILURE:
+			printf("not ok %d", ctx.count);
+			print_description(format, ap);
+			break;
+
+		case RESULT_TODO:
+			printf("not ok %d", ctx.count);
+			print_description(format, ap);
+			printf(" # TODO");
+			break;
+
+		case RESULT_SKIP:
+			printf("ok %d", ctx.count);
+			print_description(format, ap);
+			printf(" # SKIP");
+			break;
+
+		case RESULT_NONE:
+			test_msg("BUG: test has no checks at %s", location);
+			printf("not ok %d", ctx.count);
+			print_description(format, ap);
+			ctx.result = RESULT_FAILURE;
+			break;
+		}
+	}
+	va_end(ap);
+	ctx.running = 0;
+	if (ctx.skip_all)
+		return 1;
+	putc('\n', stdout);
+	fflush(stdout);
+	ctx.failed |= ctx.result == RESULT_FAILURE;
+
+	return ctx.result != RESULT_FAILURE;
+}
+
+static void test_fail(void)
+{
+	assert(ctx.result != RESULT_SKIP);
+
+	ctx.result = RESULT_FAILURE;
+}
+
+static void test_pass(void)
+{
+	assert(ctx.result != RESULT_SKIP);
+
+	if (ctx.result == RESULT_NONE)
+		ctx.result = RESULT_SUCCESS;
+}
+
+static void test_todo(void)
+{
+	assert(ctx.result != RESULT_SKIP);
+
+	if (ctx.result != RESULT_FAILURE)
+		ctx.result = RESULT_TODO;
+}
+
+int test_assert(const char *location, const char *check, int ok)
+{
+	assert(ctx.running);
+
+	if (ctx.result == RESULT_SKIP) {
+		test_msg("skipping check '%s' at %s", check, location);
+		return 1;
+	} else if (!ctx.todo) {
+		if (ok) {
+			test_pass();
+		} else {
+			test_msg("check \"%s\" failed at %s", check, location);
+			test_fail();
+		}
+	}
+
+	return !!ok;
+}
+
+void test__todo_begin(void)
+{
+	assert(ctx.running);
+	assert(!ctx.todo);
+
+	ctx.todo = 1;
+}
+
+int test__todo_end(const char *location, const char *check, int res)
+{
+	assert(ctx.running);
+	assert(ctx.todo);
+
+	ctx.todo = 0;
+	if (ctx.result == RESULT_SKIP)
+		return 1;
+	if (res) {
+		test_msg("todo check '%s' succeeded at %s", check, location);
+		test_fail();
+	} else {
+		test_todo();
+	}
+
+	return !res;
+}
+
+int check_bool_loc(const char *loc, const char *check, int ok)
+{
+	return test_assert(loc, check, ok);
+}
+
+union test__tmp test__tmp[2];
+
+int check_int_loc(const char *loc, const char *check, int ok,
+		  intmax_t a, intmax_t b)
+{
+	int ret = test_assert(loc, check, ok);
+
+	if (!ret) {
+		test_msg("   left: %"PRIdMAX, a);
+		test_msg("  right: %"PRIdMAX, b);
+	}
+
+	return ret;
+}
+
+int check_uint_loc(const char *loc, const char *check, int ok,
+		   uintmax_t a, uintmax_t b)
+{
+	int ret = test_assert(loc, check, ok);
+
+	if (!ret) {
+		test_msg("   left: %"PRIuMAX, a);
+		test_msg("  right: %"PRIuMAX, b);
+	}
+
+	return ret;
+}
+
+static void print_one_char(char ch, char quote)
+{
+	if ((unsigned char)ch < 0x20u || ch == 0x7f) {
+		/* TODO: improve handling of \a, \b, \f ... */
+		printf("\\%03o", (unsigned char)ch);
+	} else {
+		if (ch == '\\' || ch == quote)
+			putc('\\', stdout);
+		putc(ch, stdout);
+	}
+}
+
+static void print_char(const char *prefix, char ch)
+{
+	printf("# %s: '", prefix);
+	print_one_char(ch, '\'');
+	fputs("'\n", stdout);
+}
+
+int check_char_loc(const char *loc, const char *check, int ok, char a, char b)
+{
+	int ret = test_assert(loc, check, ok);
+
+	if (!ret) {
+		fflush(stderr);
+		print_char("   left", a);
+		print_char("  right", b);
+		fflush(stdout);
+	}
+
+	return ret;
+}
+
+static void print_str(const char *prefix, const char *str)
+{
+	printf("# %s: ", prefix);
+	if (!str) {
+		fputs("NULL\n", stdout);
+	} else {
+		putc('"', stdout);
+		while (*str)
+			print_one_char(*str++, '"');
+		fputs("\"\n", stdout);
+	}
+}
+
+int check_str_loc(const char *loc, const char *check,
+		  const char *a, const char *b)
+{
+	int ok = (!a && !b) || (a && b && !strcmp(a, b));
+	int ret = test_assert(loc, check, ok);
+
+	if (!ret) {
+		fflush(stderr);
+		print_str("   left", a);
+		print_str("  right", b);
+		fflush(stdout);
+	}
+
+	return ret;
+}
diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h
new file mode 100644
index 0000000000..8df3804914
--- /dev/null
+++ b/t/unit-tests/test-lib.h
@@ -0,0 +1,143 @@
+#ifndef TEST_LIB_H
+#define TEST_LIB_H
+
+#include "git-compat-util.h"
+
+/*
+ * Run a test function, returns 1 if the test succeeds, 0 if it
+ * fails. If test_skip_all() has been called then the test will not be
+ * run. The description for each test should be unique. For example:
+ *
+ *  TEST(test_something(arg1, arg2), "something %d %d", arg1, arg2)
+ */
+#define TEST(t, ...)					\
+	test__run_end(test__run_begin() ? 0 : (t, 1),	\
+		      TEST_LOCATION(),  __VA_ARGS__)
+
+/*
+ * Print a test plan, should be called before any tests. If the number
+ * of tests is not known in advance test_done() will automatically
+ * print a plan at the end of the test program.
+ */
+void test_plan(int count);
+
+/*
+ * test_done() must be called at the end of main(). It will print the
+ * plan if plan() was not called at the beginning of the test program
+ * and returns the exit code for the test program.
+ */
+int test_done(void);
+
+/* Skip the current test. */
+__attribute__((format (printf, 1, 2)))
+void test_skip(const char *format, ...);
+
+/* Skip all remaining tests. */
+__attribute__((format (printf, 1, 2)))
+void test_skip_all(const char *format, ...);
+
+/* Print a diagnostic message to stdout. */
+__attribute__((format (printf, 1, 2)))
+void test_msg(const char *format, ...);
+
+/*
+ * Test checks are built around test_assert(). checks return 1 on
+ * success, 0 on failure. If any check fails then the test will
+ * fail. To create a custom check define a function that wraps
+ * test_assert() and a macro to wrap that function. For example:
+ *
+ *  static int check_oid_loc(const char *loc, const char *check,
+ *			     struct object_id *a, struct object_id *b)
+ *  {
+ *	    int res = test_assert(loc, check, oideq(a, b));
+ *
+ *	    if (res) {
+ *		    test_msg("   left: %s", oid_to_hex(a);
+ *		    test_msg("  right: %s", oid_to_hex(a);
+ *
+ *	    }
+ *	    return res;
+ *  }
+ *
+ *  #define check_oid(a, b) \
+ *	    check_oid_loc(TEST_LOCATION(), "oideq("#a", "#b")", a, b)
+ */
+int test_assert(const char *location, const char *check, int ok);
+
+/* Helper macro to pass the location to checks */
+#define TEST_LOCATION() TEST__MAKE_LOCATION(__LINE__)
+
+/* Check a boolean condition. */
+#define check(x)				\
+	check_bool_loc(TEST_LOCATION(), #x, x)
+int check_bool_loc(const char *loc, const char *check, int ok);
+
+/*
+ * Compare two integers. Prints a message with the two values if the
+ * comparison fails. NB this is not thread safe.
+ */
+#define check_int(a, op, b)						\
+	(test__tmp[0].i = (a), test__tmp[1].i = (b),			\
+	 check_int_loc(TEST_LOCATION(), #a" "#op" "#b,			\
+		       test__tmp[0].i op test__tmp[1].i, a, b))
+int check_int_loc(const char *loc, const char *check, int ok,
+		  intmax_t a, intmax_t b);
+
+/*
+ * Compare two unsigned integers. Prints a message with the two values
+ * if the comparison fails. NB this is not thread safe.
+ */
+#define check_uint(a, op, b)						\
+	(test__tmp[0].u = (a), test__tmp[1].u = (b),			\
+	 check_uint_loc(TEST_LOCATION(), #a" "#op" "#b,			\
+			test__tmp[0].u op test__tmp[1].u, a, b))
+int check_uint_loc(const char *loc, const char *check, int ok,
+		   uintmax_t a, uintmax_t b);
+
+/*
+ * Compare two chars. Prints a message with the two values if the
+ * comparison fails. NB this is not thread safe.
+ */
+#define check_char(a, op, b)						\
+	(test__tmp[0].c = (a), test__tmp[1].c = (b),			\
+	 check_char_loc(TEST_LOCATION(), #a" "#op" "#b,			\
+			test__tmp[0].c op test__tmp[1].c, a, b))
+int check_char_loc(const char *loc, const char *check, int ok,
+		   char a, char b);
+
+/* Check whether two strings are equal. */
+#define check_str(a, b)							\
+	check_str_loc(TEST_LOCATION(), "!strcmp("#a", "#b")", a, b)
+int check_str_loc(const char *loc, const char *check,
+		  const char *a, const char *b);
+
+/*
+ * Wrap a check that is known to fail. If the check succeeds then the
+ * test will fail. Returns 1 if the check fails, 0 if it
+ * succeeds. For example:
+ *
+ *  TEST_TODO(check(0));
+ */
+#define TEST_TODO(check) \
+	(test__todo_begin(), test__todo_end(TEST_LOCATION(), #check, check))
+
+/* Private helpers */
+
+#define TEST__STR(x) #x
+#define TEST__MAKE_LOCATION(line) __FILE__ ":" TEST__STR(line)
+
+union test__tmp {
+	intmax_t i;
+	uintmax_t u;
+	char c;
+};
+
+extern union test__tmp test__tmp[2];
+
+int test__run_begin(void);
+__attribute__((format (printf, 3, 4)))
+int test__run_end(int, const char *, const char *, ...);
+void test__todo_begin(void);
+int test__todo_end(const char *, const char *, int);
+
+#endif /* TEST_LIB_H */
-- 
2.42.0.609.gbb76f46606-goog


^ permalink raw reply related

* [PATCH v8 1/3] unit tests: Add a project plan document
From: Josh Steadmon @ 2023-10-09 22:21 UTC (permalink / raw)
  To: git; +Cc: phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <cover.1696889529.git.steadmon@google.com>

In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Describe what we hope to accomplish by
implementing unit tests, and explain some open questions and milestones.
Discuss desired features for test frameworks/harnesses, and provide a
preliminary comparison of several different frameworks.

Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 Documentation/Makefile                 |   1 +
 Documentation/technical/unit-tests.txt | 220 +++++++++++++++++++++++++
 2 files changed, 221 insertions(+)
 create mode 100644 Documentation/technical/unit-tests.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b629176d7d..3f2383a12c 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -122,6 +122,7 @@ TECH_DOCS += technical/scalar
 TECH_DOCS += technical/send-pack-pipeline
 TECH_DOCS += technical/shallow
 TECH_DOCS += technical/trivial-merge
+TECH_DOCS += technical/unit-tests
 SP_ARTICLES += $(TECH_DOCS)
 SP_ARTICLES += technical/api-index
 
diff --git a/Documentation/technical/unit-tests.txt b/Documentation/technical/unit-tests.txt
new file mode 100644
index 0000000000..b7a89cc838
--- /dev/null
+++ b/Documentation/technical/unit-tests.txt
@@ -0,0 +1,220 @@
+= Unit Testing
+
+In our current testing environment, we spend a significant amount of effort
+crafting end-to-end tests for error conditions that could easily be captured by
+unit tests (or we simply forgo some hard-to-setup and rare error conditions).
+Unit tests additionally provide stability to the codebase and can simplify
+debugging through isolation. Writing unit tests in pure C, rather than with our
+current shell/test-tool helper setup, simplifies test setup, simplifies passing
+data around (no shell-isms required), and reduces testing runtime by not
+spawning a separate process for every test invocation.
+
+We believe that a large body of unit tests, living alongside the existing test
+suite, will improve code quality for the Git project.
+
+== Definitions
+
+For the purposes of this document, we'll use *test framework* to refer to
+projects that support writing test cases and running tests within the context
+of a single executable. *Test harness* will refer to projects that manage
+running multiple executables (each of which may contain multiple test cases) and
+aggregating their results.
+
+In reality, these terms are not strictly defined, and many of the projects
+discussed below contain features from both categories.
+
+For now, we will evaluate projects solely on their framework features. Since we
+are relying on having TAP output (see below), we can assume that any framework
+can be made to work with a harness that we can choose later.
+
+
+== Choosing a framework
+
+We believe the best option is to implement a custom TAP framework for the Git
+project. We use a version of the framework originally proposed in
+https://lore.kernel.org/git/c902a166-98ce-afba-93f2-ea6027557176@gmail.com/[1].
+
+
+== Choosing a test harness
+
+During upstream discussion, it was occasionally noted that `prove` provides many
+convenient features, such as scheduling slower tests first, or re-running
+previously failed tests.
+
+While we already support the use of `prove` as a test harness for the shell
+tests, it is not strictly required. The t/Makefile allows running shell tests
+directly (though with interleaved output if parallelism is enabled). Git
+developers who wish to use `prove` as a more advanced harness can do so by
+setting DEFAULT_TEST_TARGET=prove in their config.mak.
+
+We will follow a similar approach for unit tests: by default the test
+executables will be run directly from the t/Makefile, but `prove` can be
+configured with DEFAULT_UNIT_TEST_TARGET=prove.
+
+
+== Framework selection
+
+There are a variety of features we can use to rank the candidate frameworks, and
+those features have different priorities:
+
+* Critical features: we probably won't consider a framework without these
+** Can we legally / easily use the project?
+*** <<license,License>>
+*** <<vendorable-or-ubiquitous,Vendorable or ubiquitous>>
+*** <<maintainable-extensible,Maintainable / extensible>>
+*** <<major-platform-support,Major platform support>>
+** Does the project support our bare-minimum needs?
+*** <<tap-support,TAP support>>
+*** <<diagnostic-output,Diagnostic output>>
+*** <<runtime-skippable-tests,Runtime-skippable tests>>
+* Nice-to-have features:
+** <<parallel-execution,Parallel execution>>
+** <<mock-support,Mock support>>
+** <<signal-error-handling,Signal & error-handling>>
+* Tie-breaker stats
+** <<project-kloc,Project KLOC>>
+** <<adoption,Adoption>>
+
+[[license]]
+=== License
+
+We must be able to legally use the framework in connection with Git. As Git is
+licensed only under GPLv2, we must eliminate any LGPLv3, GPLv3, or Apache 2.0
+projects.
+
+[[vendorable-or-ubiquitous]]
+=== Vendorable or ubiquitous
+
+We want to avoid forcing Git developers to install new tools just to run unit
+tests. Any prospective frameworks and harnesses must either be vendorable
+(meaning, we can copy their source directly into Git's repository), or so
+ubiquitous that it is reasonable to expect that most developers will have the
+tools installed already.
+
+[[maintainable-extensible]]
+=== Maintainable / extensible
+
+It is unlikely that any pre-existing project perfectly fits our needs, so any
+project we select will need to be actively maintained and open to accepting
+changes. Alternatively, assuming we are vendoring the source into our repo, it
+must be simple enough that Git developers can feel comfortable making changes as
+needed to our version.
+
+In the comparison table below, "True" means that the framework seems to have
+active developers, that it is simple enough that Git developers can make changes
+to it, and that the project seems open to accepting external contributions (or
+that it is vendorable). "Partial" means that at least one of the above
+conditions holds.
+
+[[major-platform-support]]
+=== Major platform support
+
+At a bare minimum, unit-testing must work on Linux, MacOS, and Windows.
+
+In the comparison table below, "True" means that it works on all three major
+platforms with no issues. "Partial" means that there may be annoyances on one or
+more platforms, but it is still usable in principle.
+
+[[tap-support]]
+=== TAP support
+
+The https://testanything.org/[Test Anything Protocol] is a text-based interface
+that allows tests to communicate with a test harness. It is already used by
+Git's integration test suite. Supporting TAP output is a mandatory feature for
+any prospective test framework.
+
+In the comparison table below, "True" means this is natively supported.
+"Partial" means TAP output must be generated by post-processing the native
+output.
+
+Frameworks that do not have at least Partial support will not be evaluated
+further.
+
+[[diagnostic-output]]
+=== Diagnostic output
+
+When a test case fails, the framework must generate enough diagnostic output to
+help developers find the appropriate test case in source code in order to debug
+the failure.
+
+[[runtime-skippable-tests]]
+=== Runtime-skippable tests
+
+Test authors may wish to skip certain test cases based on runtime circumstances,
+so the framework should support this.
+
+[[parallel-execution]]
+=== Parallel execution
+
+Ideally, we will build up a significant collection of unit test cases, most
+likely split across multiple executables. It will be necessary to run these
+tests in parallel to enable fast develop-test-debug cycles.
+
+In the comparison table below, "True" means that individual test cases within a
+single test executable can be run in parallel. We assume that executable-level
+parallelism can be handled by the test harness.
+
+[[mock-support]]
+=== Mock support
+
+Unit test authors may wish to test code that interacts with objects that may be
+inconvenient to handle in a test (e.g. interacting with a network service).
+Mocking allows test authors to provide a fake implementation of these objects
+for more convenient tests.
+
+[[signal-error-handling]]
+=== Signal & error handling
+
+The test framework should fail gracefully when test cases are themselves buggy
+or when they are interrupted by signals during runtime.
+
+[[project-kloc]]
+=== Project KLOC
+
+The size of the project, in thousands of lines of code as measured by
+https://dwheeler.com/sloccount/[sloccount] (rounded up to the next multiple of
+1,000). As a tie-breaker, we probably prefer a project with fewer LOC.
+
+[[adoption]]
+=== Adoption
+
+As a tie-breaker, we prefer a more widely-used project. We use the number of
+GitHub / GitLab stars to estimate this.
+
+
+=== Comparison
+
+[format="csv",options="header",width="33%"]
+|=====
+Framework,"<<license,License>>","<<vendorable-or-ubiquitous,Vendorable or ubiquitous>>","<<maintainable-extensible,Maintainable / extensible>>","<<major-platform-support,Major platform support>>","<<tap-support,TAP support>>","<<diagnostic-output,Diagnostic output>>","<<runtime--skippable-tests,Runtime- skippable tests>>","<<parallel-execution,Parallel execution>>","<<mock-support,Mock support>>","<<signal-error-handling,Signal & error handling>>","<<project-kloc,Project KLOC>>","<<adoption,Adoption>>"
+https://lore.kernel.org/git/c902a166-98ce-afba-93f2-ea6027557176@gmail.com/[Custom Git impl.],[lime-background]#GPL v2#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,1,0
+https://github.com/silentbicycle/greatest[Greatest],[lime-background]#ISC#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,3,1400
+https://github.com/Snaipe/Criterion[Criterion],[lime-background]#MIT#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,19,1800
+https://github.com/rra/c-tap-harness/[C TAP],[lime-background]#Expat#,[lime-background]#True#,[yellow-background]#Partial#,[yellow-background]#Partial#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,4,33
+https://libcheck.github.io/check/[Check],[lime-background]#LGPL v2.1#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,[lime-background]#True#,17,973
+|=====
+
+=== Additional framework candidates
+
+Several suggested frameworks have been eliminated from consideration:
+
+* Incompatible licenses:
+** https://github.com/zorgnax/libtap[libtap] (LGPL v3)
+** https://cmocka.org/[cmocka] (Apache 2.0)
+* Missing source: https://www.kindahl.net/mytap/doc/index.html[MyTap]
+* No TAP support:
+** https://nemequ.github.io/munit/[µnit]
+** https://github.com/google/cmockery[cmockery]
+** https://github.com/lpabon/cmockery2[cmockery2]
+** https://github.com/ThrowTheSwitch/Unity[Unity]
+** https://github.com/siu/minunit[minunit]
+** https://cunit.sourceforge.net/[CUnit]
+
+
+== Milestones
+
+* Add useful tests of library-like code
+* Integrate with
+  https://lore.kernel.org/git/20230502211454.1673000-1-calvinwan@google.com/[stdlib
+  work]
+* Run alongside regular `make test` target
-- 
2.42.0.609.gbb76f46606-goog


^ permalink raw reply related

* [PATCH v8 0/3] Add unit test framework and project plan
From: Josh Steadmon @ 2023-10-09 22:21 UTC (permalink / raw)
  To: git; +Cc: phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <0169ce6fb9ccafc089b74ae406db0d1a8ff8ac65.1688165272.git.steadmon@google.com>

In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Unit tests additionally provide stability to the
codebase and can simplify debugging through isolation. Turning parts of
Git into libraries[1] gives us the ability to run unit tests on the
libraries and to write unit tests in C. Writing unit tests in pure C,
rather than with our current shell/test-tool helper setup, simplifies
test setup, simplifies passing data around (no shell-isms required), and
reduces testing runtime by not spawning a separate process for every
test invocation.

This series begins with a project document covering our goals for adding
unit tests and a discussion of alternative frameworks considered, as
well as the features used to evaluate them. A rendered preview of this
doc can be found at [2]. It also adds Phillip Wood's TAP implemenation
(with some slightly re-worked Makefile rules) and a sample strbuf unit
test. Finally, we modify the configs for GitHub and Cirrus CI to run the
unit tests. Sample runs showing successful CI runs can be found at [3],
[4], and [5].

[1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
[2] https://github.com/steadmon/git/blob/unit-tests-asciidoc/Documentation/technical/unit-tests.adoc
[3] https://github.com/steadmon/git/actions/runs/5884659246/job/15959781385#step:4:1803
[4] https://github.com/steadmon/git/actions/runs/5884659246/job/15959938401#step:5:186
[5] https://cirrus-ci.com/task/6126304366428160 (unrelated tests failed,
    but note that t-strbuf ran successfully)

In addition to reviewing the patches in this series, reviewers can help
this series progress by chiming in on these remaining TODOs:
- Figure out if we should split t-basic.c into multiple meta-tests, to
  avoid merge conflicts and changes to expected text in
  t0080-unit-test-output.sh.
- Figure out if we should de-duplicate assertions in t-strbuf.c at the
  cost of making tests less self-contained and diagnostic output less
  helpful.
- Figure out if we should collect unit tests statistics similar to the
  "counts" files for shell tests
- Decide if it's OK to wait on sharding unit tests across "sliced" CI
  instances
- Provide guidelines for writing new unit tests

Changes in v8:
- Flipped return values for TEST, TEST_TODO, and check_* macros &
  functions. This makes it easier to reason about control flow for
  patterns like:
    if (check(some_condition)) { ... }
- Moved unit test binaries to t/unit-tests/bin to simplify .gitignore
  patterns.
- Removed testing of some strbuf implementation details in t-strbuf.c


Josh Steadmon (2):
  unit tests: Add a project plan document
  ci: run unit tests in CI

Phillip Wood (1):
  unit tests: add TAP unit test framework

 .cirrus.yml                            |   2 +-
 Documentation/Makefile                 |   1 +
 Documentation/technical/unit-tests.txt | 220 +++++++++++++++++
 Makefile                               |  28 ++-
 ci/run-build-and-tests.sh              |   2 +
 ci/run-test-slice.sh                   |   5 +
 t/Makefile                             |  15 +-
 t/t0080-unit-test-output.sh            |  58 +++++
 t/unit-tests/.gitignore                |   1 +
 t/unit-tests/t-basic.c                 |  95 +++++++
 t/unit-tests/t-strbuf.c                | 120 +++++++++
 t/unit-tests/test-lib.c                | 329 +++++++++++++++++++++++++
 t/unit-tests/test-lib.h                | 143 +++++++++++
 13 files changed, 1014 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/technical/unit-tests.txt
 create mode 100755 t/t0080-unit-test-output.sh
 create mode 100644 t/unit-tests/.gitignore
 create mode 100644 t/unit-tests/t-basic.c
 create mode 100644 t/unit-tests/t-strbuf.c
 create mode 100644 t/unit-tests/test-lib.c
 create mode 100644 t/unit-tests/test-lib.h

Range-diff against v7:
-:  ---------- > 1:  81c5148a12 unit tests: Add a project plan document
1:  3cc98d4045 ! 2:  00d3c95a81 unit tests: add TAP unit test framework
    @@ Commit message
     
                      check_uint(buf.len, ==, 0);
                      check_uint(buf.alloc, ==, 0);
    -                 if (check(buf.buf == strbuf_slopbuf))
    -                        return; /* avoid SIGSEV */
                      check_char(buf.buf[0], ==, '\0');
              }
     
    @@ Commit message
         checked by t0080-unit-test-output.sh. t-strbuf.c shows some example
         unit tests for strbuf.c
     
    -    The unit tests can be built with "make unit-tests" (this works but the
    -    Makefile changes need some further work). Once they have been built they
    -    can be run manually (e.g t/unit-tests/t-strbuf) or with prove.
    +    The unit tests will be built as part of the default "make all" target,
    +    to avoid bitrot. If you wish to build just the unit tests, you can run
    +    "make build-unit-tests". To run the tests, you can use "make unit-tests"
    +    or run the test binaries directly, as in "./t/unit-tests/bin/t-strbuf".
     
         Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
    @@ Makefile: TEST_BUILTINS_OBJS =
      THIRD_PARTY_SOURCES =
     +UNIT_TEST_PROGRAMS =
     +UNIT_TEST_DIR = t/unit-tests
    ++UNIT_TEST_BIN = $(UNIT_TEST_DIR)/bin
      
      # Having this variable in your environment would break pipelines because
      # you cause "cd" to echo its destination to stdout.  It can also take
    @@ Makefile: THIRD_PARTY_SOURCES += compat/regex/%
      
     +UNIT_TEST_PROGRAMS += t-basic
     +UNIT_TEST_PROGRAMS += t-strbuf
    -+UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_DIR)/%$X,$(UNIT_TEST_PROGRAMS))
    ++UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
     +UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
     +UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
     +
    @@ Makefile: $(FUZZ_PROGRAMS): all
      
      fuzz-all: $(FUZZ_PROGRAMS)
     +
    -+$(UNIT_TEST_PROGS): $(UNIT_TEST_DIR)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_DIR)/test-lib.o $(GITLIBS) GIT-LDFLAGS
    ++$(UNIT_TEST_BIN):
    ++	@mkdir -p $(UNIT_TEST_BIN)
    ++
    ++$(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_DIR)/test-lib.o $(GITLIBS) GIT-LDFLAGS $(UNIT_TEST_BIN)
     +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
     +		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
     +
    @@ t/Makefile: TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
      TINTEROP = $(sort $(wildcard interop/i[0-9][0-9][0-9][0-9]-*.sh))
      CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
      CHAINLINT = '$(PERL_PATH_SQ)' chainlint.pl
    -+UNIT_TESTS = $(sort $(filter-out %.h %.c %.o unit-tests/t-basic%,$(wildcard unit-tests/*)))
    ++UNIT_TESTS = $(sort $(filter-out unit-tests/bin/t-basic%,$(wildcard unit-tests/bin/t-*)))
      
      # `test-chainlint` (which is a dependency of `test-lint`, `test` and `prove`)
      # checks all tests in all scripts via a single invocation, so tell individual
    @@ t/t0080-unit-test-output.sh (new)
     +test_expect_success 'TAP output from unit tests' '
     +	cat >expect <<-EOF &&
     +	ok 1 - passing test
    -+	ok 2 - passing test and assertion return 0
    ++	ok 2 - passing test and assertion return 1
     +	# check "1 == 2" failed at t/unit-tests/t-basic.c:76
     +	#    left: 1
     +	#   right: 2
     +	not ok 3 - failing test
    -+	ok 4 - failing test and assertion return -1
    ++	ok 4 - failing test and assertion return 0
     +	not ok 5 - passing TEST_TODO() # TODO
    -+	ok 6 - passing TEST_TODO() returns 0
    ++	ok 6 - passing TEST_TODO() returns 1
     +	# todo check ${SQ}check(x)${SQ} succeeded at t/unit-tests/t-basic.c:25
     +	not ok 7 - failing TEST_TODO()
    -+	ok 8 - failing TEST_TODO() returns -1
    ++	ok 8 - failing TEST_TODO() returns 0
     +	# check "0" failed at t/unit-tests/t-basic.c:30
     +	# skipping test - missing prerequisite
     +	# skipping check ${SQ}1${SQ} at t/unit-tests/t-basic.c:32
     +	ok 9 - test_skip() # SKIP
    -+	ok 10 - skipped test returns 0
    ++	ok 10 - skipped test returns 1
     +	# skipping test - missing prerequisite
     +	ok 11 - test_skip() inside TEST_TODO() # SKIP
    -+	ok 12 - test_skip() inside TEST_TODO() returns 0
    ++	ok 12 - test_skip() inside TEST_TODO() returns 1
     +	# check "0" failed at t/unit-tests/t-basic.c:48
     +	not ok 13 - TEST_TODO() after failing check
    -+	ok 14 - TEST_TODO() after failing check returns -1
    ++	ok 14 - TEST_TODO() after failing check returns 0
     +	# check "0" failed at t/unit-tests/t-basic.c:56
     +	not ok 15 - failing check after TEST_TODO()
    -+	ok 16 - failing check after TEST_TODO() returns -1
    ++	ok 16 - failing check after TEST_TODO() returns 0
     +	# check "!strcmp("\thello\\\\", "there\"\n")" failed at t/unit-tests/t-basic.c:61
     +	#    left: "\011hello\\\\"
     +	#   right: "there\"\012"
    @@ t/t0080-unit-test-output.sh (new)
     +	not ok 17 - messages from failing string and char comparison
     +	# BUG: test has no checks at t/unit-tests/t-basic.c:91
     +	not ok 18 - test with no checks
    -+	ok 19 - test with no checks returns -1
    ++	ok 19 - test with no checks returns 0
     +	1..19
     +	EOF
     +
    -+	! "$GIT_BUILD_DIR"/t/unit-tests/t-basic >actual &&
    ++	! "$GIT_BUILD_DIR"/t/unit-tests/bin/t-basic >actual &&
     +	test_cmp expect actual
     +'
     +
    @@ t/t0080-unit-test-output.sh (new)
     
      ## t/unit-tests/.gitignore (new) ##
     @@
    -+/t-basic
    -+/t-strbuf
    ++/bin
     
      ## t/unit-tests/t-basic.c (new) ##
     @@
    @@ t/unit-tests/t-basic.c (new)
     +static int do_skip(void)
     +{
     +	test_skip("missing prerequisite");
    -+	return 0;
    ++	return 1;
     +}
     +
     +static void t_skip_todo(void)
    @@ t/unit-tests/t-basic.c (new)
     +int cmd_main(int argc, const char **argv)
     +{
     +	test_res = TEST(check_res = check_int(1, ==, 1), "passing test");
    -+	TEST(t_res(0), "passing test and assertion return 0");
    ++	TEST(t_res(1), "passing test and assertion return 1");
     +	test_res = TEST(check_res = check_int(1, ==, 2), "failing test");
    -+	TEST(t_res(-1), "failing test and assertion return -1");
    ++	TEST(t_res(0), "failing test and assertion return 0");
     +	test_res = TEST(t_todo(0), "passing TEST_TODO()");
    -+	TEST(t_res(0), "passing TEST_TODO() returns 0");
    ++	TEST(t_res(1), "passing TEST_TODO() returns 1");
     +	test_res = TEST(t_todo(1), "failing TEST_TODO()");
    -+	TEST(t_res(-1), "failing TEST_TODO() returns -1");
    ++	TEST(t_res(0), "failing TEST_TODO() returns 0");
     +	test_res = TEST(t_skip(), "test_skip()");
    -+	TEST(check_int(test_res, ==, 0), "skipped test returns 0");
    ++	TEST(check_int(test_res, ==, 1), "skipped test returns 1");
     +	test_res = TEST(t_skip_todo(), "test_skip() inside TEST_TODO()");
    -+	TEST(t_res(0), "test_skip() inside TEST_TODO() returns 0");
    ++	TEST(t_res(1), "test_skip() inside TEST_TODO() returns 1");
     +	test_res = TEST(t_todo_after_fail(), "TEST_TODO() after failing check");
    -+	TEST(check_int(test_res, ==, -1), "TEST_TODO() after failing check returns -1");
    ++	TEST(check_int(test_res, ==, 0), "TEST_TODO() after failing check returns 0");
     +	test_res = TEST(t_fail_after_todo(), "failing check after TEST_TODO()");
    -+	TEST(check_int(test_res, ==, -1), "failing check after TEST_TODO() returns -1");
    ++	TEST(check_int(test_res, ==, 0), "failing check after TEST_TODO() returns 0");
     +	TEST(t_messages(), "messages from failing string and char comparison");
     +	test_res = TEST(t_empty(), "test with no checks");
    -+	TEST(check_int(test_res, ==, -1), "test with no checks returns -1");
    ++	TEST(check_int(test_res, ==, 0), "test with no checks returns 0");
     +
     +	return test_done();
     +}
    @@ t/unit-tests/t-strbuf.c (new)
     +#include "test-lib.h"
     +#include "strbuf.h"
     +
    -+/* wrapper that supplies tests with an initialized strbuf */
    ++/* wrapper that supplies tests with an empty, initialized strbuf */
     +static void setup(void (*f)(struct strbuf*, void*), void *data)
     +{
     +	struct strbuf buf = STRBUF_INIT;
    @@ t/unit-tests/t-strbuf.c (new)
     +	strbuf_release(&buf);
     +	check_uint(buf.len, ==, 0);
     +	check_uint(buf.alloc, ==, 0);
    -+	check(buf.buf == strbuf_slopbuf);
    -+	check_char(buf.buf[0], ==, '\0');
    ++}
    ++
    ++/* wrapper that supplies tests with a populated, initialized strbuf */
    ++static void setup_populated(void (*f)(struct strbuf*, void*), char *init_str, void *data)
    ++{
    ++	struct strbuf buf = STRBUF_INIT;
    ++
    ++	strbuf_addstr(&buf, init_str);
    ++	check_uint(buf.len, ==, strlen(init_str));
    ++	f(&buf, data);
    ++	strbuf_release(&buf);
    ++	check_uint(buf.len, ==, 0);
    ++	check_uint(buf.alloc, ==, 0);
    ++}
    ++
    ++static int assert_sane_strbuf(struct strbuf *buf)
    ++{
    ++	/* Initialized strbufs should always have a non-NULL buffer */
    ++	if (buf->buf == NULL)
    ++		return 0;
    ++	/* Buffers should always be NUL-terminated */
    ++	if (buf->buf[buf->len] != '\0')
    ++		return 0;
    ++	/*
    ++	 * Freshly-initialized strbufs may not have a dynamically allocated
    ++	 * buffer
    ++	 */
    ++	if (buf->len == 0 && buf->alloc == 0)
    ++		return 1;
    ++	/* alloc must be at least one byte larger than len */
    ++	return buf->len + 1 <= buf->alloc;
     +}
     +
     +static void t_static_init(void)
    @@ t/unit-tests/t-strbuf.c (new)
     +
     +	check_uint(buf.len, ==, 0);
     +	check_uint(buf.alloc, ==, 0);
    -+	if (check(buf.buf == strbuf_slopbuf))
    -+		return; /* avoid de-referencing buf.buf */
     +	check_char(buf.buf[0], ==, '\0');
     +}
     +
    @@ t/unit-tests/t-strbuf.c (new)
     +	struct strbuf buf;
     +
     +	strbuf_init(&buf, 1024);
    ++	check(assert_sane_strbuf(&buf));
     +	check_uint(buf.len, ==, 0);
     +	check_uint(buf.alloc, >=, 1024);
     +	check_char(buf.buf[0], ==, '\0');
    @@ t/unit-tests/t-strbuf.c (new)
     +{
     +	const char *p_ch = data;
     +	const char ch = *p_ch;
    ++	size_t orig_alloc = buf->alloc;
    ++	size_t orig_len = buf->len;
     +
    ++	if (!check(assert_sane_strbuf(buf)))
    ++		return;
     +	strbuf_addch(buf, ch);
    -+	if (check_uint(buf->len, ==, 1) ||
    -+	    check_uint(buf->alloc, >, 1))
    ++	if (!check(assert_sane_strbuf(buf)))
    ++		return;
    ++	if (!(check_uint(buf->len, ==, orig_len + 1) &&
    ++	      check_uint(buf->alloc, >=, orig_alloc)))
     +		return; /* avoid de-referencing buf->buf */
    -+	check_char(buf->buf[0], ==, ch);
    -+	check_char(buf->buf[1], ==, '\0');
    ++	check_char(buf->buf[buf->len - 1], ==, ch);
    ++	check_char(buf->buf[buf->len], ==, '\0');
     +}
     +
     +static void t_addstr(struct strbuf *buf, void *data)
     +{
     +	const char *text = data;
     +	size_t len = strlen(text);
    ++	size_t orig_alloc = buf->alloc;
    ++	size_t orig_len = buf->len;
     +
    ++	if (!check(assert_sane_strbuf(buf)))
    ++		return;
     +	strbuf_addstr(buf, text);
    -+	if (check_uint(buf->len, ==, len) ||
    -+	    check_uint(buf->alloc, >, len) ||
    -+	    check_char(buf->buf[len], ==, '\0'))
    ++	if (!check(assert_sane_strbuf(buf)))
    ++		return;
    ++	if (!(check_uint(buf->len, ==, orig_len + len) &&
    ++	      check_uint(buf->alloc, >=, orig_alloc) &&
    ++	      check_uint(buf->alloc, >, orig_len + len) &&
    ++	      check_char(buf->buf[orig_len + len], ==, '\0')))
     +	    return;
    -+	check_str(buf->buf, text);
    ++	check_str(buf->buf + orig_len, text);
     +}
     +
     +int cmd_main(int argc, const char **argv)
     +{
    -+	if (TEST(t_static_init(), "static initialization works"))
    ++	if (!TEST(t_static_init(), "static initialization works"))
     +		test_skip_all("STRBUF_INIT is broken");
     +	TEST(t_dynamic_init(), "dynamic initialization works");
     +	TEST(setup(t_addch, "a"), "strbuf_addch adds char");
     +	TEST(setup(t_addch, ""), "strbuf_addch adds NUL char");
    ++	TEST(setup_populated(t_addch, "initial value", "a"),
    ++	     "strbuf_addch appends to initial value");
     +	TEST(setup(t_addstr, "hello there"), "strbuf_addstr adds string");
    ++	TEST(setup_populated(t_addstr, "initial value", "hello there"),
    ++	     "strbuf_addstr appends string to initial value");
     +
     +	return test_done();
     +}
    @@ t/unit-tests/test-lib.c (new)
     +	va_end(ap);
     +	ctx.running = 0;
     +	if (ctx.skip_all)
    -+		return 0;
    ++		return 1;
     +	putc('\n', stdout);
     +	fflush(stdout);
     +	ctx.failed |= ctx.result == RESULT_FAILURE;
     +
    -+	return -(ctx.result == RESULT_FAILURE);
    ++	return ctx.result != RESULT_FAILURE;
     +}
     +
     +static void test_fail(void)
    @@ t/unit-tests/test-lib.c (new)
     +
     +	if (ctx.result == RESULT_SKIP) {
     +		test_msg("skipping check '%s' at %s", check, location);
    -+		return 0;
    ++		return 1;
     +	} else if (!ctx.todo) {
     +		if (ok) {
     +			test_pass();
    @@ t/unit-tests/test-lib.c (new)
     +		}
     +	}
     +
    -+	return -!ok;
    ++	return !!ok;
     +}
     +
     +void test__todo_begin(void)
    @@ t/unit-tests/test-lib.c (new)
     +
     +	ctx.todo = 0;
     +	if (ctx.result == RESULT_SKIP)
    -+		return 0;
    -+	if (!res) {
    ++		return 1;
    ++	if (res) {
     +		test_msg("todo check '%s' succeeded at %s", check, location);
     +		test_fail();
     +	} else {
     +		test_todo();
     +	}
     +
    -+	return -!res;
    ++	return !res;
     +}
     +
     +int check_bool_loc(const char *loc, const char *check, int ok)
    @@ t/unit-tests/test-lib.c (new)
     +{
     +	int ret = test_assert(loc, check, ok);
     +
    -+	if (ret) {
    ++	if (!ret) {
     +		test_msg("   left: %"PRIdMAX, a);
     +		test_msg("  right: %"PRIdMAX, b);
     +	}
    @@ t/unit-tests/test-lib.c (new)
     +{
     +	int ret = test_assert(loc, check, ok);
     +
    -+	if (ret) {
    ++	if (!ret) {
     +		test_msg("   left: %"PRIuMAX, a);
     +		test_msg("  right: %"PRIuMAX, b);
     +	}
    @@ t/unit-tests/test-lib.c (new)
     +{
     +	int ret = test_assert(loc, check, ok);
     +
    -+	if (ret) {
    ++	if (!ret) {
     +		fflush(stderr);
     +		print_char("   left", a);
     +		print_char("  right", b);
    @@ t/unit-tests/test-lib.c (new)
     +	int ok = (!a && !b) || (a && b && !strcmp(a, b));
     +	int ret = test_assert(loc, check, ok);
     +
    -+	if (ret) {
    ++	if (!ret) {
     +		fflush(stderr);
     +		print_str("   left", a);
     +		print_str("  right", b);
    @@ t/unit-tests/test-lib.h (new)
     +#include "git-compat-util.h"
     +
     +/*
    -+ * Run a test function, returns 0 if the test succeeds, -1 if it
    ++ * Run a test function, returns 1 if the test succeeds, 0 if it
     + * fails. If test_skip_all() has been called then the test will not be
     + * run. The description for each test should be unique. For example:
     + *
    @@ t/unit-tests/test-lib.h (new)
     +void test_msg(const char *format, ...);
     +
     +/*
    -+ * Test checks are built around test_assert(). checks return 0 on
    -+ * success, -1 on failure. If any check fails then the test will
    ++ * Test checks are built around test_assert(). checks return 1 on
    ++ * success, 0 on failure. If any check fails then the test will
     + * fail. To create a custom check define a function that wraps
     + * test_assert() and a macro to wrap that function. For example:
     + *
    @@ t/unit-tests/test-lib.h (new)
     +
     +/*
     + * Wrap a check that is known to fail. If the check succeeds then the
    -+ * test will fail. Returns 0 if the check fails, -1 if it
    ++ * test will fail. Returns 1 if the check fails, 0 if it
     + * succeeds. For example:
     + *
     + *  TEST_TODO(check(0));
2:  abf4dc41ac ! 3:  aa1dfa4892 ci: run unit tests in CI
    @@ ci/run-test-slice.sh: group "Run tests" make --quiet -C t T="$(cd t &&
     +fi
     +
      check_unignored_build_artifacts
    -
    - ## t/Makefile ##
    -@@ t/Makefile: TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
    - TINTEROP = $(sort $(wildcard interop/i[0-9][0-9][0-9][0-9]-*.sh))
    - CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
    - CHAINLINT = '$(PERL_PATH_SQ)' chainlint.pl
    --UNIT_TESTS = $(sort $(filter-out %.h %.c %.o unit-tests/t-basic%,$(wildcard unit-tests/*)))
    -+UNIT_TESTS = $(sort $(filter-out %.h %.c %.o unit-tests/t-basic%,$(wildcard unit-tests/t-*)))
    - 
    - # `test-chainlint` (which is a dependency of `test-lint`, `test` and `prove`)
    - # checks all tests in all scripts via a single invocation, so tell individual

base-commit: a9e066fa63149291a55f383cfa113d8bdbdaa6b3
-- 
2.42.0.609.gbb76f46606-goog


^ permalink raw reply

* Re: [PATCH 0/3] rev-list: add support for commits in `--missing`
From: Junio C Hamano @ 2023-10-09 22:02 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <20231009105528.17777-1-karthik.188@gmail.com>

Karthik Nayak <karthik.188@gmail.com> writes:

> This series is an alternate to the patch series I had posted earlier:
> https://lore.kernel.org/git/20230908174208.249184-1-karthik.188@gmail.com/.
> In that patch, we introduced an option `--ignore-missing-links` which
> was added to expose the `ignore_missing_links` bit to the user. The
> issue in that patch was that, the option `--ignore-missing-links` didn't
> play well the pre-existing `--missing` option. This series avoids that
> route and just extends the `--missing` option for commits to solve the
> same problem.

Thanks for exploring the problem space in the other direction.

I haven't really thought things through yet, but it looks to me that
the updated behaviour of "--missing" is more in line with what the
option would have wanted to be from day one.

> Karthik Nayak (3):
>   revision: rename bit to `do_not_die_on_missing_objects`
>   rev-list: move `show_commit()` to the bottom
>   rev-list: add commit object support in `--missing` option
>
>  builtin/reflog.c            |  2 +-
>  builtin/rev-list.c          | 93 +++++++++++++++++++------------------
>  list-objects.c              |  2 +-
>  object.h                    |  2 +-
>  revision.c                  |  9 +++-
>  revision.h                  | 20 ++++----
>  t/t6022-rev-list-missing.sh | 74 +++++++++++++++++++++++++++++
>  7 files changed, 145 insertions(+), 57 deletions(-)
>  create mode 100755 t/t6022-rev-list-missing.sh

^ permalink raw reply

* [PATCH v2 4/4] files-backend.c: avoid stat in 'loose_fill_ref_dir'
From: Victoria Dye via GitGitGadget @ 2023-10-09 21:58 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Victoria Dye, Victoria Dye
In-Reply-To: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

From: Victoria Dye <vdye@github.com>

Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a
file to determine whether it is a directory or not, use 'get_dtype'.

Currently, the loop uses 'stat' to determine whether each dirent is a
directory itself or not in order to construct the appropriate ref cache
entry. If 'stat' fails (returning a negative value), the dirent is silently
skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry
is a directory.

On platforms that include an entry's d_type in in the 'dirent' struct, this
extra 'stat' check is redundant. We can use the 'get_dtype' method to
extract this information on platforms that support it (i.e. where
NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that
don't. Because 'stat' is an expensive call, this confers a
modest-but-noticeable performance improvement when iterating over large
numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k
ref repo).

Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
to 1 to replicate the existing handling of symlink dirents. This
unfortunately requires calling 'stat' on the associated entry regardless of
platform, but symlinks in the loose ref store are highly unlikely since
they'd need to be created manually by a user.

Note that this patch also changes the condition for skipping creation of a
ref entry from "when 'stat' fails" to "when the d_type is anything other
than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
the platform doesn't support d_type in dirents or some other reason) or
DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
skipped. However, it will also be skipped if it is any other valid d_type
(e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
handle these properly anyway, so we can safely constrain accepted types to
directories and regular files.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 refs/files-backend.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 341354182bb..db5c0c7a724 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -246,10 +246,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	int dirnamelen = strlen(dirname);
 	struct strbuf refname;
 	struct strbuf path = STRBUF_INIT;
-	size_t path_baselen;
 
 	files_ref_path(refs, &path, dirname);
-	path_baselen = path.len;
 
 	d = opendir(path.buf);
 	if (!d) {
@@ -262,23 +260,22 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 
 	while ((de = readdir(d)) != NULL) {
 		struct object_id oid;
-		struct stat st;
 		int flag;
+		unsigned char dtype;
 
 		if (de->d_name[0] == '.')
 			continue;
 		if (ends_with(de->d_name, ".lock"))
 			continue;
 		strbuf_addstr(&refname, de->d_name);
-		strbuf_addstr(&path, de->d_name);
-		if (stat(path.buf, &st) < 0) {
-			; /* silently ignore */
-		} else if (S_ISDIR(st.st_mode)) {
+
+		dtype = get_dtype(de, &path, 1);
+		if (dtype == DT_DIR) {
 			strbuf_addch(&refname, '/');
 			add_entry_to_dir(dir,
 					 create_dir_entry(dir->cache, refname.buf,
 							  refname.len));
-		} else {
+		} else if (dtype == DT_REG) {
 			if (!refs_resolve_ref_unsafe(&refs->base,
 						     refname.buf,
 						     RESOLVE_REF_READING,
@@ -308,7 +305,6 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 					 create_ref_entry(refname.buf, &oid, flag));
 		}
 		strbuf_setlen(&refname, dirnamelen);
-		strbuf_setlen(&path, path_baselen);
 	}
 	strbuf_release(&refname);
 	strbuf_release(&path);
-- 
gitgitgadget

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox