From: Eric Wong <e@80x24.org>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Emily Shaffer <emilyshaffer@google.com>,
Jeff King <peff@peff.net>,
Jonathan Tan <jonathantanmy@google.com>,
git@vger.kernel.org
Subject: Re: Git in Outreachy December 2019?
Date: Tue, 24 Sep 2019 00:55:29 +0000
Message-ID: <20190924005529.GA8354@dcvr>
In-Reply-To: <nycvar.QRO.7.76.6.1909171158090.15067@tvgsbejvaqbjf.bet>
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Mon, 16 Sep 2019, Emily Shaffer wrote:
> > - try and make progress towards running many tests from a single test
> > file in parallel - maybe this is too big, I'm not sure if we know how
> > many of our tests are order-dependent within a file for now...
>
> Another, potentially more rewarding, project would be to modernize our
> test suite framework, so that it is not based on Unix shell scripting,
> but on C instead.
I worry more C would reduce the number of contributors (some of
the C rewrites already scared me off hacking years ago). I
figure more users are familiar with sh than C.
It would also widen the gap between the tests and how actual
users run git from the command line.
> The fact that it is based on Unix shell scripting not only costs a lot
> of speed, especially on Windows, it also limits us quite a bit, and I am
> talking about a lot more than just the awkwardness of having to think
> about options of BSD vs GNU variants of common command-line tools.
I agree that it costs a lot of time, even though I'm on Linux
using dash as /bin/sh plus eatmydata (albeit on an ancient laptop).
> For example, many, many, if not all, test cases, spend the majority of
> their code on setting up specific scenarios. I don't know about you,
> but personally I have to dive into many of them when things fail (and I
> _dread_ the numbers 0021, 0025 and 3070, let me tell you) and I really
> have to say that most of that code is hard to follow and does not make
> it easy to form a mental model of what the code tries to accomplish.
>
> To address this, a while ago Thomas Rast started to use `fast-export`ed
> commit histories in test scripts (see e.g. `t/t3206/history.export`). I
> still find that this fails to make it easier for occasional readers to
> understand the ideas underlying the test cases.
>
> Another approach is to document heavily the ideas first, then use code
> to implement them. For example, t3430 starts with this:
>
> [...]
>
> Initial setup:
>
>     -- B --                   (first)
>    /       \
>  A - C - D - E - H            (master)
>    \    \       /
>     \    F - G                (second)
>      \
>       Conflicting-G
>
> [...]
>
> test_commit A &&
> git checkout -b first &&
> test_commit B &&
> git checkout master &&
> test_commit C &&
> test_commit D &&
> git merge --no-commit B &&
> test_tick &&
> git commit -m E &&
> git tag -m E E &&
> git checkout -b second C &&
> test_commit F &&
> test_commit G &&
> git checkout master &&
> git merge --no-commit G &&
> test_tick &&
> git commit -m H &&
> git tag -m H H &&
> git checkout A &&
> test_commit conflicting-G G.t
>
> [...]
>
> While this is _somewhat_ better than having only the code, I am still
> unhappy about it: this wall of `test_commit` lines interspersed with
> other commands is very hard to follow.
Agreed. More on the readability part below...
As far as speeding that up, I think moving some parts
of test setup to Makefiles + fast-import/fast-export would give
us a nice balance of speed + maintainability:
1. initial setup is done using normal commands (or graph drawing tool)
2. the result of setup is "built" with fast-export
3. test uses fast-import
Makefile rules would prevent subsequent test runs from repeating
steps 1 and 2.
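
As a rough sketch of those three steps in plain sh (not a tested
patch): the `test -s` check below stands in for what a make rule with
the stream file as its target would do, and `cached_history` and its
arguments are made-up names for illustration only.

```shell
#!/bin/sh
# Hypothetical helper, not part of git's test-lib.
# usage: cached_history <stream-file> <setup-command> [args...]
cached_history () {
	stream=$1
	shift
	if ! test -s "$stream"
	then
		# Steps 1 and 2: run the normal setup commands once in a
		# scratch repository, then "build" the fast-export stream.
		tmp=$(mktemp -d) &&
		(
			cd "$tmp" &&
			git init -q &&
			"$@" >&2 &&
			git fast-export --all
		) >"$stream.tmp" &&
		rm -rf "$tmp" &&
		mv "$stream.tmp" "$stream" || return 1
	fi
	# Step 3: the test only pays for a single fast-import.
	# (This populates refs and objects, not the working tree.)
	git fast-import --quiet <"$stream"
}
```

A test could then call something like
`cached_history history.export setup_history` (both names hypothetical)
inside its trash directory; only the first run executes the setup
commands, later runs go straight to the import.
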
> If we were to (slowly) convert our test suite framework to C, we could
> change that.
>
> One idea would be to allow recreating commit history from something that
> looks like the output of `git log`, or even `git log --graph --oneline`,
> much like `git mktree` (which really should have been a test helper
> instead of a Git command, but I digress) takes something that looks like
> the output of `git ls-tree` and creates a tree object from it.
I've been playing with Graph::Easy (Perl5 module) in other
projects, and I also think the setup could be more easily
expressed with a declarative language (e.g. GNU make)
> Another thing that would be much easier if we moved more and more parts
> of the test suite framework to C: we could implement more powerful
> assertions, a lot more easily. For example, the trace output of a failed
> `test_i18ngrep` (or `mingw_test_cmp`!!!) could be made a lot more
> focused on what is going wrong than on cluttering the terminal window
> with almost useless lines which are tedious to sift through.
I fail to see how language choice here matters. But then again,
I have plenty of experience writing bad code in ALL languages I
know :>
> Likewise, having a framework in C would make it a lot easier to improve
> debugging, e.g. by making test scripts "resumable" (guarded by an
> option, it could store a complete state, including a copy of the trash
> directory, before executing commands, which would allow "going back in
> time" and calling a failing command with a debugger, or with valgrind, or
> just seeing whether the command would still fail, i.e. whether the test
> case is flaky).
Resumability sounds like a perfect job for GNU make.
(That said, I don't know whether you use make or something else to
build Git for Windows.)
> In many ways, our current test suite seems to test Git's functionality
> as much as (core) contributors' abilities to implement test cases in
> Unix shell script, _correctly_, and maybe also contributors' patience.
> You could say that it tests for the wrong thing at least half of the
> time, by design.
Basic (not advanced) sh is already a prerequisite for using git.
Writing correct code and tests in ANY language is still a
challenge for me; but I'm least convinced a low-level language
such as C is the right language for writing integration tests in.
C is fine for unit tests, and maybe we can use more unit tests
and fewer integration tests.
> It might look like a somewhat less important project, but given that we
> exercise almost 150,000 test cases with every CI build, I think it does
> make sense to grind our axe for a while, so to say.
Something that would benefit both users and regular contributors
is the use and adoption of more batch and eval-friendly interfaces.
e.g. fast-import/export, cat-file --batch, for-each-ref --perl...
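
To illustrate what "batch-friendly" buys, here is a small sketch
comparing one git process per object with a single `cat-file
--batch-check` process fed on stdin. The function names are made up
for this example, and nothing is claimed here about how large the
speed difference is in practice.

```shell
#!/bin/sh
# Both helpers read object names on stdin, one per line, and print
# each object's type. The names are illustrative only.

# Slow variant: spawns one git process per object name.
types_one_by_one () {
	while read -r oid
	do
		git cat-file -t "$oid" || return 1
	done
}

# Batch variant: a single git process answers every query.
types_batched () {
	git cat-file --batch-check='%(objecttype)'
}
```

For example, `git rev-list --all | types_batched` prints one type per
commit while starting git only twice, however long the history is.
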
I haven't used hg since 2005, but I know Mercurial has a command
server nowadays to get rid of a lot of startup overhead,
and maybe git could steal that idea, too...
> Therefore, it might be a really good project to modernize our test
> suite. To take ideas from modern test frameworks such as Jest and try to
> bring them to C. Which means that new contributors would probably be
> better suited to work on this project than Git old-timers!
>
> And the really neat thing about this project is that it could be done
> incrementally.
I hope to find time to hack some more batch/eval-friendly stuff
that can make scripting git more performant; but no idea on my
availability :<