From: Paul Tan <pyokagan@gmail.com>
To: Git List <git@vger.kernel.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <johannes.schindelin@gmx.de>,
	Duy Nguyen <pclouds@gmail.com>,
	Stefan Beller <sbeller@google.com>,
	sam.halliday@gmail.com, Paul Tan <pyokagan@gmail.com>
Subject: [PATCH/RFC/GSoC 00/17] A barebones git-rebase in C
Date: Sat, 12 Mar 2016 18:46:20 +0800
Message-ID: <1457779597-6918-1-git-send-email-pyokagan@gmail.com>

Hi all,

Last year I rewrote git-am from shell script to C. This sped up non-interactive
git-rebase by 6-7x[1], which is really handy when rebasing multiple topic
branches.

[1] http://thread.gmane.org/gmane.comp.version-control.git/271967

However, it turns out that when working on a topic branch, I frequently use
interactive rebase instead to edit and squash commits. Unfortunately, as
git-rebase--interactive.sh is still a shell script, it is a bit slower (e.g.
taking a few seconds longer compared to non-interactive rebase when rebasing
big topic branches).

The situation is much worse on Windows: after invoking git rebase -i, it takes
a few seconds before the editor even pops up, and the actual rebase proceeds at
a snail's pace, taking around 3 minutes for a 50-patch series. That is a huge
deal-breaker, since my workflow depends on frequent commits and squashes.

As such, this year I would like to apply for GSoC to work on a rewrite of
git-rebase to C. It is slightly hefty, as there are three backends (am, merge
and interactive), along with the git-rebase.sh script.

To gauge how much code the rewrite needs, I explored rewriting the scripts in
C, then extracted some bits and polished them into a barebones git-rebase in C,
creating this patch series:

[01/17] perf: introduce performance tests for git-rebase

A simple performance test for the three rebase backends so we can compare this
C version and the shell version below.

[02/17] sha1_name: implement get_oid() and friends
[03/17] builtin-rebase: implement skeletal builtin rebase
[04/17] builtin-rebase: parse rebase arguments into a common rebase_options struct
[05/17] rebase-options: implement rebase_options_load() and rebase_options_save()

The three rebase backends (am, merge, interactive) have vastly different
capabilities, so I did not try to shoehorn them into the same interface.
However, they do share a few common options and functionality, so I introduced
the common rebase-common.c library and rebase_options struct.

In the above patches we implement the essential arguments for a rebase: the
upstream, branch_name and --onto <newbase>.
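
For illustration only, here is a minimal sketch of what parsing those three
arguments into one shared struct could look like. It is not the code from this
series: the struct name rebase_opts_sketch and the function parse_rebase_args()
are hypothetical, and argv is assumed to hold only the arguments that follow
"git rebase".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a shared options struct; not the series' own. */
struct rebase_opts_sketch {
	const char *onto;        /* value of --onto <newbase> */
	const char *upstream;    /* <upstream>, or NULL if not given */
	const char *branch_name; /* <branch>, or NULL if not given */
};

/*
 * Parse "[--onto <newbase>] [<upstream> [<branch>]]".
 * argv must not include the program or subcommand name.
 * Returns 0 on success, -1 on a usage error.
 */
int parse_rebase_args(int argc, const char **argv,
		      struct rebase_opts_sketch *opts)
{
	int i = 0;

	opts->onto = NULL;
	opts->upstream = NULL;
	opts->branch_name = NULL;

	if (i < argc && !strcmp(argv[i], "--onto")) {
		if (i + 1 >= argc)
			return -1; /* --onto needs a value */
		opts->onto = argv[i + 1];
		i += 2;
	}
	if (i < argc)
		opts->upstream = argv[i++];
	if (i < argc)
		opts->branch_name = argv[i++];
	if (i < argc)
		return -1; /* too many arguments */

	/* Without --onto, commits are replayed on top of <upstream>. */
	if (!opts->onto)
		opts->onto = opts->upstream;
	return 0;
}
```

Keeping the parsed values in one struct means each backend can be handed the
same data without re-parsing the command line.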

[06/17] rebase-am: introduce am backend for builtin rebase

This patch implements a barebones rebase-am backend.

[07/17] rebase-common: implement refresh_and_write_cache()
[08/17] rebase-common: let refresh_and_write_cache() take a flags argument
[09/17] rebase-common: implement cache_has_unstaged_changes()
[10/17] rebase-common: implement cache_has_uncommitted_changes()
[11/17] rebase-merge: introduce merge backend for builtin rebase

These patches implement a barebones rebase-merge backend.

[12/17] rebase-todo: introduce rebase_todo_item
[13/17] rebase-todo: introduce rebase_todo_list
[14/17] status: use rebase_todo_list
[15/17] wrapper: implement append_file()
[16/17] editor: implement git_sequence_editor() and launch_sequence_editor()
[17/17] rebase-interactive: introduce interactive backend for builtin rebase

And these patches implement a barebones rebase-interactive backend.

With these patches the performance numbers when rebasing 50 commits on the
git.git repository are, on Linux,

Before patch series:

Test                               this tree
--------------------------------------------------
3400.2: rebase --onto master^      1.10(0.84+0.06)
3402.2: rebase -m --onto master^   2.38(1.38+0.13)
3404.2: rebase -i --onto master^   3.11(1.37+0.27)

After patch series:

Test                               this tree
--------------------------------------------------
3400.2: rebase --onto master^      0.74(0.51+0.08)
3402.2: rebase -m --onto master^   1.72(1.26+0.17)
3404.2: rebase -i --onto master^   1.74(1.20+0.18)

And on Windows,

Before patch series:

Test                               this tree
----------------------------------------------------
3400.2: rebase --onto master^      10.90(0.06+0.47)
3402.2: rebase -m --onto master^   86.87(0.04+0.47)
3404.2: rebase -i --onto master^   191.65(0.09+0.44)

After patch series:

Test                               this tree
---------------------------------------------------
3400.2: rebase --onto master^      6.45(0.13+0.40)
3402.2: rebase -m --onto master^   12.32(0.13+0.40)
3404.2: rebase -i --onto master^   14.16(0.15+0.40)

(Thanks to the git-am rewrite, non-interactive rebase on Windows is already
relatively fast ;-) )

So, we have around a 1.4x-1.8x speedup for Linux users, and a 1.7x-13x speedup
for Windows users. The annoyingly long delay before the interactive editor is
launched on Windows is gone, which I'm very happy about :-)
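
The speedup ranges above follow from dividing the "before" elapsed time by the
"after" elapsed time for each test. A small sketch of that arithmetic, assuming
the timing columns use the usual t/perf layout of elapsed(user+sys); the names
perf_time, parse_perf_time() and elapsed_speedup() are made up for this
example:

```c
#include <stdio.h>

/* One timing cell, e.g. "3.11(1.37+0.27)": elapsed(user+sys), in seconds. */
struct perf_time {
	double elapsed, user, sys;
};

/* Returns 0 on success, -1 if the cell is not of the expected form. */
int parse_perf_time(const char *s, struct perf_time *t)
{
	char close;

	if (sscanf(s, "%lf(%lf+%lf%c",
		   &t->elapsed, &t->user, &t->sys, &close) != 4)
		return -1;
	return close == ')' ? 0 : -1;
}

/* Ratio of elapsed times (before / after); -1.0 on malformed input. */
double elapsed_speedup(const char *before, const char *after)
{
	struct perf_time b, a;

	if (parse_perf_time(before, &b) || parse_perf_time(after, &a))
		return -1.0;
	if (a.elapsed <= 0.0)
		return -1.0;
	return b.elapsed / a.elapsed;
}
```

For example, the Linux figures for rebase -i give 3.11 / 1.74, about 1.79, and
the Windows figures for rebase -i give 191.65 / 14.16, about 13.5. The low ends
of the two ranges come from 2.38 / 1.72 (about 1.38) and 10.90 / 6.45 (about
1.69).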

On the code side, we do get some nice things with a rewrite to C. For example,
we get the rebase-todo library for parsing and writing git-rebase-todo files,
which means that wt-status.c and rebase-interactive.c can share the same
parsing code. Although not in this patch series, rebase-interactive.c can now
also share the author-script parsing and writing code from builtin/am.c.
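
To make the idea of a shared todo parser concrete, here is a rough sketch of
parsing a single git-rebase-todo line of the form "<command> <commit>
<subject>". It is an illustration under stated assumptions, not the
rebase-todo.c from this series: todo_item_sketch and parse_todo_line() are
hypothetical names, only the pick/reword/edit/squash/fixup/exec commands and
their one-letter forms are handled, and the line is assumed to have no
trailing newline.

```c
#include <stddef.h>
#include <string.h>

enum todo_action {
	TODO_PICK, TODO_REWORD, TODO_EDIT, TODO_SQUASH,
	TODO_FIXUP, TODO_EXEC, TODO_COMMENT
};

/* Hypothetical item type for this sketch. */
struct todo_item_sketch {
	enum todo_action action;
	char rev[64];     /* commit name; empty for exec and comment lines */
	const char *rest; /* subject or exec command; points into the input */
};

static const struct {
	const char *name;
	char abbrev;
	enum todo_action action;
} todo_cmds[] = {
	{ "pick", 'p', TODO_PICK },     { "reword", 'r', TODO_REWORD },
	{ "edit", 'e', TODO_EDIT },     { "squash", 's', TODO_SQUASH },
	{ "fixup", 'f', TODO_FIXUP },   { "exec", 'x', TODO_EXEC },
};

/* Returns 0 on success, -1 if the line cannot be parsed. */
int parse_todo_line(const char *line, struct todo_item_sketch *item)
{
	size_t len, i, n = sizeof(todo_cmds) / sizeof(todo_cmds[0]);

	line += strspn(line, " \t");
	item->rev[0] = '\0';
	item->rest = line;
	if (!*line || *line == '#') {
		item->action = TODO_COMMENT; /* blank or comment line */
		return 0;
	}

	/* Match the command word against full names and abbreviations. */
	len = strcspn(line, " \t");
	for (i = 0; i < n; i++) {
		const char *name = todo_cmds[i].name;

		if ((len == strlen(name) && !strncmp(line, name, len)) ||
		    (len == 1 && line[0] == todo_cmds[i].abbrev))
			break;
	}
	if (i == n)
		return -1; /* unknown command */
	item->action = todo_cmds[i].action;
	line += len;
	line += strspn(line, " \t");

	if (item->action == TODO_EXEC) {
		item->rest = line; /* the rest is a shell command */
		return *line ? 0 : -1;
	}

	/* Everything else needs a commit name, then an optional subject. */
	len = strcspn(line, " \t");
	if (!len || len >= sizeof(item->rev))
		return -1;
	memcpy(item->rev, line, len);
	item->rev[len] = '\0';
	line += len;
	item->rest = line + strspn(line, " \t");
	return 0;
}
```

With one parser like this, both the status display and the interactive backend
could walk the same list of items instead of each re-implementing the line
format.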

Regards,
Paul

Paul Tan (17):
  perf: introduce performance tests for git-rebase
  sha1_name: implement get_oid() and friends
  builtin-rebase: implement skeletal builtin rebase
  builtin-rebase: parse rebase arguments into a common rebase_options
    struct
  rebase-options: implement rebase_options_load() and
    rebase_options_save()
  rebase-am: introduce am backend for builtin rebase
  rebase-common: implement refresh_and_write_cache()
  rebase-common: let refresh_and_write_cache() take a flags argument
  rebase-common: implement cache_has_unstaged_changes()
  rebase-common: implement cache_has_uncommitted_changes()
  rebase-merge: introduce merge backend for builtin rebase
  rebase-todo: introduce rebase_todo_item
  rebase-todo: introduce rebase_todo_list
  status: use rebase_todo_list
  wrapper: implement append_file()
  editor: implement git_sequence_editor() and launch_sequence_editor()
  rebase-interactive: introduce interactive backend for builtin rebase

 Makefile                           |  10 +-
 builtin.h                          |   1 +
 builtin/am.c                       |  16 +-
 builtin/pull.c                     |  41 +---
 builtin/rebase.c                   | 264 ++++++++++++++++++++++++++
 cache.h                            |   8 +
 editor.c                           |  27 ++-
 git.c                              |   1 +
 rebase-am.c                        | 110 +++++++++++
 rebase-am.h                        |  22 +++
 rebase-common.c                    | 220 ++++++++++++++++++++++
 rebase-common.h                    |  48 +++++
 rebase-interactive.c               | 375 +++++++++++++++++++++++++++++++++++++
 rebase-interactive.h               |  33 ++++
 rebase-merge.c                     | 256 +++++++++++++++++++++++++
 rebase-merge.h                     |  28 +++
 rebase-todo.c                      | 251 +++++++++++++++++++++++++
 rebase-todo.h                      |  55 ++++++
 sha1_name.c                        |  30 +++
 strbuf.h                           |   1 +
 t/perf/p3400-rebase.sh             |  25 +++
 t/perf/p3402-rebase-merge.sh       |  25 +++
 t/perf/p3404-rebase-interactive.sh |  26 +++
 wrapper.c                          |  23 +++
 wt-status.c                        | 100 +++-------
 25 files changed, 1863 insertions(+), 133 deletions(-)
 create mode 100644 builtin/rebase.c
 create mode 100644 rebase-am.c
 create mode 100644 rebase-am.h
 create mode 100644 rebase-common.c
 create mode 100644 rebase-common.h
 create mode 100644 rebase-interactive.c
 create mode 100644 rebase-interactive.h
 create mode 100644 rebase-merge.c
 create mode 100644 rebase-merge.h
 create mode 100644 rebase-todo.c
 create mode 100644 rebase-todo.h
 create mode 100755 t/perf/p3400-rebase.sh
 create mode 100755 t/perf/p3402-rebase-merge.sh
 create mode 100755 t/perf/p3404-rebase-interactive.sh

-- 
2.7.0

Thread overview: 59+ messages
2016-03-12 10:46 Paul Tan [this message]
2016-03-12 10:46 ` [PATCH/RFC/GSoC 01/17] perf: introduce performance tests for git-rebase Paul Tan
2016-03-16  7:58   ` Johannes Schindelin
2016-03-16 11:51     ` Paul Tan
2016-03-16 15:59       ` Johannes Schindelin
2016-03-18 11:01         ` Thomas Gummerer
2016-03-18 16:00           ` Johannes Schindelin
2016-03-20 14:00             ` Thomas Gummerer
2016-03-21  7:54               ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 02/17] sha1_name: implement get_oid() and friends Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 03/17] builtin-rebase: implement skeletal builtin rebase Paul Tan
2016-03-14 18:31   ` Stefan Beller
2016-03-15  8:01     ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 04/17] builtin-rebase: parse rebase arguments into a common rebase_options struct Paul Tan
2016-03-14 20:05   ` Stefan Beller
2016-03-15 10:54   ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 05/17] rebase-options: implement rebase_options_load() and rebase_options_save() Paul Tan
2016-03-14 20:30   ` Stefan Beller
2016-03-16  8:04     ` Johannes Schindelin
2016-03-16 12:28       ` Paul Tan
2016-03-16 17:11         ` Johannes Schindelin
2016-03-21 14:55           ` Paul Tan
2016-03-16 12:04     ` Paul Tan
2016-03-16 17:10       ` Stefan Beller
2016-03-12 10:46 ` [PATCH/RFC/GSoC 06/17] rebase-am: introduce am backend for builtin rebase Paul Tan
2016-03-16 13:21   ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 07/17] rebase-common: implement refresh_and_write_cache() Paul Tan
2016-03-14 21:10   ` Junio C Hamano
2016-03-16 12:56     ` Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 08/17] rebase-common: let refresh_and_write_cache() take a flags argument Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 09/17] rebase-common: implement cache_has_unstaged_changes() Paul Tan
2016-03-14 20:54   ` Johannes Schindelin
2016-03-14 21:52     ` Junio C Hamano
2016-03-15 11:51       ` Johannes Schindelin
2016-03-15 11:07     ` Duy Nguyen
2016-03-15 14:15       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 10/17] rebase-common: implement cache_has_uncommitted_changes() Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 11/17] rebase-merge: introduce merge backend for builtin rebase Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 12/17] rebase-todo: introduce rebase_todo_item Paul Tan
2016-03-14 13:43   ` Christian Couder
2016-03-14 20:33     ` Johannes Schindelin
2016-03-16 12:54     ` Paul Tan
2016-03-16 15:55       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 13/17] rebase-todo: introduce rebase_todo_list Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 14/17] status: use rebase_todo_list Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 15/17] wrapper: implement append_file() Paul Tan
2016-03-12 10:46 ` [PATCH/RFC/GSoC 16/17] editor: implement git_sequence_editor() and launch_sequence_editor() Paul Tan
2016-03-15  7:00   ` Johannes Schindelin
2016-03-16 13:06     ` Paul Tan
2016-03-16 18:21       ` Johannes Schindelin
2016-03-12 10:46 ` [PATCH/RFC/GSoC 17/17] rebase-interactive: introduce interactive backend for builtin rebase Paul Tan
2016-03-15  7:57   ` Johannes Schindelin
2016-03-15 16:48     ` Paul Tan
2016-03-15 19:45       ` Johannes Schindelin
2016-03-14 12:15 ` [PATCH/RFC/GSoC 00/17] A barebones git-rebase in C Duy Nguyen
2016-03-14 17:32   ` Stefan Beller
2016-03-14 18:43   ` Junio C Hamano
2016-03-16 12:46     ` Paul Tan
2016-03-14 20:44   ` Johannes Schindelin
