git.vger.kernel.org archive mirror
* [PATCH v4 00/24] Index-v5
@ 2013-11-27 12:00 Thomas Gummerer
From: Thomas Gummerer @ 2013-11-27 12:00 UTC
  To: git
  Cc: t.gummerer, gitster, tr, mhagger, pclouds, robin.rosenberg,
	sunshine, ramsay

Hi,

previous rounds (without API) are at $gmane/202752, $gmane/202923,
$gmane/203088 and $gmane/203517; the previous rounds with API are at
$gmane/229732, $gmane/230210 and $gmane/232488.  Thanks to Duy for
reviewing the last round and to Junio, Ramsay and Eric for additional
comments.

Since the last round I've added a POC for partial writing, resulting
in the following performance improvements for update-index:

Test                                        1063432           HEAD
------------------------------------------------------------------------------------
0003.2: v[23]: update-index                 0.60(0.38+0.20)   0.76(0.36+0.17) +26.7%
0003.3: v[23]: grep nonexistent -- subdir   0.28(0.17+0.11)   0.28(0.18+0.09) +0.0%
0003.4: v[23]: ls-files -- subdir           0.26(0.15+0.10)   0.24(0.14+0.09) -7.7%
0003.7: v[23] update-index                  0.59(0.36+0.22)   0.58(0.36+0.20) -1.7%
0003.9: v4: update-index                    0.46(0.28+0.17)   0.45(0.30+0.11) -2.2%
0003.10: v4: grep nonexistent -- subdir     0.26(0.14+0.11)   0.21(0.14+0.07) -19.2%
0003.11: v4: ls-files -- subdir             0.24(0.14+0.10)   0.20(0.12+0.08) -16.7%
0003.14: v4 update-index                    0.49(0.31+0.18)   0.65(0.34+0.17) +32.7%
0003.16: v5: update-index                   0.53(0.30+0.22)   0.50(0.28+0.20) -5.7%
0003.17: v5: ls-files                       0.27(0.15+0.12)   0.27(0.17+0.10) +0.0%
0003.18: v5: grep nonexistent -- subdir     0.02(0.01+0.01)   0.03(0.01+0.01) +50.0%
0003.19: v5: ls-files -- subdir             0.02(0.00+0.02)   0.02(0.01+0.01) +0.0%
0003.22: v5 update-index                    0.53(0.29+0.23)   0.02(0.01+0.01) -96.2%
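
For readers checking the last column: it appears to be the relative
change in elapsed time (the first number in each cell) between the two
revisions.  A minimal sketch of that arithmetic, assuming this reading
of the table is right; percent_change() and format_change() are
illustrative names, not part of the perf library:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Relative change of `after` with respect to `before`, in percent.
 * Negative values mean the newer revision was faster.
 */
double percent_change(double before, double after)
{
	assert(before > 0.0);	/* a zero baseline has no relative change */
	return (after - before) / before * 100.0;
}

/* Format the change with an explicit sign and one decimal, e.g. "+26.7%". */
void format_change(char *buf, size_t len, double before, double after)
{
	snprintf(buf, len, "%+.1f%%", percent_change(before, after));
}
```

For example, test 0003.2 goes from 0.60s to 0.76s elapsed, that is
(0.76 - 0.60) / 0.60, or about +26.7%, and test 0003.22 goes from
0.53s to 0.02s, or about -96.2%.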

Given this, I don't think a complete change of the in-core format for
the cache-entries is necessary to take full advantage of the new index
file format.  Instead some changes to the current in-core format would
work well with the new on-disk format.

The current in-memory format fits the internal needs of git fairly well,
so I don't think changing it to fit a better index file format would
make a lot of sense, given that we can take advantage of the new format
with the existing in-memory format.

This series doesn't use kb/fast-hashmap yet, but that should be fairly
simple to change if the series is deemed a good change.  The
update-index performance tests require tg/perf-lib-test-perf-cleanup.

Other changes, made following the review comments, are:

documentation: add documentation of the index-v5 file format
  - Update documentation to say that directory flags are now 32 bits
    wide, which makes aligned access simpler
  - offset_to_offset is no longer included in the checksum for files.
    It's unnecessary.

read-cache: read index-v5
  - Fix reading when pathspecs at different directory levels are given
  - Use init_directory_entry to initialize all fields in a new
    directory entry
  - Use memset to simplify the create_new_conflict function
  - Add comments to explain -5 when reading directories and files
  - Add comments for the more complex functions
  - Add name flex_array to the end of ondisk_directory_entry for
    simplified reading
  - Add name flex_array to the end of ondisk_cache_entry for
    simplified reading
  - Move conflict reading functions to next patch
  - Mark file-local functions as static
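
To illustrate the flex_array items above, here is a minimal sketch of
how a trailing name member can simplify reading a variable-length
entry from a raw buffer.  The struct layout, its field names and
read_example_entry() are hypothetical and chosen for illustration
only; they are not the definitions from the patch, and byte-order
conversion of the fixed fields is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
struct example_ondisk_entry {
	uint32_t flags;
	uint32_t crc;
	char name[];	/* flexible array member: NUL-terminated name */
};

/*
 * Copy one entry out of a raw buffer (for example an mmap'ed file).
 * Returns a heap-allocated entry that the caller must free(), or NULL
 * if the buffer is too short or the name is not NUL-terminated.
 */
struct example_ondisk_entry *read_example_entry(const unsigned char *buf,
						size_t len)
{
	const size_t fixed = offsetof(struct example_ondisk_entry, name);
	const unsigned char *nul;
	size_t name_len;
	struct example_ondisk_entry *e;

	assert(buf != NULL);
	if (len <= fixed)
		return NULL;
	/* Look for the terminating NUL without running past the buffer. */
	nul = memchr(buf + fixed, '\0', len - fixed);
	if (!nul)
		return NULL;
	name_len = (size_t)(nul - (buf + fixed));

	e = malloc(fixed + name_len + 1);
	if (!e)
		return NULL;
	/* Fixed fields and name are contiguous, so one copy is enough. */
	memcpy(e, buf, fixed + name_len + 1);
	return e;
}
```

Because the name is addressed as e->name, the reader needs no separate
pointer arithmetic to find where the fixed fields end.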

read-cache: read resolve-undo data
  - Add comments for the more complex function
  - Read conflicts + resolve undo data as extension

read-cache: read cache-tree in index-v5
  - Add comments for the more complex function
  - Instead of sorting the directory entries, sort the cache-tree
    directly.  This also required changing the algorithms with which
    the cache entries are extracted from the directory tree.

read-cache: write index-v5
  - Free pointers allocated by super_directory
  - Rewrite condition as suggested by Duy
  - Don't check for CE_REMOVE'd entries in the writing code, they are
    already checked in the compile_directory_data code
  - Remove overly complicated directory size calculation since flags
    are now 32 bits wide

read-cache: write resolve-undo data for index-v5
  - Free pointers allocated by super_directory
  - Write conflicts + resolve undo data as extension

introduce GIT_INDEX_VERSION environment variable
  - Add documentation for GIT_INDEX_VERSION
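
As a rough illustration of how an environment variable like
GIT_INDEX_VERSION could be turned into a version number, here is a
minimal sketch.  index_version_from_env(), the fallback behaviour and
the accepted range of 2 to 5 are assumptions made for this example,
not the code from the patch.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Pick an index version from an environment-style string.
 * `value` is what getenv("GIT_INDEX_VERSION") would return and may be
 * NULL.  Returns `fallback` if the value is unset, empty, not a plain
 * decimal number, or outside the assumed range [2, 5].
 */
int index_version_from_env(const char *value, int fallback)
{
	char *end;
	long v;

	if (!value || !*value)
		return fallback;
	errno = 0;
	v = strtol(value, &end, 10);
	if (errno || *end != '\0' || v < 2 || v > 5)
		return fallback;
	return (int)v;
}
```

A caller would use it as
index_version_from_env(getenv("GIT_INDEX_VERSION"), 3), so that a
missing or malformed setting silently keeps the default.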

test-lib: allow setting the index format version

Removed commits:
  - read-cache: don't check uid, gid, ino
  - read-cache: use fixed width integer types (independently in pu)
  - read-cache: clear version in discard_index()

Typos fixed as suggested by Eric Sunshine

Thomas Gummerer (22):
  read-cache: split index file version specific functionality
  read-cache: move index v2 specific functions to their own file
  read-cache: Re-read index if index file changed
  add documentation for the index api
  read-cache: add index reading api
  make sure partially read index is not changed
  grep.c: use index api
  ls-files.c: use index api
  documentation: add documentation of the index-v5 file format
  read-cache: make in-memory format aware of stat_crc
  read-cache: read index-v5
  read-cache: read resolve-undo data
  read-cache: read cache-tree in index-v5
  read-cache: write index-v5
  read-cache: write index-v5 cache-tree data
  read-cache: write resolve-undo data for index-v5
  update-index.c: rewrite index when index-version is given
  introduce GIT_INDEX_VERSION environment variable
  test-lib: allow setting the index format version
  t1600: add index v5 specific tests
  POC for partial writing
  perf: add partial writing test

Thomas Rast (1):
  p0003-index.sh: add perf test for the index formats

 Documentation/git.txt                            |    5 +
 Documentation/technical/api-in-core-index.txt    |   56 +-
 Documentation/technical/index-file-format-v5.txt |  294 +++++
 Makefile                                         |   10 +
 builtin/apply.c                                  |    2 +
 builtin/grep.c                                   |   69 +-
 builtin/ls-files.c                               |   36 +-
 builtin/update-index.c                           |   50 +-
 cache-tree.c                                     |   15 +-
 cache-tree.h                                     |    2 +
 cache.h                                          |  115 +-
 lockfile.c                                       |    2 +-
 read-cache-v2.c                                  |  561 +++++++++
 read-cache-v5.c                                  | 1406 ++++++++++++++++++++++
 read-cache.c                                     |  691 +++--------
 read-cache.h                                     |   67 ++
 resolve-undo.c                                   |    1 +
 t/perf/p0003-index.sh                            |   74 ++
 t/t1600-index-v5.sh                              |   25 +
 t/t2101-update-index-reupdate.sh                 |   12 +-
 t/test-lib-functions.sh                          |    5 +
 t/test-lib.sh                                    |    3 +
 test-index-version.c                             |    6 +
 unpack-trees.c                                   |    3 +-
 24 files changed, 2921 insertions(+), 589 deletions(-)
 create mode 100644 Documentation/technical/index-file-format-v5.txt
 create mode 100644 read-cache-v2.c
 create mode 100644 read-cache-v5.c
 create mode 100644 read-cache.h
 create mode 100755 t/perf/p0003-index.sh
 create mode 100755 t/t1600-index-v5.sh

-- 
1.8.4.2


Thread overview (41+ messages):
2013-11-27 12:00 [PATCH v4 00/24] Index-v5 Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 01/24] t2104: Don't fail for index versions other than [23] Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 02/24] read-cache: split index file version specific functionality Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 03/24] read-cache: move index v2 specific functions to their own file Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 04/24] read-cache: Re-read index if index file changed Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 05/24] add documentation for the index api Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 06/24] read-cache: add index reading api Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 07/24] make sure partially read index is not changed Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 08/24] grep.c: use index api Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 09/24] ls-files.c: " Thomas Gummerer
2013-11-30  9:17   ` Duy Nguyen
2013-11-30 10:30     ` Thomas Gummerer
2013-11-30 15:39   ` Antoine Pelisse
2013-11-30 20:08     ` Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 10/24] documentation: add documentation of the index-v5 file format Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 11/24] read-cache: make in-memory format aware of stat_crc Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 12/24] read-cache: read index-v5 Thomas Gummerer
2013-11-30  9:17   ` Duy Nguyen
2013-11-30 10:40     ` Thomas Gummerer
2013-11-30 12:19   ` Antoine Pelisse
2013-11-30 20:10     ` Thomas Gummerer
2013-11-30 15:26   ` Antoine Pelisse
2013-11-30 20:27     ` Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 13/24] read-cache: read resolve-undo data Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 14/24] read-cache: read cache-tree in index-v5 Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 15/24] read-cache: write index-v5 Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 16/24] read-cache: write index-v5 cache-tree data Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 17/24] read-cache: write resolve-undo data for index-v5 Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 18/24] update-index.c: rewrite index when index-version is given Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 19/24] p0003-index.sh: add perf test for the index formats Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 20/24] introduce GIT_INDEX_VERSION environment variable Thomas Gummerer
2013-11-27 21:57   ` Eric Sunshine
2013-11-27 22:08     ` Junio C Hamano
2013-11-28  9:57       ` Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 21/24] test-lib: allow setting the index format version Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 22/24] t1600: add index v5 specific tests Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 23/24] POC for partial writing Thomas Gummerer
2013-11-30  9:58   ` Duy Nguyen
2013-11-30 10:50     ` Thomas Gummerer
2013-11-27 12:00 ` [PATCH v4 24/24] perf: add partial writing test Thomas Gummerer
2013-12-09 10:14 ` [PATCH v4 00/24] Index-v5 Thomas Gummerer
