* [PATCH 00/27] [GSOC] [RFC] cat-file: reuse ref-filter logic
@ 2021-08-13  8:22 ZheNing Hu via GitGitGadget
From: ZheNing Hu via GitGitGadget @ 2021-08-13  8:22 UTC
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

This patch series makes cat-file reuse the ref-filter logic and, along the
way, carries out some performance optimizations. The previous version is
here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/#t

It seems that zh/ref-filter-raw-data is still sitting in the next branch
(because Git is at rc2), so for now I want to show some recent performance
optimizations first.

Changes since the last version:

 1.  Use free_global_resource() to avoid memory leaks.
 2.  Skip parse_object_buffer(), which brings a 12.5% performance
     improvement.
 3.  Merge two for loops in grab_person(), which brings a 2% performance
     improvement.
 4.  Remove strlen() from find_subpos().
 5.  Introduce xstrvfmt_len() and xstrfmt_len().
 6.  Remove the second parsing in format_ref_array_item(), which brings a
     1.9% performance improvement.
 7.  Introduce ref_filter_slopbuf to use instead of xstrdup("").
 8.  Add a deref member to struct used_atom to simplify the program
     logic.
 9.  Introduce symref_atom_parser() to make the program logic more concise.
 10. Use switch/case instead of if/else to improve the readability of the
     code.
 11. Reuse the final buffer, which brings a 2% performance improvement.
 12. Add a need_get_object_info flag to reduce memory comparisons.
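
The idea behind items 4 and 5 (avoiding a later strlen() by having the
formatter report the length it already knows) can be sketched roughly as
follows. This is only an illustration: fmt_with_len() is a hypothetical
name, it uses plain malloc() rather than Git's allocation wrappers, and the
real signatures of xstrvfmt_len() and xstrfmt_len() are in the patch itself,
not in this cover letter.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the series' xstrfmt_len(): format into a
 * freshly allocated buffer and report the length through *len, so the
 * caller never has to call strlen() on the result.
 */
char *fmt_with_len(size_t *len, const char *fmt, ...)
{
	va_list ap, cp;
	char *buf;
	int n;

	assert(len && fmt);

	va_start(ap, fmt);
	va_copy(cp, ap);
	n = vsnprintf(NULL, 0, fmt, cp);	/* first pass: measure only */
	va_end(cp);
	if (n < 0) {
		va_end(ap);
		return NULL;
	}

	buf = malloc((size_t)n + 1);
	if (buf)
		vsnprintf(buf, (size_t)n + 1, fmt, ap);	/* second pass: write */
	va_end(ap);

	if (!buf)
		return NULL;
	*len = (size_t)n;
	return buf;
}
```

A caller would pass the address of a size_t, use the returned length
directly, and free() the buffer when done.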

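Item 7 replaces an allocated empty string with a shared buffer, similar in
spirit to strbuf_slopbuf in strbuf.c. A minimal sketch of that pattern,
assuming a value type that owns its string; the demo_* names are made up for
illustration, and how ref_filter_slopbuf is actually used in ref-filter.c is
not shown here:

```c
#include <assert.h>
#include <stdlib.h>

/* Shared one-byte buffer that always holds the empty string "". */
char demo_slopbuf[1];

struct demo_value {
	char *s;
};

/* Point at the shared empty string instead of allocating a copy of "". */
void demo_value_set_empty(struct demo_value *v)
{
	assert(v != NULL);
	v->s = demo_slopbuf;
}

/* Free only what was really allocated; the shared buffer must survive. */
void demo_value_release(struct demo_value *v)
{
	assert(v != NULL);
	if (v->s != demo_slopbuf)
		free(v->s);
	v->s = demo_slopbuf;
}
```

The saving is one malloc()/free() pair per empty value, at the cost of a
pointer comparison when releasing.
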
These are the performance test results after the optimizations:

Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.07+0.00)   0.09(0.08+0.01) +12.5%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.02)   0.08(0.06+0.02) +33.3%
1006.4: cat-file --batch                    0.49(0.46+0.02)   0.50(0.47+0.03) +2.0%
1006.5: cat-file --batch with atoms         0.47(0.45+0.01)   0.49(0.47+0.02) +4.3%


We can see that with the current patches, the performance of git cat-file
--batch is very close to upstream/master. The improvement for git cat-file
--batch-check is less clear, because its measured difference from
upstream/master is affected by noise and may fall anywhere in the range of
+12.5% to +50.0%. Looking at it optimistically, the execution time of git
cat-file --batch-check is itself quite short, so the effect of the
optimizations is naturally harder to see.
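
To see why such short runs are sensitive to noise, the percentages can be
recomputed by hand. The sketch below assumes the timings are reported with
0.01s resolution, as in the table; the function name is illustrative and
not part of the series.

```c
#include <assert.h>

/*
 * Relative change of `cur` against `base`, in percent. Both timings are
 * given in centiseconds (hundredths of a second), the resolution at which
 * the table reports them, so 0.08s is passed as 8.
 */
double rel_change_percent(int base_cs, int cur_cs)
{
	assert(base_cs > 0);
	return 100.0 * (cur_cs - base_cs) / base_cs;
}
```

With a baseline of 0.08s, a single 0.01s tick (8 to 9) is already +12.5%,
while the same tick on the 0.49s --batch baseline (49 to 50) is only about
+2.0%.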

As GSoC is about to end and this patch series will probably need adjusting
for some time, I can only hope that it will be accepted in the future.

Note: The earlier patches in this series duplicate the content of
zh/ref-filter-raw-data.

ZheNing Hu (27):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom
  [GSOC] ref-filter: pass get_object() return value to their callers
  [GSOC] ref-filter: introduce free_ref_array_item_value() function
  [GSOC] ref-filter: add cat_file_mode to ref_format
  [GSOC] ref-filter: modify the error message and value in get_object
  [GSOC] cat-file: add has_object_file() check
  [GSOC] cat-file: change batch_objects parameter name
  [GSOC] cat-file: create p1006-cat-file.sh
  [GSOC] cat-file: reuse ref-filter logic
  [GSOC] cat-file: reuse err buf in batch_object_write()
  [GSOC] cat-file: re-implement --textconv, --filters options
  [GSOC] ref-filter: remove grab_oid() function
  [GSOC] ref-filter: performance optimization by skip
    parse_object_buffer
  [GSOC] ref-filter: use atom_type and merge two for loop in grab_person
  [GSOC] ref-filter: remove strlen from find_subpos
  [GSOC] ref-filter: introducing xstrvfmt_len() and xstrfmt_len()
  [GSOC] ref-filter: remove second parsing in format_ref_array_item
  [GSOC] ref-filter: introduction ref_filter_slopbuf
  [GSOC] ref-filter: add deref member to struct used_atom
  [GSOC] ref-filter: introduce symref_atom_parser()
  [GSOC] ref-filter: use switch case instread of if else
  [GSOC] ref-filter: reuse finnal buffer if no stack need
  [GSOC] ref-filter: add need_get_object_info flag to struct expand_data

 Documentation/git-cat-file.txt     |   6 +
 Documentation/git-for-each-ref.txt |   9 +
 builtin/branch.c                   |   2 +
 builtin/cat-file.c                 | 275 +++------
 builtin/for-each-ref.c             |   3 +-
 builtin/tag.c                      |   4 +-
 builtin/verify-tag.c               |   2 +
 quote.c                            |  17 +
 quote.h                            |   1 +
 ref-filter.c                       | 902 +++++++++++++++++++----------
 ref-filter.h                       |  30 +-
 strbuf.c                           |  21 +
 strbuf.h                           |   6 +
 t/perf/p1006-cat-file.sh           |  28 +
 t/t1006-cat-file.sh                | 239 ++++++++
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++
 t/t6301-for-each-ref-errors.sh     |   2 +-
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 20 files changed, 1283 insertions(+), 511 deletions(-)
 create mode 100755 t/perf/p1006-cat-file.sh


base-commit: 5d213e46bb7b880238ff5ea3914e940a50ae9369
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1016%2Fadlternative%2Fcat-file-reuse-ref-filter-logic-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1016/adlternative/cat-file-reuse-ref-filter-logic-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1016
-- 
gitgitgadget
