From: Thomas Gummerer <t.gummerer@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Thomas Rast <trast@inf.ethz.ch>,
Michael Haggerty <mhagger@alum.mit.edu>,
Junio C Hamano <gitster@pobox.com>,
Robin Rosenberg <robin.rosenberg@dewire.com>
Subject: [PATCH 5.5/22] Add documentation for the index api
Date: Mon, 08 Jul 2013 22:54:23 +0200
Message-ID: <871u78rcw0.fsf@gmail.com>
In-Reply-To: <CACsJy8CtOWjpxKuNhQXYjPAv8MU0U6yTBEuQeqm0kxqVne6NjQ@mail.gmail.com>
Document the new index api and add examples of how it should be used
instead of the old functions directly accessing the index.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
Duy Nguyen <pclouds@gmail.com> writes:
> Hmm.. I was confused actually (documentation on the api would help
> greatly).
As promised, a draft of the documentation for the index api as it is in
this series.
Documentation/technical/api-in-core-index.txt | 108 +++++++++++++++++++++++++-
1 file changed, 106 insertions(+), 2 deletions(-)
diff --git a/Documentation/technical/api-in-core-index.txt b/Documentation/technical/api-in-core-index.txt
index adbdbf5..5269bb1 100644
--- a/Documentation/technical/api-in-core-index.txt
+++ b/Documentation/technical/api-in-core-index.txt
@@ -1,14 +1,116 @@
in-core index API
=================
+Reading API
+-----------
+
+`read_index()`::
+ Read the whole index file from disk.
+
+`index_name_pos(name, namelen)`::
+ Find a cache_entry with name in the index. Returns pos if an
+ entry is matched exactly and -pos-1 otherwise, where pos is the
+ position at which the entry would be inserted.
+ e.g.
+ index:
+ file1
+ file2
+ path/file1
+ zzz
+
+ index_name_pos("path/file1", 10) returns 2, while
+ index_name_pos("path", 4) returns -3
+
+`read_index_filtered(opts)`::
+ This function behaves differently for index-v2 and index-v5.
+
+ For index-v2 it simply reads the whole index as read_index()
+ does, so we are sure we don't have to reload anything if the
+ user wants a different filter. It also sets the filter_opts
+ in the index_state, which is used to limit the results when
+ iterating over the index with for_each_index_entry().
+
+ The whole index is read because a partial read of index-v2 is
+ no faster than a full one, and reading everything avoids having
+ to re-read the index later.
+
+ For index-v5 it creates an adjusted_pathspec to filter the
+ reading. First all the directory entries are read and then
+ the cache_entries in the directories that match the adjusted
+ pathspec are read. The adjusted pathspec can match more
+ entries than the pathspec given, so the filter_opts are also
+ set in the index_state. Entries that match only the adjusted
+ pathspec are then filtered out when iterating over the index
+ with for_each_index_entry().
+
+`get_index_entry_by_name(name, namelen, &ce)`::
+ Looks up the cache_entry matching name and returns it via the
+ &ce parameter. If a cache entry is matched exactly, 1 is
+ returned, otherwise 0. For an example see index_name_pos().
+ This function should be used instead of the index_name_pos()
+ function to retrieve cache entries.
+
+`for_each_index_entry(fn, cb_data)`::
+ Iterates over all cache_entries in the index filtered by
+ filter_opts in the index_state. For each cache entry fn is
+ executed with cb_data as callback data. From within fn
+ do `return 0` to continue, or `return 1` to break the loop.
+
+`next_index_entry(ce)`::
+ Returns the cache_entry that follows ce.
+
+`index_change_filter_opts(opts)`::
+ This function again behaves slightly differently for
+ index-v2 and index-v5.
+
+ For index-v2 it simply changes the filter_opts, so
+ for_each_index_entry() uses the changed filter_opts to iterate
+ over a different set of cache entries.
+
+ For index-v5 it refreshes the index if the filter_opts have
+ changed and sets the new filter_opts in the index state, again
+ to iterate over a different set of cache entries as with
+ index-v2.
+
+ This has some optimization potential: when the opts get
+ stricter (less of the index should be read) nothing needs to
+ be reloaded, but currently the index is reloaded anyway.
+
+Using the new index api
+-----------------------
+
+Loops over a specific set of index entries are currently written as:
+ i = start_index;
+ while (i < active_nr) { ce = active_cache[i]; do_something(ce); i++; }
+
+They should be rewritten as:
+ ce = start;
+ while (ce) { do_something(ce); ce = next_cache_entry(ce); }
+
+which is the equivalent operation but hides the in-memory format of
+the index from the user.
+
+To get a cache entry, get_cache_entry_by_name() should be used
+instead of cache_name_pos(). E.g.:
+ struct cache_entry *ce;
+ int pos = cache_name_pos(name, namelen);
+ if (pos < 0) { do_something(); }
+ else { ce = active_cache[pos]; do_something_else(ce); }
+
+should be written as:
+ struct cache_entry *ce;
+ int ret = get_cache_entry_by_name(name, namelen, &ce);
+ if (!ret) { do_something(); }
+ else { do_something_else(ce); }
+
+TODO
+----
Talk about <read-cache.c> and <cache-tree.c>, things like:
* cache -> the_index macros
-* read_index()
* write_index()
* ie_match_stat() and ie_modified(); how they are different and when to
use which.
-* index_name_pos()
* remove_index_entry_at()
* remove_file_from_index()
* add_file_to_index()
@@ -18,4 +120,6 @@ Talk about <read-cache.c> and <cache-tree.c>, things like:
* cache_tree_invalidate_path()
* cache_tree_update()
+
+
(JC, Linus)
--
1.8.3.453.g1dfc63d