From: Thomas Gummerer <t.gummerer@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Thomas Rast <trast@inf.ethz.ch>,
	Michael Haggerty <mhagger@alum.mit.edu>,
	Junio C Hamano <gitster@pobox.com>,
	Robin Rosenberg <robin.rosenberg@dewire.com>
Subject: [PATCH 5.5/22] Add documentation for the index api
Date: Mon, 08 Jul 2013 22:54:23 +0200
Message-ID: <871u78rcw0.fsf@gmail.com>
In-Reply-To: <CACsJy8CtOWjpxKuNhQXYjPAv8MU0U6yTBEuQeqm0kxqVne6NjQ@mail.gmail.com>

Document the new index api and add examples of how it should be used
instead of the old functions directly accessing the index.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---

Duy Nguyen <pclouds@gmail.com> writes:

> Hmm.. I was confused actually (documentation on the api would help
> greatly).

As promised, here is a draft of the documentation for the index API as
it stands in this series.
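
To make the return convention of index_name_pos() concrete, here is a
minimal standalone sketch.  It is not code from this series: toy_index,
toy_name_compare() and toy_name_pos() are made-up names, and the index
is modeled as a plain sorted array of strings (an assumption; the real
lookup also has to deal with things like stages).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a sorted index; not git's real data structure. */
static const char *toy_index[] = { "file1", "file2", "path/file1", "zzz" };
static const int toy_nr = 4;

/* Bytewise comparison; on a common prefix the shorter name sorts first. */
static int toy_name_compare(const char *a, size_t alen,
			    const char *b, size_t blen)
{
	size_t len = alen < blen ? alen : blen;
	int cmp = memcmp(a, b, len);

	if (cmp)
		return cmp;
	if (alen < blen)
		return -1;
	if (alen > blen)
		return 1;
	return 0;
}

/*
 * Hypothetical helper: returns pos on an exact match, otherwise
 * -pos-1, where pos is where the name would have to be inserted.
 */
int toy_name_pos(const char *name, size_t namelen)
{
	int first = 0, last = toy_nr;

	assert(name != NULL);
	while (first < last) {
		int mid = first + (last - first) / 2;
		int cmp = toy_name_compare(name, namelen, toy_index[mid],
					   strlen(toy_index[mid]));

		if (!cmp)
			return mid;
		if (cmp < 0)
			last = mid;
		else
			first = mid + 1;
	}
	return -first - 1;
}
```

Under this convention toy_name_pos("path/file1", 10) gives 2, and
toy_name_pos("path", 4) gives -3, because "path" would be inserted at
position 2.  A caller can recover the insertion position from a
negative result with pos = -ret - 1.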

Documentation/technical/api-in-core-index.txt | 108 +++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 2 deletions(-)
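
The callback contract documented for for_each_index_entry() (return 0
to continue, return 1 to stop) can be sketched as follows.  This is an
illustration only: struct toy_entry, toy_for_each_entry() and the other
toy_* names are hypothetical, and chaining the entries through a next
pointer is an assumption made to keep the example short, not a
statement about the in-memory format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy entry; the real struct cache_entry carries much more. */
struct toy_entry {
	const char *name;
	struct toy_entry *next;	/* assumption: entries form a chain */
};

typedef int (*toy_each_fn)(struct toy_entry *ce, void *cb_data);

/* Call fn for each entry; stop as soon as fn returns non-zero. */
int toy_for_each_entry(struct toy_entry *head, toy_each_fn fn,
		       void *cb_data)
{
	struct toy_entry *ce;

	assert(fn != NULL);
	for (ce = head; ce; ce = ce->next) {
		int ret = fn(ce, cb_data);

		if (ret)
			return ret;
	}
	return 0;
}

struct count_until {
	const char *stop_at;
	int seen;
};

/* Count every visited entry; return 1 (stop) once stop_at is seen. */
static int count_until_cb(struct toy_entry *ce, void *cb_data)
{
	struct count_until *data = cb_data;

	data->seen++;
	return !strcmp(ce->name, data->stop_at);
}

/* Build a four-entry toy index and count entries visited up to stop_at. */
int toy_count_until(const char *stop_at)
{
	struct toy_entry e[4] = {
		{ "file1", NULL }, { "file2", NULL },
		{ "path/file1", NULL }, { "zzz", NULL }
	};
	struct count_until data = { stop_at, 0 };
	int i;

	for (i = 0; i < 3; i++)
		e[i].next = &e[i + 1];
	toy_for_each_entry(&e[0], count_until_cb, &data);
	return data.seen;
}
```

Here toy_count_until("file2") visits two entries and then stops, while
a name that is not present lets the loop run over all four.  The point
of the design is that the caller only sees entries and a callback, not
an array index.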

diff --git a/Documentation/technical/api-in-core-index.txt b/Documentation/technical/api-in-core-index.txt
index adbdbf5..5269bb1 100644
--- a/Documentation/technical/api-in-core-index.txt
+++ b/Documentation/technical/api-in-core-index.txt
@@ -1,14 +1,116 @@
 in-core index API
 =================

+Reading API
+-----------
+
+`read_index()`::
+	Read the whole index file from disk.
+
+`index_name_pos(name, namelen)`::
+	Find a cache_entry by name.  Returns pos if an entry matches
+	exactly, and -pos-1 otherwise, where pos is the position at
+	which the entry would be inserted.
+	e.g.
+	index:
+	file1
+	file2
+	path/file1
+	zzz
+
+	index_name_pos("path/file1", 10) returns 2, while
+	index_name_pos("path", 4) returns -3.
+
+`read_index_filtered(opts)`::
+	This method behaves differently for index-v2 and index-v5.
+
+	For index-v2 it simply reads the whole index as read_index()
+	does, so we are sure we don't have to reload anything if the
+	user wants a different filter.  It also sets the filter_opts
+	in the index_state, which is used to limit the results when
+	iterating over the index with for_each_index_entry().
+
+	The whole index is read because reading index-v2 partially
+	is no faster than reading it fully, and doing so avoids
+	having to re-read the index later.
+
+	For index-v5 it creates an adjusted_pathspec to filter the
+	reading.  First all the directory entries are read, and then
+	the cache_entries in the directories that match the adjusted
+	pathspec are read.  The filter_opts in the index_state are
+	set so that cache_entries matched by the adjusted pathspec
+	but not by the given pathspec are skipped; that filtering
+	happens when iterating over the cache with
+	for_each_index_entry().
+
+`get_index_entry_by_name(name, namelen, &ce)`::
+	Looks up the cache_entry matching name and stores it in the
+	&ce parameter.  If a cache entry is matched exactly, 1 is
+	returned, otherwise 0.  For an example see index_name_pos().
+	This function should be used instead of the index_name_pos()
+	function to retrieve cache entries.
+
+`for_each_index_entry(fn, cb_data)`::
+	Iterates over all cache_entries in the index filtered by
+	filter_opts in the index_state.  For each cache entry fn is
+	executed with cb_data as callback data.  From within the loop
+	do `return 0` to continue, or `return 1` to break the loop.
+
+`next_index_entry(ce)`::
+	Returns the cache_entry that follows ce.
+
+`index_change_filter_opts(opts)`::
+	This function again behaves slightly differently for
+	index-v2 and index-v5.
+
+	For index-v2 it simply changes the filter_opts, so that
+	for_each_index_entry() uses the changed filter_opts to
+	iterate over a different set of cache entries.
+
+	For index-v5 it refreshes the index if the filter_opts have
+	changed and sets the new filter_opts in the index state, again
+	to iterate over a different set of cache entries as with
+	index-v2.
+
+	This has some optimization potential: when the opts get
+	stricter (less of the index should be read), nothing needs
+	to be reloaded, but currently the index is reloaded anyway.
+
+Using the new index api
+-----------------------
+
+Until now, loops over a specific set of index entries were written as:
+  i = start_index;
+  while (i < active_nr) { ce = active_cache[i]; do(something); i++; }
+
+They should be rewritten as:
+  ce = start;
+  while (ce) { do(something); ce = next_cache_entry(ce); }
+
+which is the equivalent operation but hides the in-memory format of
+the index from the user.
+
+To get a cache entry, get_cache_entry_by_name() should be used
+instead of cache_name_pos().  For example:
+  int pos = cache_name_pos(name, namelen);
+  struct cache_entry *ce;
+  if (pos < 0) { do(something); }
+  else { ce = active_cache[pos]; do(somethingelse); }
+
+should be written as:
+  struct cache_entry *ce;
+  int ret = get_cache_entry_by_name(name, namelen, &ce);
+  if (!ret) { do(something); }
+  else { do(somethingelse); }
+
+TODO
+----
 Talk about <read-cache.c> and <cache-tree.c>, things like:

 * cache -> the_index macros
-* read_index()
 * write_index()
 * ie_match_stat() and ie_modified(); how they are different and when to
   use which.
-* index_name_pos()
 * remove_index_entry_at()
 * remove_file_from_index()
 * add_file_to_index()
@@ -18,4 +120,6 @@ Talk about <read-cache.c> and <cache-tree.c>, things like:
 * cache_tree_invalidate_path()
 * cache_tree_update()

+
+
 (JC, Linus)
--
1.8.3.453.g1dfc63d
