From: Thomas Gummerer <t.gummerer@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, trast@inf.ethz.ch, mhagger@alum.mit.edu,
pclouds@gmail.com, robin.rosenberg@dewire.com
Subject: Re: [PATCH 05/22] read-cache: add index reading api
Date: Mon, 08 Jul 2013 22:10:58 +0200
Message-ID: <874nc4rewd.fsf@gmail.com>
In-Reply-To: <7va9lx100l.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Thomas Gummerer <t.gummerer@gmail.com> writes:
>
>> Add an api for access to the index file. Currently there is only a very
>> basic api for accessing the index file, which only allows a full read of
>> the index, and lets the users of the data filter it. The new index api
>> gives the users the possibility to use only part of the index and
>> provides functions for iterating over and accessing cache entries.
>>
>> This simplifies future improvements to the in-memory format, as changes
>> will be concentrated on one file, instead of the whole git source code.
>>
>> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
>> ---
>> cache.h | 57 +++++++++++++++++++++++++++++-
>> read-cache-v2.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++--
>> read-cache.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
>> read-cache.h | 12 ++++++-
>> 4 files changed, 263 insertions(+), 10 deletions(-)
>>
>> diff --git a/cache.h b/cache.h
>> index 5082b34..d38dfbd 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -127,7 +127,8 @@ struct cache_entry {
>>  	unsigned int ce_flags;
>>  	unsigned int ce_namelen;
>>  	unsigned char sha1[20];
>> -	struct cache_entry *next;
>> +	struct cache_entry *next; /* used by name_hash */
>> +	struct cache_entry *next_ce; /* used to keep a list of cache entries */
>
> The reader often needs to rewind the read-pointer partially while
> walking the index (e.g. next_cache_entry() in unpack-trees.c and how
> the o->cache_bottom position is used throughout the subsystem). I
> am not sure if this singly-linked list is a good way to go.

I'm not very familiar with the unpack-trees code, but from a quick look
the pointer (or position in the cache) is only ever moved forward. One
problem I do see, though, is skipping a number of entries at once. An
example of that, from unpack-trees.c:

	int matches;
	matches = cache_tree_matches_traversal(o->src_index->cache_tree,
					       names, info);
	/*
	 * Everything under the name matches; skip the
	 * entire hierarchy.  diff_index_cached codepath
	 * special cases D/F conflicts in such a way that
	 * it does not do any look-ahead, so this is safe.
	 */
	if (matches) {
		o->cache_bottom += matches;
		return mask;
	}

This could probably be transformed into something like

	skip_cache_tree_matches(cache_tree, names, info);

I'll take some time to familiarize myself with the unpack-trees code to
see whether I can find a better solution than this, and whether there
are more pitfalls.
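
To make the cost of skipping concrete, here is a minimal sketch of what
such a helper might boil down to on a singly-linked list. The names
struct entry and skip_entries() are made up for illustration and are
not part of the patch or of git; only the next_ce link mirrors the
field added above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for git's struct cache_entry. */
struct entry {
	const char *name;
	struct entry *next_ce;	/* next entry in index order */
};

/*
 * Advance `n` entries from `ce`, stopping early at the end of the
 * list.  With an array this is just `cache_bottom += n`; with a
 * singly-linked list it takes up to `n` pointer hops, and there is
 * no way to step back without remembering an earlier pointer.
 */
struct entry *skip_entries(struct entry *ce, int n)
{
	assert(n >= 0);
	while (ce && n-- > 0)
		ce = ce->next_ce;
	return ce;
}
```

A caller would keep a "bottom" pointer instead of an integer position
and replace the addition with bottom = skip_entries(bottom, matches).
Rewinding, as mentioned in the review, would need either a saved
pointer or a different data structure.
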
>> +/*
>> + * Options by which the index should be filtered when read partially.
>> + *
>> + * pathspec: The pathspec which the index entries have to match
>> + * seen: Used to return the seen parameter from match_pathspec()
>> + * max_prefix, max_prefix_len: These variables are set to the longest
>> + * common prefix, the length of the longest common prefix of the
>> + * given pathspec
>
> These probably should use "struct pathspec" abstraction, not just the
> "array of raw strings", no?
Yes, thanks, that's probably a good idea.
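
For context, this is roughly how the max_prefix_len value described in
the comment could be derived from an array of raw strings. It is only a
sketch: common_dir_prefix_len() is a hypothetical name, and it assumes
the prefix is cut at a directory boundary. A version built on
"struct pathspec" would look different.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Length of the longest leading-directory prefix shared by all `n`
 * paths, including the trailing '/'.  Returns 0 if there is none.
 * Hypothetical helper, for illustration only.
 */
size_t common_dir_prefix_len(const char **paths, size_t n)
{
	size_t i, len, max = 0;

	assert(paths || !n);
	if (!n)
		return 0;

	/* Start with the directory part of the first path. */
	for (len = 0; paths[0][len]; len++)
		if (paths[0][len] == '/')
			max = len + 1;

	/* Shrink it until every other path shares it. */
	for (i = 1; i < n; i++) {
		size_t j, last = 0;

		for (j = 0; j < max && paths[i][j] == paths[0][j]; j++)
			if (paths[i][j] == '/')
				last = j + 1;
		max = last;
	}
	return max;
}
```

With the paths "a/b/c", "a/b/d" and "a/x", the shared prefix is "a/",
so index entries outside that directory would not need to be read.
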