From: Karsten Blees <karsten.blees@gmail.com>
To: Junio C Hamano <gitster@pobox.com>,
Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Cc: Tanay Abhra <tanayabh@gmail.com>,
git@vger.kernel.org, Ramkumar Ramachandra <artagnon@gmail.com>,
Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Subject: Re: [PATCH v3 2/3] config: add hashtable for config parsing & retrieval
Date: Wed, 25 Jun 2014 22:23:06 +0200
Message-ID: <53AB2FAA.0@gmail.com>
In-Reply-To: <xmqq61joamcc.fsf@gitster.dls.corp.google.com>

On 25.06.2014 20:13, Junio C Hamano wrote:
> Ramsay Jones <ramsay@ramsay1.demon.co.uk> writes:
>
>> On 24/06/14 00:25, Junio C Hamano wrote:
>> ...
>>> Yup, that is a very good point. There needs an infrastructure to
>>> tie a set of files (i.e. the standard one being the chain of
>>> system-global /etc/gitconfig to repo-specific .git/config, and any
>>> custom one that can be specified by the caller like submodule code)
>>> to a separate hashmap; a future built-in submodule code would use
>>> two hashmaps aka "config-caches", one to manage the usual
>>> "configuration" and the other to manage the contents of the
>>> .gitmodules file.
>>>
>>
>> I had expected to see one hash table per file/blob, with the three
>> standard config hash tables linked together to implement the scope/
>> priority rules. (Well, these could be merged into one, as the current
>> code does, since that makes handling "multi" keys slightly easier).
>
> Again, good point. I think a rough outline of a design that takes
> both
>
> (1) we may have to read two or more separate sets of "config like
> things" (e.g. the contents from the normal config system and
> the contents from the .gitmodules file) and
>
> (2) we may have to read two or more files that make up a logically
> single set of "config-like things" (e.g. the "normal config
> system" reads from three separate files)
>
> into account may look like this:
>
> * Each instance of in-core "config-like things" is expressed as a
> struct "config-set".
>
> * A "config-set" points at an ordered set of struct "config-file",
> each of which represents what was read and cached in-core from a
> file.

Is this additional complexity really necessary?

How would you handle included files? Would you split the including file into before / after parts? I.e.

  repo-config-file[include-to-end]
  included-file
  repo-config-file[top-to-include]
  user-config-file
  ...

Looking up a single-valued key would then be O(n) (where n is the number of struct config_file's in the config_set) rather than O(1).

Looking up a multi-valued key would involve joining values from all files every time the value is looked up (dynamically allocating lists on the heap etc.).

The configuration is typically loaded once, followed by lots of lookups. So from a performance perspective, doing the merging at load time is clearly better.
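
To make the merge-at-load-time argument concrete, here is a minimal sketch in C. It is not the code from the patch series: every name in it (config_set_add, config_set_get, config_set_get_multi, NBUCKETS) is hypothetical, and the fixed bucket count and "last value loaded wins" rule are assumptions made for illustration. All files feed one hash table while they are parsed, so a single-valued lookup is one hash probe and a multi-valued lookup returns the already-merged list without allocating.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64 /* hypothetical fixed size; a real table would grow */

struct config_entry {
	struct config_entry *next; /* hash chain */
	char *key;
	char **values;             /* in load order, lowest priority first */
	size_t nr, alloc;
};

struct config_set {
	struct config_entry *buckets[NBUCKETS];
};

static void *check_alloc(void *p)
{
	if (!p)
		abort();
	return p;
}

static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	return memcpy(check_alloc(malloc(len)), s, len);
}

static unsigned int hash_key(const char *key)
{
	unsigned int h = 5381;
	while (*key)
		h = h * 33 + (unsigned char)*key++;
	return h % NBUCKETS;
}

static struct config_entry *find_entry(const struct config_set *cs,
				       const char *key)
{
	struct config_entry *e;
	for (e = cs->buckets[hash_key(key)]; e; e = e->next)
		if (!strcmp(e->key, key))
			return e;
	return NULL;
}

/* Called for each "key = value" pair while parsing, whatever file it is in. */
void config_set_add(struct config_set *cs, const char *key, const char *value)
{
	struct config_entry *e;

	assert(cs && key && value);
	e = find_entry(cs, key);
	if (!e) {
		unsigned int h = hash_key(key);
		e = check_alloc(calloc(1, sizeof(*e)));
		e->key = copy_string(key);
		e->next = cs->buckets[h];
		cs->buckets[h] = e;
	}
	if (e->nr == e->alloc) {
		e->alloc = e->alloc ? 2 * e->alloc : 4;
		e->values = check_alloc(realloc(e->values,
						e->alloc * sizeof(*e->values)));
	}
	e->values[e->nr++] = copy_string(value);
}

/* Single-valued lookup: the value loaded last wins. */
const char *config_set_get(const struct config_set *cs, const char *key)
{
	const struct config_entry *e = find_entry(cs, key);
	return e ? e->values[e->nr - 1] : NULL;
}

/* Multi-valued lookup: returns the merged entry, nothing is allocated. */
const struct config_entry *config_set_get_multi(const struct config_set *cs,
						const char *key)
{
	return find_entry(cs, key);
}
```

With this layout an included file needs no special treatment: its values are simply added at the point where the parser reaches the include, so the before / after ordering falls out of the parse order.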
>
> * When we know or notice that a single file on the filesystem was
> modified, we do not have to invalidate the whole "config-set"
> that depends on the file; the "config-file" that corresponds to
> the file on the filesystem is invalidated instead.
>
What's the use case for this? Do you expect e.g. 'git gc' to detect changed depth/window size at run time and adjust the algorithm accordingly? Or do you just intend to cache parsed config data (the latter could be done by recording all involved file names and stats in the config-set and reloading the whole thing if any of the files change)?
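
The "record all involved file names and stats" alternative could look roughly like the sketch below. Again, this is only an illustration under stated assumptions: the names (config_sources_record, config_sources_changed, MAX_SOURCES) are hypothetical, it relies on POSIX stat(), and comparing mtime, size, inode and device is just one plausible choice of what "changed" means.

```c
#include <assert.h>
#include <sys/stat.h>

#define MAX_SOURCES 16 /* hypothetical limit, enough for a sketch */

struct config_source {
	const char *path; /* not copied; must outlive the config-set */
	int exists;       /* did the file exist when it was loaded? */
	struct stat st;   /* snapshot taken at load time (if it existed) */
};

struct config_sources {
	struct config_source items[MAX_SOURCES];
	int nr;
};

/* Call once for every file that is read, including included files. */
int config_sources_record(struct config_sources *cs, const char *path)
{
	struct config_source *src;

	assert(cs && path);
	if (cs->nr == MAX_SOURCES)
		return -1;
	src = &cs->items[cs->nr++];
	src->path = path;
	src->exists = !stat(path, &src->st);
	return 0;
}

/* Returns 1 if any recorded file appeared, vanished or looks modified. */
int config_sources_changed(const struct config_sources *cs)
{
	int i;

	for (i = 0; i < cs->nr; i++) {
		const struct config_source *src = &cs->items[i];
		struct stat now;
		int exists = !stat(src->path, &now);

		if (exists != src->exists)
			return 1;
		if (exists &&
		    (now.st_mtime != src->st.st_mtime ||
		     now.st_size != src->st.st_size ||
		     now.st_ino != src->st.st_ino ||
		     now.st_dev != src->st.st_dev))
			return 1;
	}
	return 0;
}
```

A caller would check config_sources_changed() before a lookup and, if it returns 1, throw away the whole config-set and load everything again. No per-file invalidation is needed.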