Re: [PATCH v4 1/2] add `config_set` API for caching config files

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Tanay Abhra <tanayabh@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Ramkumar Ramachandra <artagnon@gmail.com>,
	Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>,
	Karsten Blees <karsten.blees@gmail.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Subject: Re: [PATCH v4 1/2] add `config_set` API for caching config files
Date: Thu, 03 Jul 2014 22:35:13 +0530	[thread overview]
Message-ID: <53B58D49.8010409@gmail.com> (raw)
In-Reply-To: <xmqqd2dnitm7.fsf@gitster.dls.corp.google.com>



On 7/2/2014 10:30 PM, Junio C Hamano wrote:
> Tanay Abhra <tanayabh@gmail.com> writes:
> 
>> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
>> index 230b3a0..2c02fee 100644
>> --- a/Documentation/technical/api-config.txt
>> +++ b/Documentation/technical/api-config.txt
>> @@ -77,6 +77,75 @@ To read a specific file in git-config format, use
>> +`git_config_get_value` returns 0 on success, or -1 for no value found.
>> +
>> +`git_config_get_value_multi` allocates and returns a `string_list`
>> +containing all the values for the key passed as parameter, sorted in order
>> +of increasing priority (Note: caller has to call `string_list_clear` on
>> +the returned pointer and then free it).
>> +
>> +`git_config_get_value_multi` returns NULL for no value found.
>> +
>> +`git_config_clear` is provided for resetting and invalidating the cache.
>> +
>> +`git_config_iter` gives a way to iterate over the keys in cache. Existing
>> +`git_config` callback function signature is used for iterating.
> 
> Instead of doing prose above and then enumeration below,
> consistently using the enumeration-style would make the descriptions
> of API functions easier to find and read.  Also show what parameters
> each function takes.  E.g.
> 

Noted.

>                 
> A few random thoughts.
> 
>  - Are "a value for the variable is found" and "no value for the
>    variable is found" the only possible outcome?  Should somebody
>    (not necessarily the calling code) be notified of an error---for
>    example, if you opened a config file successfully but later found
>    that the file could not be parsed due to a syntax error or an I/O
>    error, is it possible the caller of this function to tell, or are
>    these just some special cases of "variable not found"?

The syntactical errors when parsing the file are shown when it fails to
construct the hashmap. Whenever a caller calls for a value for the first
time, the file is read and parsed and if it fails during parsing it dies
raising a error specifying the lineno and filename.

Variable not found means that the variable is not present in the file,
nothing more. Note: the hashmap code uses git_config() parsing stack
so whatever error it raises in normal git_config() invocation, it
would be raised here too.

>  - The same goes for the "multi" variant but it is a bit more grave,
>    as a function that returns an "int" can later be updated to
>    return different negative values to signal different kinds of
>    errors, but a function that returns a pointer to string-list can
>    only return one kind of NULL ;-)
>

Null pointer just means no variable found in the files. What other kind
of errors may arise?

>  - For the purpose of "git_config_get_value()", what is the value
>    returned for core.filemode when the configuration file says
>    "[core] filemode\n", i.e. when git_config() callback would get a
>    NULL for value to signal a boolean true?

NULL in value pointer with 0 return value denoting variable found.

>  - Is there a reason why you need a separate and new "config_iter"?
>    What does it do differently from the good-old git_config() and
>    how?  It does not belong to "Querying for Specific Variables"
>    anyway.
>

You mentioned previously in the list for a analogue to git_config()
which supports iterating over the keys.
The link is [1] I think, gmane is not working for me currently.

http://thread.gmane.org/gmane.comp.version-control.git/252567

>> +The config API also provides type specific API functions which do conversion
>> +as well as retrieval for the queried variable, including:
>> +
>> +`git_config_get_int`::
>> +Parse the retrieved value to an integer, including unit factors. Dies on
>> +error; otherwise, allocates and copies the integer into the dest parameter.
>> +Returns 0 on success, or 1 if no value was found.
> 
> "allocates and copies"????  Is a caller forced to do something like
> this?
> 
> 	int *val;
> 
> 	if (!git_config_get_int("int.one", &val)) {
>         	use(*val);
>                 free(val);
> 	}
> 
> I'd expect it to be more like:
> 
> 	int val;
>         if (!git_config_get_int("int.one", &val))
>         	use(val);
>

Yup, you are right, my fault.

> Also, is there a reason why the "not found" signal is "1" (as
> opposed to "-1" for the lower-level get_value() API)?
> 

Many of the type specific functions return -1 for wrongful parsing
like git_config_get_string and git_config_get_maybe_bool, that is
why I changed the return value for such functions.

>> +Custom Configsets
>> +-----------------
>> +
>> +A `config_set` points at an ordered set of config files, each of
>> +which represents what was read and cached in-core from a file.
> 
> This over-specifies the implementation.  Judging from the list of
> API functions below, an implementation is possible without having to
> keep track of what source files were fed in what order (i.e. it can
> just keep a single hash-table of read values and incrementally parse
> the new contents given via configset-add-file into it, without even
> recording the origins, and forget about the source files once it
> finishes parsing).
> 
> As was pointed out during the previous review round, a stack of
> <hash, filename> tuples cannot represent the configuration when
> config-includes are involved.  Also we would eventually want to be
> able to also read from non-file sources (e.g. from a blob, or a
> caller-supplied in-core strings).
> 
> For the purpose of reporting errors at the point of value being
> used, I have a suspicion that you would need to associate the
> <file,line> pair with each individual value string you will keep
> after configset_add_file() parses the file.  The implementation of
> the call-chain may look like:
> 
> 
> 	git_configset_add_file(cs, filename):
>         	file = open filename
>                 lineno = 0
> 		while read line from file:
> 	                git_configset_parse_line(cs, filename, lineno++, line)
> 		close file
> 
> 	git_configset_parse_line(cs, filename, lineno, line):
>         	... this needs to implement a state machine
>                 ... that keeps track of what has been read halfway
>                 ... e.g. we may have seen "[core] crlf =\\\n"
>         	... earlier, which is not a complete input yet,
>                 ... and now we may be looking at "auto" that lets
>                 ... us finally parse one item.
> 
> 	        if state in 'cs' and new 'line' gives us a complete input:
> 			determine key and value
>                         record <value, filename, lineno> for 'key' to cs.hash
>                 update state in 'cs'
> 
> For processing "git -c section.variable=value", we probably would
> call git_configset_parse_line() on the_configset with filename
> set to "command line" or something.
> 
> And then when the caller tries to actually use the value, e.g.
> 
>          git_configset_get_int(cs, key):
> 		look up <value, filename, lineno> for 'key' from the cs.hash
> 		if value is successfully parsed as int:
>                 	return the parsed result
> 		else:
>                 	error("not an int: '%s' (%s:%s)", value, filename, lineno)
> 

Okay, lets see what I can do with it.

> 
>> +It can be used to construct an in-memory cache for config files that
>> +the caller specifies. For example,
> 
> This is almost good to help a reader decide if she might want to use
> it in her code, but we probably want to stress the fact that use of
> a config_set is done for a namespace separate from the main
> configuration system, e.g. ".gitmodules".
> 
>> +---------------------------------------
>> +struct config_set gm_config = CONFIG_SET_INIT;
>> +int b;
>> +/* we add config files to the config_set */
>> +git_configset_add_file(&gm_config, ".gitmodules");
>> +git_configset_add_file(&gm_config, ".gitmodules_alt");
>> +
>> +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) {
>> +/* hack hack hack */
>> +}
>> +
>> +/* when we are done with the configset */
>> +git_configset_clear(&gm_config);
>> +----------------------------------------
>> +
>> +Configset API provides functions for the above mentioned work flow, including:
>> +
>> +`git_configset_init`::
>> +
>> +Returns an allocated and initialized struct `config_set` pointer.
> 
> "allocated"???  Is a caller forced to do this, i.e. always have the
> config-set on heap?
> 
> 	struct config_set *config = git_configset_init();
>         ... use it ...
>         git_configset_clear(config);
> 
> I'd expect it be more like this:
> 
> 	struct config_set config;
> 
> 	git_configset_init(&config);
>         ... use it...
>         git_configset_clear(&config);
> 

Yup, I prefer the second one.
Thanks for the review,
Tanay Abhra.

next prev parent reply	other threads:[~2014-07-03 17:05 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-02  6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra
2014-07-02  6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra
2014-07-02  9:14   ` Matthieu Moy
2014-07-02 11:58     ` Tanay Abhra
2014-07-02 14:32       ` Matthieu Moy
2014-07-02 17:09     ` Junio C Hamano
2014-07-04  4:58     ` Tanay Abhra
2014-07-04  9:17       ` Matthieu Moy
2014-07-04  9:25         ` Tanay Abhra
2014-07-04  9:43           ` Matthieu Moy
2014-07-07 17:02           ` Junio C Hamano
2014-07-02 17:00   ` Junio C Hamano
2014-07-02 17:09     ` Matthieu Moy
2014-07-03 17:05     ` Tanay Abhra [this message]
2014-07-02  6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra
2014-07-02  9:29   ` Matthieu Moy
2014-07-02 12:04     ` Tanay Abhra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53B58D49.8010409@gmail.com \
    --to=tanayabh@gmail.com \
    --cc=Matthieu.Moy@grenoble-inp.fr \
    --cc=artagnon@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=karsten.blees@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.