* [PATCH v4 0/3] git config cache & special querying api utilizing the cache @ 2014-07-02 6:01 Tanay Abhra 2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra 2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra 0 siblings, 2 replies; 17+ messages in thread From: Tanay Abhra @ 2014-07-02 6:01 UTC (permalink / raw) To: git Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Karsten Blees, Junio C Hamano, Eric Sunshine Hi, [PATCH v4]: Introduced `config_set` construct which points to a ordered set of config-files cached as hashmaps. Added relevant API functions. For more details see the documentation. Rewrote the git_config_get* family to use `config_set` internally. Added tests for both config_set API and git_config_get family. Added type specific API functions which parses the found value and converts it into a specific type. Most of the changes implemented are the result of discussion in [6]. Thanks to Eric, Ramsay, Junio, Matthieu & Karsten for their suggestions and review. [PATCH v3]: Added flag for NULL values that were causing segfaults in some cases. Added test-config for usage examples. Minor changes and corrections. Refer to discussion thread[5] for more details. Thanks to Matthieu, Jeff and Junio for their valuable suggestions. [PATCH v2]:Changed the string_list to a struct instead of pointer to a struct. Added string-list initilization functions. Minor mistakes corrected acoording to review comments[4]. Thanks to Eric and Matthieu for their review. [PATCH V1]:Most of the invaluable suggestions by Eric Sunshine, Torsten Bogershausen and Jeff King has been implemented[1]. Complete rewrite of config_cache*() family using git_config() as hook as suggested by Jeff. Thanks for the review. [RFC V2]: Improved according to the suggestions by Eric Sunshine and Torsten Bogershausen. Added cache invalidation when config file is changed.[2] I am using git_config_set_multivar_in_file() as an update hook. This is patch is for this year's GSoC. My project is "Git Config API improvements". The link of my proposal is appended below [3]. The aim of this patch series is to generate a cache for querying values from the config files in a non-callback manner as the current method reads and parses the config files every time a value is queried for. The cache is generated from hooking the update_cache function to the current parsing and callback mechanism in config.c. It is implemented as an hashmap using the hashmap-api with variables and its corresponding values list as its members. The values in the list are sorted in order of increasing priority. The cache is initialised the first time when any of the new query functions is called. It is invalidated by using git_config_set_multivar_in_file() as an update hook. We add two new functions to the config-api, git_config_get_string() and git_config_get_string_multi() for querying in a non callback manner from the cache. [1] http://marc.info/?t=140172066200006&r=1&w=2 [2] http://git.661346.n2.nabble.com/RFC-PATCH-0-2-Git-config-cache-amp-special-querying-api-utilizing-the-cache-td7611691.html [3] https://drive.google.com/file/d/0B4suZ-aHqDcnSUZJRXVTTnZUN1E/edit?usp=sharing [4] http://thread.gmane.org/gmane.comp.version-control.git/251073/focus=251369 [5] http://thread.gmane.org/gmane.comp.version-control.git/251704/ [6] http://thread.gmane.org/gmane.comp.version-control.git/252329/ Tanay Abhra (2): config-hash.c test-config .gitignore | 1 + Documentation/technical/api-config.txt | 144 ++++++++++++ Makefile | 2 + cache.h | 41 ++++ config-hash.c | 405 +++++++++++++++++++++++++++++++++ config.c | 3 + t/t1308-config-hash.sh | 163 +++++++++++++ test-config.c | 129 +++++++++++ 8 files changed, 888 insertions(+) create mode 100644 config-hash.c create mode 100755 t/t1308-config-hash.sh create mode 100644 test-config.c -- 1.9.0.GIT ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra @ 2014-07-02 6:01 ` Tanay Abhra 2014-07-02 9:14 ` Matthieu Moy 2014-07-02 17:00 ` Junio C Hamano 2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra 1 sibling, 2 replies; 17+ messages in thread From: Tanay Abhra @ 2014-07-02 6:01 UTC (permalink / raw) To: git Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Karsten Blees, Junio C Hamano, Eric Sunshine Currently `git_config()` uses a callback mechanism and file rereads for config values. Due to this approach, it is not uncommon for the config files to be parsed several times during the run of a git program, with different callbacks picking out different variables useful to themselves. Add a `config_set` construct that points to an ordered set of config files specified by the user, each of which represents what was read and cached in core as hashmaps. Add two external functions `git_configset_get_value` and `git_configset_get_value_multi` for querying from the config sets, which follow `last one wins` semantic(i.e. if there are multiple matches for the queried key in the files of the configset the value returned will the last value in the last matching file of the configset) Add type specific query functions like `git_configset_get_bool` and similar. Add a default `config_set`, `the_config_set` to cache all key-value pairs read from usual config files(repo specific .git/config, user wide ~/.gitconfig and the global /etc/gitconfig). `the_config_set` uses a single hashmap populated using git_config(). The reason for doing so is twofold, one is to honour include directives, another is to guarantee O(1) lookups for usual config values, as values are queried for hundred of times during a run of a git program. Add two external functions `git_config_get_value` and `git_config_get_value_multi` for querying in a non-callback manner from `the_config_set`. Also, add type specific query functions that are implemented as a thin wrapper around the `config_set` API. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> --- Documentation/technical/api-config.txt | 144 ++++++++++++ Makefile | 1 + cache.h | 41 ++++ config-hash.c | 405 +++++++++++++++++++++++++++++++++ config.c | 3 + 5 files changed, 594 insertions(+) create mode 100644 config-hash.c diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt index 230b3a0..2c02fee 100644 --- a/Documentation/technical/api-config.txt +++ b/Documentation/technical/api-config.txt @@ -77,6 +77,75 @@ To read a specific file in git-config format, use `git_config_from_file`. This takes the same callback and data parameters as `git_config`. +Querying For Specific Variables +------------------------------- + +For programs wanting to query for specific variables in a non-callback +manner, the config API provides two functions `git_config_get_value` +and `git_config_get_value_multi`.They both read values from an internal +cache generated previously from reading the config files. + +`git_config_get_value` takes two parameters, + +- a key string in canonical flat form for which the corresponding value + with the highest priority (i.e. value in the repo config will be + preferred over value in user wide config for the same variable) will + be retrieved. + +- a pointer to a string which will point to the retrieved value. The caller + should not free or modify the value returned as it is owned by the cache. + +`git_config_get_value` returns 0 on success, or -1 for no value found. + +`git_config_get_value_multi` allocates and returns a `string_list` +containing all the values for the key passed as parameter, sorted in order +of increasing priority (Note: caller has to call `string_list_clear` on +the returned pointer and then free it). + +`git_config_get_value_multi` returns NULL for no value found. + +`git_config_clear` is provided for resetting and invalidating the cache. + +`git_config_iter` gives a way to iterate over the keys in cache. Existing +`git_config` callback function signature is used for iterating. + +The config API also provides type specific API functions which do conversion +as well as retrieval for the queried variable, including: + +`git_config_get_int`:: +Parse the retrieved value to an integer, including unit factors. Dies on +error; otherwise, allocates and copies the integer into the dest parameter. +Returns 0 on success, or 1 if no value was found. + +`git_config_get_ulong`:: +Identical to `git_config_get_int` but for unsigned longs. + +`git_config_bool`:: +Parse the retrieved value into a boolean value, respecting keywords like +"true" and "false". Integer values are converted into true/false values +(when they are non-zero or zero, respectively). Other values cause a die(). +If parsing is successful, allocates and copies the result into the dest +parameter. Returns 0 on success, or 1 if no value is found. + +`git_config_get_bool_or_int`:: +Same as `git_config_get_bool`, except that integers are copied as-is, and +an `is_bool` flag is unset. + +`git_config_get_maybe_bool`:: +Same as `git_config_get_bool`, except that it returns -1 on error rather +than dying. + +`git_config_get_string`:: +Allocates and copies the retrieved value string into the `dest` parameter; +if NULL string is given, prints an error message and returns -1. Returns 1, +if no value is found for the queried variable . + +`git_config_get_pathname`:: +Similar to `git_config_get_string`, but expands `~` or `~user` into the +user's home directory when found at the beginning of the path. + +See test-config.c for usage examples. + Value Parsing Helpers --------------------- @@ -134,6 +203,81 @@ int read_file_with_include(const char *file, config_fn_t fn, void *data) `git_config` respects includes automatically. The lower-level `git_config_from_file` does not. +Custom Configsets +----------------- + +A `config_set` points at an ordered set of config files, each of +which represents what was read and cached in-core from a file. +It can be used to construct an in-memory cache for config files that +the caller specifies. For example, + +--------------------------------------- +struct config_set gm_config = CONFIG_SET_INIT; +int b; +/* we add config files to the config_set */ +git_configset_add_file(&gm_config, ".gitmodules"); +git_configset_add_file(&gm_config, ".gitmodules_alt"); + +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { +/* hack hack hack */ +} + +/* when we are done with the configset */ +git_configset_clear(&gm_config); +---------------------------------------- + +Configset API provides functions for the above mentioned work flow, including: + +`git_configset_init`:: + +Returns an allocated and initialized struct `config_set` pointer. + +`git_configset_add_file`:: + +Parses the file and constructs an in core cache which is appended at the end of +the config_set, dies if there is an error in parsing the file. + +`git_configset_get_value`:: + +Follows "last one wins" semantic, i.e., if there are multiple matches for the +queried key in the files of the configset the value returned will the last value +in the last matching file of the configset. + +It takes three parameters, + +- the `config_set` from which the value will be queried. + +- a key string in canonical flat form + +- a pointer to a string which will point to the retrieved value. The caller + should not free or modify the value returned as it is owned by the cache. + +`git_configset_get_value` returns 0 on success, or -1 for no value found. + +`git_configset_get_value_multi`:: + +`git_configset_get_value_multi` allocates and returns a `string_list` +containing all the values for the key passed as parameter, sorted in order +of increasing priority (Note: caller has to call `string_list_clear` on +the returned pointer and then free it). + +`git_configset_get_value_multi` returns NULL for no value found. + +'git_configset_iter':: + +It gives a way to iterate over the keys in config_set. +Existing `git_config` callback function signature is used for iterating. + +`git_configset_clear`:: +It is used to clear the config_set after use. If the `config_set` was dynamically +allocated using `git_configset_init`, the caller has to free the `config_set` +after calling `git_configset_clear`. + +In addition to above the config_set API provides type specific functions in the +vein of `git_config_get_int` but with an extra parameter, `config_set`. They +all behave similarly to the `git_config_get*()` family described in +"Querying For Specific Variables" above. + Writing Config Files -------------------- diff --git a/Makefile b/Makefile index 07ea105..d503f78 100644 --- a/Makefile +++ b/Makefile @@ -777,6 +777,7 @@ LIB_OBJS += commit.o LIB_OBJS += compat/obstack.o LIB_OBJS += compat/terminal.o LIB_OBJS += config.o +LIB_OBJS += config-hash.o LIB_OBJS += connect.o LIB_OBJS += connected.o LIB_OBJS += convert.o diff --git a/cache.h b/cache.h index df65231..005e945 100644 --- a/cache.h +++ b/cache.h @@ -7,6 +7,7 @@ #include "advice.h" #include "gettext.h" #include "convert.h" +#include "string-list.h" #include SHA1_HEADER #ifndef git_SHA_CTX @@ -1325,6 +1326,46 @@ extern int parse_config_key(const char *var, const char **subsection, int *subsection_len, const char **key); +/* config-hash.c */ +#define CONFIG_SET_INIT { NULL, 0, 0 } + +struct config_set { + struct config_set_item *item; + unsigned int nr, alloc; +}; + +struct config_set_item { + struct hashmap config_hash; + char *filename; +}; + +extern struct config_set *git_configset_init(void); +extern int git_configset_add_file(struct config_set *cs, const char *filename); +extern int git_configset_get_value(struct config_set *cs, const char *key, const char **value); +extern const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key); +extern void git_configset_clear(struct config_set *cs); +extern void git_configset_iter(struct config_set *cs, config_fn_t fn, void *data); +extern int git_configset_get_string(struct config_set *cs, const char *key, const char **dest); +extern int git_configset_get_int(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned long *dest); +extern int git_configset_get_bool(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_bool_or_int(struct config_set *cs, const char *key, int *is_bool, int *dest); +extern int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_pathname(struct config_set *cs, const char *key, const char **dest); + +extern int git_config_get_value(const char *key, const char **value); +extern const struct string_list *git_config_get_value_multi(const char *key); +extern void git_config_clear(void); +extern void git_config_iter(config_fn_t fn, void *data); +extern int git_config_get_string(const char *key, const char **dest); +extern int git_config_get_int(const char *key, int *dest); +extern int git_config_get_ulong(const char *key, unsigned long *dest); +extern int git_config_get_bool(const char *key, int *dest); +extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest); +extern int git_config_get_maybe_bool(const char *key, int *dest); +extern int git_config_get_pathname(const char *key, const char **dest); + + extern int committer_ident_sufficiently_given(void); extern int author_ident_sufficiently_given(void); diff --git a/config-hash.c b/config-hash.c new file mode 100644 index 0000000..3e45d03 --- /dev/null +++ b/config-hash.c @@ -0,0 +1,405 @@ +#include "cache.h" +#include "hashmap.h" +#include "string-list.h" + + +/* + * Default config_set that contains key-value pairs from the usual set of config + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the + * global /etc/gitconfig) + */ +static struct config_set the_config_set; +static int the_config_set_initialised = 0; + +struct config_hash_entry { + struct hashmap_entry ent; + char *key; + struct string_list value_list; +}; + +static int config_hash_add_value(const char *, const char *, + struct config_set_item *, struct config_set *); + +static int config_hash_entry_cmp(const struct config_hash_entry *e1, + const struct config_hash_entry *e2, const void *unused) +{ + return strcmp(e1->key, e2->key); +} + +static void config_hash_init(struct hashmap *config_hash) +{ + hashmap_init(config_hash, (hashmap_cmp_fn)config_hash_entry_cmp, 0); +} + +struct config_set_cb { + struct config_set *cs; + struct config_set_item *ct; +}; + +static int config_hash_callback(const char *key, const char *value, void *cb) +{ + struct config_set_cb *c = cb; + config_hash_add_value(key, value, c->ct, c->cs); + return 0; +} + +static struct hashmap *add_config_hash(const char *filename, struct config_set *cs) +{ + int ret = 0; + struct config_set_item *ct; + struct config_set_cb cb; + ALLOC_GROW(cs->item, cs->nr + 1, cs->alloc); + ct = &cs->item[cs->nr++]; + ct->filename = xstrdup(filename); + config_hash_init(&ct->config_hash); + cb.cs = cs; + cb.ct = ct; + /* + * When filename is "default", hashmap is constructed from the usual set of + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the + * global /etc/gitconfig), used in `the_config_set` + */ + if (!strcmp(filename, "default")) + git_config(config_hash_callback, &cb); + else + ret = git_config_from_file(config_hash_callback, filename, &cb); + if (!ret) + return &ct->config_hash; + else { + free(ct->filename); + hashmap_free(&ct->config_hash, 1); + cs->nr--; + return NULL; + } +} + +static struct hashmap *get_config_hash(const char *filename, struct config_set *cs) +{ + int i; + for(i = 0; i < cs->nr; i++) { + if (!strcmp(cs->item[i].filename, filename)) + return &cs->item[i].config_hash; + } + return add_config_hash(filename, cs); +} + +static struct config_hash_entry *config_hash_find_entry(const char *key, const char* filename, + struct config_set *cs) +{ + struct hashmap *config_hash; + struct config_hash_entry k; + struct config_hash_entry *found_entry; + char *normalized_key; + int ret; + config_hash = get_config_hash(filename, cs); + ret = git_config_parse_key(key, &normalized_key, NULL); + + if (ret) + return NULL; + + hashmap_entry_init(&k, strhash(normalized_key)); + k.key = normalized_key; + found_entry = hashmap_get(config_hash, &k, NULL); + free(normalized_key); + return found_entry; +} + +static struct string_list *config_hash_get_value(const char *key, const char* filename, + struct config_set *cs) +{ + struct config_hash_entry *e = config_hash_find_entry(key, filename, cs); + return e ? &e->value_list : NULL; +} + +static int config_hash_add_value(const char *key, const char *value, + struct config_set_item *ct, struct config_set *cs) +{ + struct hashmap *config_hash = &ct->config_hash; + struct config_hash_entry *e; + e = config_hash_find_entry(key, ct->filename, cs); + + if (!e) { + e = xmalloc(sizeof(*e)); + hashmap_entry_init(e, strhash(key)); + e->key = xstrdup(key); + memset(&e->value_list, 0, sizeof(e->value_list)); + e->value_list.strdup_strings = 1; + hashmap_add(config_hash, e); + } + /* + * Since the values are being fed by git_config*() callback mechanism, they + * are already normalised. So simply add them without any further munging. + */ + string_list_append_nodup(&e->value_list, value ? xstrdup(value) : NULL); + + return 0; +} + +struct config_set *git_configset_init(void) +{ + struct config_set *cs = xcalloc(1, sizeof(*cs)); + return cs; +} + +int git_configset_add_file(struct config_set *cs, const char *filename) +{ + return get_config_hash(filename, cs) ? 0 : -1; +} + +int git_configset_get_value(struct config_set *cs, const char *key, const char **value) +{ + int i; + struct string_list *values = NULL; + /* + * Follows "last one wins" semantic, i.e., if there are multiple matches for the + * queried key in the files of the configset the value returned will the last value + * in the last matching file of the configset + */ + assert(cs->nr > 0); + for (i = cs->nr-1; i >= 0; i--) { + values = config_hash_get_value(key, cs->item[i].filename, cs); + if (values) + break; + } + if (!values) + return -1; + assert(values->nr > 0); + *value = values->items[values->nr - 1].string; + return 0; +} + +const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key) +{ + int i, j; + struct string_list *values = NULL; + struct string_list *retval = xcalloc(1, sizeof(*retval)); + for (i = 0; i < cs->nr; i++) { + values = config_hash_get_value(key, cs->item[i].filename, cs); + if (values) + for(j = 0; j < values->nr; j++) + string_list_append_nodup(retval, values->items[j].string); + } + if (!retval->nr) { + free(retval); + return NULL; + } + return retval; +} + +/* + * Uses the existing git_config() callback function signature to iterate over + * keys in the configset. + */ +void git_configset_iter(struct config_set *cs, config_fn_t fn, void *data) +{ + int i; + char *value; + struct string_list *strptr; + struct config_hash_entry *entry; + struct hashmap *config_hash; + struct hashmap_iter iter; + for (i = 0; i < cs->nr; i++) { + config_hash = &cs->item[i].config_hash; + hashmap_iter_init(config_hash, &iter); + while ((entry = hashmap_iter_next(&iter))) { + strptr = &entry->value_list; + for (i = 0; i < strptr->nr; i++) { + value = strptr->items[i].string; + fn(entry->key, value, data); + } + } + } +} + +void git_configset_clear(struct config_set *cs) +{ + int i; + struct hashmap *config_hash; + struct config_hash_entry *entry; + struct hashmap_iter iter; + /* No files in config_set so return */ + if (!cs->nr) + return; + for (i = 0; i < cs->nr; i++) { + config_hash = &cs->item[i].config_hash; + free(cs->item[i].filename); + hashmap_iter_init(config_hash, &iter); + while ((entry = hashmap_iter_next(&iter))) { + free(entry->key); + string_list_clear(&entry->value_list, 0); + } + hashmap_free(config_hash, 1); + } + free(cs->item); + cs->item = NULL; + cs->nr = cs->alloc = 0; +} + +int git_configset_get_string(struct config_set *cs, const char *key, const char **dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) + return git_config_string(dest, key, value); + else + return 1; +} + +int git_configset_get_int(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_int(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned long *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_ulong(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_bool(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_bool(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_bool_or_int(struct config_set *cs, const char *key, + int *is_bool, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_bool_or_int(key, value, is_bool); + return 0; + } else + return 1; +} + +int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_maybe_bool(key, value); + if (*dest == -1) + return -1; + return 0; + } else + return 1; +} + +int git_configset_get_pathname(struct config_set *cs, const char *key, const char **dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) + return git_config_pathname(dest, key, value); + else + return 1; +} + +int git_config_get_value(const char *key, const char **value) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_value(&the_config_set, key, value); +} + +const struct string_list *git_config_get_value_multi(const char *key) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_value_multi(&the_config_set, key); +} + +void git_config_clear(void) +{ + if (!the_config_set_initialised) + return; + git_configset_clear(&the_config_set); + the_config_set_initialised = 0; +} + +void git_config_iter(config_fn_t fn, void *data) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + git_configset_iter(&the_config_set, fn, data); +} + +int git_config_get_string(const char *key, const char **dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_string(&the_config_set, key, dest); +} + +int git_config_get_int(const char *key, int *dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_int(&the_config_set, key, dest); +} + +int git_config_get_ulong(const char *key, unsigned long *dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_ulong(&the_config_set, key, dest); +} + +int git_config_get_bool(const char *key, int *dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_bool(&the_config_set, key, dest); +} + +int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_bool_or_int(&the_config_set, key, is_bool, dest); +} + +int git_config_get_maybe_bool(const char *key, int *dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_maybe_bool(&the_config_set, key, dest); +} + +int git_config_get_pathname(const char *key, const char **dest) +{ + if (!the_config_set_initialised) { + git_configset_add_file(&the_config_set, "default"); + the_config_set_initialised = 1; + } + return git_configset_get_pathname(&the_config_set, key, dest); +} diff --git a/config.c b/config.c index a1aef1c..6cffec7 100644 --- a/config.c +++ b/config.c @@ -1708,6 +1708,9 @@ int git_config_set_multivar_in_file(const char *config_filename, lock = NULL; ret = 0; + /* Invalidate the config cache */ + git_config_clear(); + out_free: if (lock) rollback_lock_file(lock); -- 1.9.0.GIT ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra @ 2014-07-02 9:14 ` Matthieu Moy 2014-07-02 11:58 ` Tanay Abhra ` (2 more replies) 2014-07-02 17:00 ` Junio C Hamano 1 sibling, 3 replies; 17+ messages in thread From: Matthieu Moy @ 2014-07-02 9:14 UTC (permalink / raw) To: Tanay Abhra Cc: git, Ramkumar Ramachandra, Karsten Blees, Junio C Hamano, Eric Sunshine Tanay Abhra <tanayabh@gmail.com> writes: > Add a `config_set` construct that points to an ordered set of config files > specified by the user, each of which represents what was read and cached > in core as hashmaps. Add two external functions `git_configset_get_value` > and `git_configset_get_value_multi` for querying from the config sets, > which follow `last one wins` semantic(i.e. if there are multiple matches Space before ( > for the queried key in the files of the configset the value returned will > the last value in the last matching file of the configset) Add type will _be_ ? Does this remark also apply to git_configset_get_value_multi? My understanding was that "last one wins" applied only to git_configset_get_value. > Add a default `config_set`, `the_config_set` to cache all key-value pairs I don't like the name "the_config_set". It's not the only one. Perhaps repo_config_set? (not totally satisfactory as it does not contain only the repo, but the repo+user+system) What do others think? > read from usual config files(repo specific .git/config, user wide (You always forget space before '('...) > ~/.gitconfig and the global /etc/gitconfig). `the_config_set` uses a > single hashmap populated using git_config(). The reason for doing so is > twofold, one is to honour include directives, another is to guarantee O(1) > lookups for usual config values, as values are queried for hundred of > times during a run of a git program. What is the reason to deal with `the_config_set` and other config sets differently? You're giving arguments to store `the_config_set` as a single hashmap, but what is the reason to store others as multiple hashmaps? And actually, I don't completely buy your arguments: having 3 or 4 hashmaps (.git/config, ~/.gitconfig, ~/.config/git/config and /etc/gitconfig) would be a O(4) lookup, which is still O(1), and you could deal with include directives by having ".git/config and included files" in a hashmap, "~/.gitconfig and included files" in a second hashmap, ... My guess is that the real argument is "git_config already does what I want and I'm too lazy to change it". And I do consider lazyness as a quality for a computer-scientist ;-). I would personally find it much simpler to have a single hashmap. We'd lose the ability to invalidate the cache for only a single file, but I'm not sure it's worth the code complexity. > +Querying For Specific Variables > +------------------------------- > + > +For programs wanting to query for specific variables in a non-callback > +manner, the config API provides two functions `git_config_get_value` > +and `git_config_get_value_multi`.They both read values from an internal Space after . > +`git_config_iter` gives a way to iterate over the keys in cache. Existing > +`git_config` callback function signature is used for iterating. "Existing" refers to the old state. It's OK in a commit message, but won't make sense in the future, when your non-callback API and the old callback API will live side by side. > +The config API also provides type specific API functions which do conversion > +as well as retrieval for the queried variable, including: > + > +`git_config_get_int`:: > +Parse the retrieved value to an integer, including unit factors. Dies on > +error; otherwise, allocates and copies the integer into the dest parameter. > +Returns 0 on success, or 1 if no value was found. It seems strange to me to return 1 here, and -1 in git_config_get_value to mean the same thing. > +`git_config_bool`:: Isn't it git_config_get_bool? > +/* config-hash.c */ You may point to the documentation in the comment too. > +#define CONFIG_SET_INIT { NULL, 0, 0 } > + > +struct config_set { > + struct config_set_item *item; > + unsigned int nr, alloc; > +}; > + > +struct config_set_item { > + struct hashmap config_hash; > + char *filename; Perhaps const char *? > +static struct hashmap *add_config_hash(const char *filename, struct config_set *cs) > +{ > + int ret = 0; > + struct config_set_item *ct; > + struct config_set_cb cb; > + ALLOC_GROW(cs->item, cs->nr + 1, cs->alloc); > + ct = &cs->item[cs->nr++]; > + ct->filename = xstrdup(filename); > + config_hash_init(&ct->config_hash); > + cb.cs = cs; > + cb.ct = ct; > + /* > + * When filename is "default", hashmap is constructed from the usual set of > + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the > + * global /etc/gitconfig), used in `the_config_set` > + */ > + if (!strcmp(filename, "default")) How does one read a file actually called "default" with this API? More generally, why do you need a special-case here? I think it would be much better to leave this part unaware of the default, and have a separate function like create_repo_config_hash (or create_the_config_hash) that would call git_config(...). There actually isn't much shared code between the "default" case and the others in your function. > +static struct hashmap *get_config_hash(const char *filename, struct config_set *cs) > +{ > + int i; > + for(i = 0; i < cs->nr; i++) { > + if (!strcmp(cs->item[i].filename, filename)) > + return &cs->item[i].config_hash; > + } > + return add_config_hash(filename, cs); > +} > + > +static struct config_hash_entry *config_hash_find_entry(const char *key, const char* filename, > + struct config_set *cs) I don't get the logic here. Either the caller explicitly manages a config_set variable like config_set my_set = git_configset_init(); git_configset_add_file(my_set, "my-file"); git_configset_get_value(my_set, "some.variable.name"); Or there's an implicit singleton mapping files to hashmaps to allow the user to say git_configset_get_value("my-file", "some.variable.name"); without asking the user to explicitly manipulate config_set variables. It seems to me that your solution is a bad compromise between both solutions: you do require the user to manipulate config_set variables, but you also require a filename to look for the right entry in the list. Did you miss the end of Junio's message: http://thread.gmane.org/gmane.comp.version-control.git/252567 ? If the user explicitly asks for a single entry in the list, then why group them in a list? I guess the question is the same as above, and the more I read the patch, the more I think a config_set should contain a single hashmap. > +static int config_hash_add_value(const char *key, const char *value, > + struct config_set_item *ct, struct config_set *cs) > +{ > + struct hashmap *config_hash = &ct->config_hash; > + struct config_hash_entry *e; > + e = config_hash_find_entry(key, ct->filename, cs); > + > + if (!e) { > + e = xmalloc(sizeof(*e)); > + hashmap_entry_init(e, strhash(key)); > + e->key = xstrdup(key); > + memset(&e->value_list, 0, sizeof(e->value_list)); > + e->value_list.strdup_strings = 1; > + hashmap_add(config_hash, e); > + } > + /* > + * Since the values are being fed by git_config*() callback mechanism, they > + * are already normalised. We usually speak en_US rather than en_UK => normalized. > +int git_config_get_value(const char *key, const char **value) > +{ > + if (!the_config_set_initialised) { > + git_configset_add_file(&the_config_set, "default"); > + the_config_set_initialised = 1; > + } > + return git_configset_get_value(&the_config_set, key, value); > +} > + > +const struct string_list *git_config_get_value_multi(const char *key) > +{ > + if (!the_config_set_initialised) { > + git_configset_add_file(&the_config_set, "default"); > + the_config_set_initialised = 1; > + } > + return git_configset_get_value_multi(&the_config_set, key); > +} These if statements could be refactored, and the_config_set_initialised could be hidden to the caller. Just make a function get_the_config_set() that initializes if needed and returns a pointer to the_config_set. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 9:14 ` Matthieu Moy @ 2014-07-02 11:58 ` Tanay Abhra 2014-07-02 14:32 ` Matthieu Moy 2014-07-02 17:09 ` Junio C Hamano 2014-07-04 4:58 ` Tanay Abhra 2 siblings, 1 reply; 17+ messages in thread From: Tanay Abhra @ 2014-07-02 11:58 UTC (permalink / raw) To: Matthieu Moy Cc: git, Ramkumar Ramachandra, Karsten Blees, Junio C Hamano, Eric Sunshine On 7/2/2014 2:44 PM, Matthieu Moy wrote: > Tanay Abhra <tanayabh@gmail.com> writes: > >> Add a `config_set` construct that points to an ordered set of config files >> specified by the user, each of which represents what was read and cached >> in core as hashmaps. Add two external functions `git_configset_get_value` >> and `git_configset_get_value_multi` for querying from the config sets, >> which follow `last one wins` semantic(i.e. if there are multiple matches > > Space before ( > Noted. Sorry about that. >> for the queried key in the files of the configset the value returned will >> the last value in the last matching file of the configset) Add type > Does this remark also apply to git_configset_get_value_multi? My > understanding was that "last one wins" applied only to > git_configset_get_value. > Maybe a reworded sentence may work, `git_configset_get_value_multi` returns a list of values sorted in order of increasing priority (i.e. last match will be at the end of the list.) >> Add a default `config_set`, `the_config_set` to cache all key-value pairs > > I don't like the name "the_config_set". It's not the only one. Perhaps > repo_config_set? (not totally satisfactory as it does not contain only > the repo, but the repo+user+system) > > What do others think? > >> read from usual config files(repo specific .git/config, user wide >> ~/.gitconfig and the global /etc/gitconfig). `the_config_set` uses a >> single hashmap populated using git_config(). The reason for doing so is >> twofold, one is to honour include directives, another is to guarantee O(1) >> lookups for usual config values, as values are queried for hundred of >> times during a run of a git program. > > What is the reason to deal with `the_config_set` and other config sets > differently? You're giving arguments to store `the_config_set` as a > single hashmap, but what is the reason to store others as multiple > hashmaps? > > And actually, I don't completely buy your arguments: having 3 or 4 > hashmaps (.git/config, ~/.gitconfig, ~/.config/git/config and > /etc/gitconfig) would be a O(4) lookup, which is still O(1), and you > could deal with include directives by having ".git/config and included > files" in a hashmap, "~/.gitconfig and included files" in a second > hashmap, ... > > My guess is that the real argument is "git_config already does what I > want and I'm too lazy to change it". And I do consider lazyness as a > quality for a computer-scientist ;-). > > I would personally find it much simpler to have a single hashmap. We'd > lose the ability to invalidate the cache for only a single file, but I'm > not sure it's worth the code complexity. > Point noted. I can take the multiple hashmap route for `the_config_set`. Still, since it will be the most used config set in the code and for each query it would have to look through n hashmaps hampering performance. I can change it if you want but like you, I don't think it is worth the code complexity. >> +Querying For Specific Variables >> +------------------------------- >> + >> +For programs wanting to query for specific variables in a non-callback >> +manner, the config API provides two functions `git_config_get_value` >> +and `git_config_get_value_multi`. They both read values from an internal >> +`git_config_iter` gives a way to iterate over the keys in cache. Existing >> +`git_config` callback function signature is used for iterating. > > "Existing" refers to the old state. It's OK in a commit message, but > won't make sense in the future, when your non-callback API and the old > callback API will live side by side. > Noted. >> +The config API also provides type specific API functions which do conversion >> +as well as retrieval for the queried variable, including: >> + >> +`git_config_get_int`:: >> +Parse the retrieved value to an integer, including unit factors. Dies on >> +error; otherwise, allocates and copies the integer into the dest parameter. >> +Returns 0 on success, or 1 if no value was found. > > It seems strange to me to return 1 here, and -1 in git_config_get_value > to mean the same thing. > Noted. Some of the type specific function return -1 on wrong parsing, I will put return value 1 for no value found in all of the cases. >> +`git_config_bool`:: >> +/* config-hash.c */ >> +#define CONFIG_SET_INIT { NULL, 0, 0 } >> + >> +struct config_set { >> + struct config_set_item *item; >> + unsigned int nr, alloc; >> +}; >> + >> +struct config_set_item { >> + struct hashmap config_hash; >> + char *filename; >> +static struct hashmap *add_config_hash(const char *filename, struct config_set *cs) >> +{ >> + int ret = 0; >> + struct config_set_item *ct; >> + struct config_set_cb cb; >> + ALLOC_GROW(cs->item, cs->nr + 1, cs->alloc); >> + ct = &cs->item[cs->nr++]; >> + ct->filename = xstrdup(filename); >> + config_hash_init(&ct->config_hash); >> + cb.cs = cs; >> + cb.ct = ct; >> + /* >> + * When filename is "default", hashmap is constructed from the usual set of >> + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the >> + * global /etc/gitconfig), used in `the_config_set` >> + */ >> + if (!strcmp(filename, "default")) > > How does one read a file actually called "default" with this API? > > More generally, why do you need a special-case here? I think it would be > much better to leave this part unaware of the default, and have a > separate function like create_repo_config_hash (or > create_the_config_hash) that would call git_config(...). There actually > isn't much shared code between the "default" case and the others in your > function. > Noted, a separate function would be much better. >> +static struct hashmap *get_config_hash(const char *filename, struct config_set *cs) >> +{ >> + int i; >> + for(i = 0; i < cs->nr; i++) { >> + if (!strcmp(cs->item[i].filename, filename)) >> + return &cs->item[i].config_hash; >> + } >> + return add_config_hash(filename, cs); >> +} >> + >> +static struct config_hash_entry *config_hash_find_entry(const char *key, const char* filename, >> + struct config_set *cs) > > I don't get the logic here. > > Either the caller explicitly manages a config_set variable like > > config_set my_set = git_configset_init(); > git_configset_add_file(my_set, "my-file"); > git_configset_get_value(my_set, "some.variable.name"); > > Or there's an implicit singleton mapping files to hashmaps to allow the > user to say > > git_configset_get_value("my-file", "some.variable.name"); > > without asking the user to explicitly manipulate config_set variables. > > It seems to me that your solution is a bad compromise between both > solutions: you do require the user to manipulate config_set variables, > but you also require a filename to look for the right entry in the list. > > Did you miss the end of Junio's message: > > http://thread.gmane.org/gmane.comp.version-control.git/252567 > > ? > This is an internal function which is used to iterate over every `config_set` item which contains a hashmap and filename as an identifier. The exposed API which I documented above doesn't require the user to pass the filename when querying for a value. The API works like this, suppose there are two files, A.config B.config [foo] [foo] a = b a = d a = c a = e config_set *my_set = git_configset_init(); git_configset_add_file(my_set, "A.config"); git_configset_add_file(my_set, "B.config"); git_configset_get_value(my_set, "foo.a"); Here get_value calls config_hash_find_entry once for each configset_item "A.config" and "B.config" with key as "foo.a". which in turn returns config_hash_item which contains the key and the value list. We get two string_lists containing values for the key foo.a, {b,c} and {d,e} corresponding to A.config & B.config respectively. get_value then returns the value with the highest priority which is 'e' after going through both of the lists and picking the last one. These steps are done by internal functions not by the user facing API ones. > If the user explicitly asks for a single entry in the list, then why > group them in a list? I guess the question is the same as above, and the > more I read the patch, the more I think a config_set should contain a > single hashmap. > Do you think so after reading the whole patch? It would be much easier to implement config_set using a single hashmap . But Junio had suggested that it would be much better for config_set to be a collection of hashmaps and `the_config_set` to contain a single hashmap[1]. [1]http://thread.gmane.org/gmane.comp.version-control.git/252329/focus=252460 >> +static int config_hash_add_value(const char *key, const char *value, >> + struct config_set_item *ct, struct config_set *cs) >> +{ >> + struct hashmap *config_hash = &ct->config_hash; >> + struct config_hash_entry *e; >> + e = config_hash_find_entry(key, ct->filename, cs); >> + >> + if (!e) { >> + e = xmalloc(sizeof(*e)); >> + hashmap_entry_init(e, strhash(key)); >> + e->key = xstrdup(key); >> + memset(&e->value_list, 0, sizeof(e->value_list)); >> + e->value_list.strdup_strings = 1; >> + hashmap_add(config_hash, e); >> + } >> + /* >> + * Since the values are being fed by git_config*() callback mechanism, they >> + * are already normalised. >> +int git_config_get_value(const char *key, const char **value) >> +{ >> + if (!the_config_set_initialised) { >> + git_configset_add_file(&the_config_set, "default"); >> + the_config_set_initialised = 1; >> + } >> + return git_configset_get_value(&the_config_set, key, value); >> +} >> + >> +const struct string_list *git_config_get_value_multi(const char *key) >> +{ >> + if (!the_config_set_initialised) { >> + git_configset_add_file(&the_config_set, "default"); >> + the_config_set_initialised = 1; >> + } >> + return git_configset_get_value_multi(&the_config_set, key); >> +} > > These if statements could be refactored, and the_config_set_initialised > could be hidden to the caller. Just make a function get_the_config_set() > that initializes if needed and returns a pointer to the_config_set. > Noted. Thanks for the quick review. Tanay. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 11:58 ` Tanay Abhra @ 2014-07-02 14:32 ` Matthieu Moy 0 siblings, 0 replies; 17+ messages in thread From: Matthieu Moy @ 2014-07-02 14:32 UTC (permalink / raw) To: Tanay Abhra Cc: git, Ramkumar Ramachandra, Karsten Blees, Junio C Hamano, Eric Sunshine Tanay Abhra <tanayabh@gmail.com> writes: > On 7/2/2014 2:44 PM, Matthieu Moy wrote: >> Tanay Abhra <tanayabh@gmail.com> writes: > Maybe a reworded sentence may work, > `git_configset_get_value_multi` returns a list of values sorted in order of > increasing priority (i.e. last match will be at the end of the list.) OK. >>> read from usual config files(repo specific .git/config, user wide >>> ~/.gitconfig and the global /etc/gitconfig). `the_config_set` uses a >>> single hashmap populated using git_config(). The reason for doing so is >>> twofold, one is to honour include directives, another is to guarantee O(1) >>> lookups for usual config values, as values are queried for hundred of >>> times during a run of a git program. >> >> What is the reason to deal with `the_config_set` and other config sets >> differently? You're giving arguments to store `the_config_set` as a >> single hashmap, but what is the reason to store others as multiple >> hashmaps? >> >> And actually, I don't completely buy your arguments: having 3 or 4 >> hashmaps (.git/config, ~/.gitconfig, ~/.config/git/config and >> /etc/gitconfig) would be a O(4) lookup, which is still O(1), and you >> could deal with include directives by having ".git/config and included >> files" in a hashmap, "~/.gitconfig and included files" in a second >> hashmap, ... >> >> My guess is that the real argument is "git_config already does what I >> want and I'm too lazy to change it". And I do consider lazyness as a >> quality for a computer-scientist ;-). >> >> I would personally find it much simpler to have a single hashmap. We'd >> lose the ability to invalidate the cache for only a single file, but I'm >> not sure it's worth the code complexity. >> > > Point noted. I can take the multiple hashmap route for `the_config_set`. > Still, since it will be the most used config set in the code and for each > query it would have to look through n hashmaps hampering performance. I > can change it if you want but like you, I don't think it is worth the code > complexity. That's why my suggestion is to use a single hashmap everywhere. I don't have strong opinion either way, but whichever way you go, you should justify the choice better in the commit message. >>> +The config API also provides type specific API functions which do conversion >>> +as well as retrieval for the queried variable, including: >>> + >>> +`git_config_get_int`:: >>> +Parse the retrieved value to an integer, including unit factors. Dies on >>> +error; otherwise, allocates and copies the integer into the dest parameter. >>> +Returns 0 on success, or 1 if no value was found. >> >> It seems strange to me to return 1 here, and -1 in git_config_get_value >> to mean the same thing. >> > > Noted. Some of the type specific function return -1 on wrong parsing, I will > put return value 1 for no value found in all of the cases. I'm not sure I fully get the existing convention. My understanding is that when the extracted value is returned, -1 is used as a special value to mean "no value" (e.g. git_config_maybe_bool can return 1, 0 or -1), but when the extracted value is written to a by-address parameter, then the return value is 1 or 0. >>> +static struct hashmap *get_config_hash(const char *filename, struct config_set *cs) >>> +{ >>> + int i; >>> + for(i = 0; i < cs->nr; i++) { >>> + if (!strcmp(cs->item[i].filename, filename)) >>> + return &cs->item[i].config_hash; >>> + } >>> + return add_config_hash(filename, cs); >>> +} >>> + >>> +static struct config_hash_entry *config_hash_find_entry(const char *key, const char* filename, >>> + struct config_set *cs) >> >> I don't get the logic here. >> >> Either the caller explicitly manages a config_set variable like >> >> config_set my_set = git_configset_init(); >> git_configset_add_file(my_set, "my-file"); >> git_configset_get_value(my_set, "some.variable.name"); >> >> Or there's an implicit singleton mapping files to hashmaps to allow the >> user to say >> >> git_configset_get_value("my-file", "some.variable.name"); >> >> without asking the user to explicitly manipulate config_set variables. >> >> It seems to me that your solution is a bad compromise between both >> solutions: you do require the user to manipulate config_set variables, >> but you also require a filename to look for the right entry in the list. >> >> Did you miss the end of Junio's message: >> >> http://thread.gmane.org/gmane.comp.version-control.git/252567 >> >> ? >> > > This is an internal function which is used to iterate over every `config_set` > item which contains a hashmap and filename as an identifier. > The exposed API which I documented above doesn't require the user to pass the > filename when querying for a value. > > The API works like this, suppose there are two files, > A.config B.config > [foo] [foo] > a = b a = d > a = c a = e > > config_set *my_set = git_configset_init(); > git_configset_add_file(my_set, "A.config"); > git_configset_add_file(my_set, "B.config"); > git_configset_get_value(my_set, "foo.a"); > > Here get_value calls config_hash_find_entry once for each configset_item > "A.config" and "B.config" with key as "foo.a". which in turn returns > config_hash_item which contains the key and the value list. OK, I understand what you did. But it still seem wrong to me. If you inline functions, you are doing something like for (f in files) { // in git_configset_get_value for (g in files) { // in get_config_hash if (f == g) { take the hashmap for g. } } } This is O(n^2) to iterate over an ordered list. OK, the list is small, but still, why not iterate normally over the array? Algorithmically speaking, you do not need the file names here, just iterate over an array of hashmaps. Filenames do not harm in struct config_set, but they are just metadata, not usefull to do the search. > We get two string_lists containing values for the key foo.a, > {b,c} and {d,e} corresponding to A.config & B.config respectively. > > get_value then returns the value with the highest priority which is 'e' > after going through both of the lists and picking the last one. > These steps are done by internal functions not by the user facing > API ones. But then, I have to wonder again what is the benefit of having multiple hashmaps in the config_set, since it has no user-visible benefit. >> If the user explicitly asks for a single entry in the list, then why >> group them in a list? I guess the question is the same as above, and the >> more I read the patch, the more I think a config_set should contain a >> single hashmap. >> > > Do you think so after reading the whole patch? It would be much easier > to implement config_set using a single hashmap . But Junio had suggested > that it would be much better for config_set to be a collection of hashmaps > and `the_config_set` to contain a single hashmap[1]. > > [1]http://thread.gmane.org/gmane.comp.version-control.git/252329/focus=252460 I do not read the message like you. Junio gives pros&cons of "everything in a single hashmap" Vs "one hashmap per file", and explicitely concludes with | I don't know at this point, and thinking these things through | to arrive at a good design is part of the GSoC project after all, so | I'd rather not to think it through to the end myself ;-). I do not read "there should be multiple hashmaps, except in the case of the_configset which should be handled differently". To summarize: Advantages of using a single hashmap per configset: 1) Code is simpler 2) Lookup is slightly more efficient Advantages of using one hashmap per configset: A) Allows sharing hashmaps between configsets B) Allows selectively refreshing the config for a single file without touching the other files' hashmaps. C) Would allow more operations like "tell me which file contains the config key foo.bar", that could be useful in some situation. Right now, I do not think that your implementation uses any of A), B) or C), so you're doing something uselessly complex. In the future, my feeling is that A) is essentially irrelevant (we're talking about a few hundreds of entry for large config files, not something we want to fine-tune in memory), B) would raise tons of corner-cases and would be hard to get right compared to the benefit. C) could be interesting, but the most common case would be to use the_configset, in which there's only one hashmap so it's not applicable without a big code change. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 9:14 ` Matthieu Moy 2014-07-02 11:58 ` Tanay Abhra @ 2014-07-02 17:09 ` Junio C Hamano 2014-07-04 4:58 ` Tanay Abhra 2 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2014-07-02 17:09 UTC (permalink / raw) To: Matthieu Moy Cc: Tanay Abhra, git, Ramkumar Ramachandra, Karsten Blees, Eric Sunshine Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes: > I don't like the name "the_config_set". It's not the only one. Perhaps > repo_config_set? (not totally satisfactory as it does not contain only > the repo, but the repo+user+system) > > What do others think? I actually do like "the_configset", which goes nicely parallel to "the_index". Neither is the only one, but most of the time our code operates on the primary one and "the_frotz" refers to it (and convenience wrappers that does not specify which set defaults to it). > What is the reason to deal with `the_config_set` and other config sets > differently? You're giving arguments to store `the_config_set` as a > single hashmap, but what is the reason to store others as multiple > hashmaps? Confusion, probably. "the_configset" should be just a singleton instance used to store the values we use for the default config system, but its shape should be exactly the same as the other ones users can use to read .gitmodules and friends with. > And actually, I don't completely buy your arguments: having 3 or 4 > hashmaps (.git/config, ~/.gitconfig, ~/.config/git/config and > /etc/gitconfig) would be a O(4) lookup, which is still O(1), and you > could deal with include directives by having ".git/config and included > files" in a hashmap, "~/.gitconfig and included files" in a second > hashmap, ... OK. > I would personally find it much simpler to have a single hashmap. We'd > lose the ability to invalidate the cache for only a single file, but I'm > not sure it's worth the code complexity. OK, too. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 9:14 ` Matthieu Moy 2014-07-02 11:58 ` Tanay Abhra 2014-07-02 17:09 ` Junio C Hamano @ 2014-07-04 4:58 ` Tanay Abhra 2014-07-04 9:17 ` Matthieu Moy 2 siblings, 1 reply; 17+ messages in thread From: Tanay Abhra @ 2014-07-04 4:58 UTC (permalink / raw) To: Matthieu Moy; +Cc: git, Ramkumar Ramachandra Hi, I have cooked up a single hashmap implementation. What are your thoughts about it? I must say it is much cleaner than the previous attempt and it passes all the tests. I will send the revised patch tomorrow with the corrected documentation, till then please say so if you prefer this one or the multi hashmap one. Cheers, Tanay Abhra. -- >8 -- diff --git a/Makefile b/Makefile index 07ea105..d503f78 100644 --- a/Makefile +++ b/Makefile @@ -777,6 +777,7 @@ LIB_OBJS += commit.o LIB_OBJS += compat/obstack.o LIB_OBJS += compat/terminal.o LIB_OBJS += config.o +LIB_OBJS += config-hash.o LIB_OBJS += connect.o LIB_OBJS += connected.o LIB_OBJS += convert.o diff --git a/cache.h b/cache.h index df65231..2a675f6 100644 --- a/cache.h +++ b/cache.h @@ -7,6 +7,7 @@ #include "advice.h" #include "gettext.h" #include "convert.h" +#include "string-list.h" #include SHA1_HEADER #ifndef git_SHA_CTX @@ -1325,6 +1326,41 @@ extern int parse_config_key(const char *var, const char **subsection, int *subsection_len, const char **key); +/* config-hash.c */ + +struct config_set { + struct hashmap config_hash; + int hash_initialized; +}; + +extern struct config_set *git_configset_init(void); +extern int git_configset_add_file(struct config_set *cs, const char *filename); +extern int git_configset_get_value(struct config_set *cs, const char *key, const char **value); +extern const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key); +extern void git_configset_clear(struct config_set *cs); +extern void git_configset_iter(struct config_set *cs, config_fn_t fn, void *data); +extern int git_configset_get_string(struct config_set *cs, const char *key, const char **dest); +extern int git_configset_get_int(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned long *dest); +extern int git_configset_get_bool(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_bool_or_int(struct config_set *cs, const char *key, int *is_bool, int *dest); +extern int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest); +extern int git_configset_get_pathname(struct config_set *cs, const char *key, const char **dest); + +extern int git_config_get_value(const char *key, const char **value); +extern const struct string_list *git_config_get_value_multi(const char *key); +extern void git_config_clear(void); +extern void git_config_iter(config_fn_t fn, void *data); +extern int git_config_get_string(const char *key, const char **dest); +extern int git_config_get_int(const char *key, int *dest); +extern int git_config_get_ulong(const char *key, unsigned long *dest); +extern int git_config_get_bool(const char *key, int *dest); +extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest); +extern int git_config_get_maybe_bool(const char *key, int *dest); +extern int git_config_get_pathname(const char *key, const char **dest); + + extern int committer_ident_sufficiently_given(void); extern int author_ident_sufficiently_given(void); diff --git a/config-hash.c b/config-hash.c new file mode 100644 index 0000000..e12bf57 --- /dev/null +++ b/config-hash.c @@ -0,0 +1,325 @@ +#include "cache.h" +#include "hashmap.h" +#include "string-list.h" + + +/* + * Default config_set that contains key-value pairs from the usual set of config + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the + * global /etc/gitconfig) + */ +static struct config_set the_config_set; + +struct config_hash_entry { + struct hashmap_entry ent; + char *key; + struct string_list value_list; +}; + +static int config_hash_add_value(const char *, const char *, struct hashmap *); + +static int config_hash_entry_cmp(const struct config_hash_entry *e1, + const struct config_hash_entry *e2, const void *unused) +{ + return strcmp(e1->key, e2->key); +} + +static void config_hash_init(struct hashmap *config_hash) +{ + hashmap_init(config_hash, (hashmap_cmp_fn)config_hash_entry_cmp, 0); +} + +static int config_hash_callback(const char *key, const char *value, void *cb) +{ + struct config_set *cs = cb; + config_hash_add_value(key, value, &cs->config_hash); + return 0; +} + +static int add_configset_hash(const char *filename, struct config_set *cs) +{ + int ret = 0; + if (!cs->hash_initialized) + config_hash_init(&cs->config_hash); + cs->hash_initialized = 1; + ret = git_config_from_file(config_hash_callback, filename, cs); + if (!ret) + return 0; + else { + hashmap_free(&cs->config_hash, 1); + cs->hash_initialized = 0; + return -1; + } +} + +static struct config_hash_entry *config_hash_find_entry(const char *key, + struct hashmap *config_hash) +{ + struct config_hash_entry k; + struct config_hash_entry *found_entry; + char *normalized_key; + int ret; + ret = git_config_parse_key(key, &normalized_key, NULL); + + if (ret) + return NULL; + + hashmap_entry_init(&k, strhash(normalized_key)); + k.key = normalized_key; + found_entry = hashmap_get(config_hash, &k, NULL); + free(normalized_key); + return found_entry; +} + +static struct string_list *configset_get_list(const char *key, struct config_set *cs) +{ + struct config_hash_entry *e = config_hash_find_entry(key, &cs->config_hash); + return e ? &e->value_list : NULL; +} + +static int config_hash_add_value(const char *key, const char *value, struct hashmap *config_hash) +{ + struct config_hash_entry *e; + e = config_hash_find_entry(key, config_hash); + + if (!e) { + e = xmalloc(sizeof(*e)); + hashmap_entry_init(e, strhash(key)); + e->key = xstrdup(key); + memset(&e->value_list, 0, sizeof(e->value_list)); + e->value_list.strdup_strings = 1; + hashmap_add(config_hash, e); + } + /* + * Since the values are being fed by git_config*() callback mechanism, they + * are already normalized. So simply add them without any further munging. + */ + string_list_append_nodup(&e->value_list, value ? xstrdup(value) : NULL); + + return 0; +} + +struct config_set *git_configset_init(void) +{ + struct config_set *cs = xcalloc(1, sizeof(*cs)); + return cs; +} + +int git_configset_add_file(struct config_set *cs, const char *filename) +{ + return add_configset_hash(filename, cs); +} + +int git_configset_get_value(struct config_set *cs, const char *key, const char **value) +{ + struct string_list *values = NULL; + /* + * Follows "last one wins" semantic, i.e., if there are multiple matches for the + * queried key in the files of the configset the value returned will the last value + * in the last matching file of the configset + */ + values = configset_get_list(key, cs); + + if (!values) + return 1; + assert(values->nr > 0); + *value = values->items[values->nr - 1].string; + return 0; +} + +const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key) +{ + return configset_get_list(key, cs); +} + +/* + * Uses the existing git_config() callback function signature to iterate over + * keys in the configset. + */ +void git_configset_iter(struct config_set *cs, config_fn_t fn, void *data) +{ + int i; + char *value; + struct string_list *strptr; + struct config_hash_entry *entry; + struct hashmap_iter iter; + hashmap_iter_init(&cs->config_hash, &iter); + while ((entry = hashmap_iter_next(&iter))) { + strptr = &entry->value_list; + for (i = 0; i < strptr->nr; i++) { + value = strptr->items[i].string; + fn(entry->key, value, data); + } + } +} + +void git_configset_clear(struct config_set *cs) +{ + struct config_hash_entry *entry; + struct hashmap_iter iter; + if (!cs->hash_initialized) + return; + + hashmap_iter_init( &cs->config_hash, &iter); + while ((entry = hashmap_iter_next(&iter))) { + free(entry->key); + string_list_clear(&entry->value_list, 0); + } + hashmap_free(&cs->config_hash, 1); +} + +int git_configset_get_string(struct config_set *cs, const char *key, const char **dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) + return git_config_string(dest, key, value); + else + return 1; +} + +int git_configset_get_int(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_int(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned long *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_ulong(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_bool(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_bool(key, value); + return 0; + } else + return 1; +} + +int git_configset_get_bool_or_int(struct config_set *cs, const char *key, + int *is_bool, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_bool_or_int(key, value, is_bool); + return 0; + } else + return 1; +} + +int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) { + *dest = git_config_maybe_bool(key, value); + if (*dest == -1) + return -1; + return 0; + } else + return 1; +} + +int git_configset_get_pathname(struct config_set *cs, const char *key, const char **dest) +{ + const char *value; + if (!git_configset_get_value(cs, key, &value)) + return git_config_pathname(dest, key, value); + else + return 1; +} + +static int git_config_check_init(void) +{ + int ret = 0; + if (the_config_set.hash_initialized) + return 0; + config_hash_init(&the_config_set.config_hash); + the_config_set.hash_initialized = 1; + ret = git_config(config_hash_callback, &the_config_set); + if (ret >= 0) + return 0; + else { + hashmap_free(&the_config_set.config_hash, 1); + the_config_set.hash_initialized = 0; + return -1; + } +} + +int git_config_get_value(const char *key, const char **value) +{ + git_config_check_init(); + return git_configset_get_value(&the_config_set, key, value); +} + +const struct string_list *git_config_get_value_multi(const char *key) +{ + git_config_check_init(); + return git_configset_get_value_multi(&the_config_set, key); +} + +void git_config_clear(void) +{ + if (!the_config_set.hash_initialized) + return; + git_configset_clear(&the_config_set); + the_config_set.hash_initialized = 0; +} + +void git_config_iter(config_fn_t fn, void *data) +{ + git_config_check_init(); + git_configset_iter(&the_config_set, fn, data); +} + +int git_config_get_string(const char *key, const char **dest) +{ + git_config_check_init(); + return git_configset_get_string(&the_config_set, key, dest); +} + +int git_config_get_int(const char *key, int *dest) +{ + git_config_check_init(); + return git_configset_get_int(&the_config_set, key, dest); +} + +int git_config_get_ulong(const char *key, unsigned long *dest) +{ + git_config_check_init(); + return git_configset_get_ulong(&the_config_set, key, dest); +} + +int git_config_get_bool(const char *key, int *dest) +{ + git_config_check_init(); + return git_configset_get_bool(&the_config_set, key, dest); +} + +int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest) +{ + git_config_check_init(); + return git_configset_get_bool_or_int(&the_config_set, key, is_bool, dest); +} + +int git_config_get_maybe_bool(const char *key, int *dest) +{ + git_config_check_init(); + return git_configset_get_maybe_bool(&the_config_set, key, dest); +} + +int git_config_get_pathname(const char *key, const char **dest) +{ + git_config_check_init(); + return git_configset_get_pathname(&the_config_set, key, dest); +} diff --git a/config.c b/config.c index a1aef1c..6cffec7 100644 --- a/config.c +++ b/config.c @@ -1708,6 +1708,9 @@ int git_config_set_multivar_in_file(const char *config_filename, lock = NULL; ret = 0; + /* Invalidate the config cache */ + git_config_clear(); + out_free: if (lock) rollback_lock_file(lock); -- 1.9.0.GIT ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-04 4:58 ` Tanay Abhra @ 2014-07-04 9:17 ` Matthieu Moy 2014-07-04 9:25 ` Tanay Abhra 0 siblings, 1 reply; 17+ messages in thread From: Matthieu Moy @ 2014-07-04 9:17 UTC (permalink / raw) To: Tanay Abhra; +Cc: git, Ramkumar Ramachandra Tanay Abhra <tanayabh@gmail.com> writes: > Hi, > > I have cooked up a single hashmap implementation. What are your > thoughts about it? I had a quick look, and it looks good to me. I'll make a more detailed review when you send the next series. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-04 9:17 ` Matthieu Moy @ 2014-07-04 9:25 ` Tanay Abhra 2014-07-04 9:43 ` Matthieu Moy 2014-07-07 17:02 ` Junio C Hamano 0 siblings, 2 replies; 17+ messages in thread From: Tanay Abhra @ 2014-07-04 9:25 UTC (permalink / raw) To: Matthieu Moy; +Cc: git, Ramkumar Ramachandra On 7/4/2014 2:47 PM, Matthieu Moy wrote: > Tanay Abhra <tanayabh@gmail.com> writes: > >> Hi, >> >> I have cooked up a single hashmap implementation. What are your >> thoughts about it? > > I had a quick look, and it looks good to me. I'll make a more detailed > review when you send the next series. > One more doubt, does <filename,linenr> for every value has any use other than raising semantic error in typespecific API functions. For example, if we call git_config_get_int(foo.bar), we can show to the user "value not a int at <filename, linenr>". Other than that I cannot think of any other use of it. Currently `git_config_int` dies if value put for parsing is not an int. Junio and Karsten, both raised the point for saving <filename,linenr>, but I can't find any use cases for it other than what I mentioned above. Thanks. Tanay. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-04 9:25 ` Tanay Abhra @ 2014-07-04 9:43 ` Matthieu Moy 2014-07-07 17:02 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Matthieu Moy @ 2014-07-04 9:43 UTC (permalink / raw) To: Tanay Abhra; +Cc: git, Ramkumar Ramachandra Tanay Abhra <tanayabh@gmail.com> writes: > One more doubt, does <filename,linenr> for every value has any use other than > raising semantic error in typespecific API functions. I don't see any other. My suggestion would be: ignore this for now, it's not needed to get a new API that has at least as many features as the existing one. It's easy to add to the data-structure afterwards (although a bit less easy to add to the code that fills-in the data structure). -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-04 9:25 ` Tanay Abhra 2014-07-04 9:43 ` Matthieu Moy @ 2014-07-07 17:02 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2014-07-07 17:02 UTC (permalink / raw) To: Tanay Abhra; +Cc: Matthieu Moy, git, Ramkumar Ramachandra Tanay Abhra <tanayabh@gmail.com> writes: > On 7/4/2014 2:47 PM, Matthieu Moy wrote: >> Tanay Abhra <tanayabh@gmail.com> writes: >> >>> Hi, >>> >>> I have cooked up a single hashmap implementation. What are your >>> thoughts about it? >> >> I had a quick look, and it looks good to me. I'll make a more detailed >> review when you send the next series. >> > > One more doubt, does <filename,linenr> for every value has any use other than > raising semantic error in typespecific API functions. > > For example, if we call git_config_get_int(foo.bar), we can show to the user > "value not a int at <filename, linenr>". Other than that I cannot think of > any other use of it. Currently `git_config_int` dies if value put for > parsing is not an int. > > Junio and Karsten, both raised the point for saving <filename,linenr>, but I can't > find any use cases for it other than what I mentioned above. Yes, error reporting is what the pair needs to be kept for. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra 2014-07-02 9:14 ` Matthieu Moy @ 2014-07-02 17:00 ` Junio C Hamano 2014-07-02 17:09 ` Matthieu Moy 2014-07-03 17:05 ` Tanay Abhra 1 sibling, 2 replies; 17+ messages in thread From: Junio C Hamano @ 2014-07-02 17:00 UTC (permalink / raw) To: Tanay Abhra Cc: git, Ramkumar Ramachandra, Matthieu Moy, Karsten Blees, Eric Sunshine Tanay Abhra <tanayabh@gmail.com> writes: > diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt > index 230b3a0..2c02fee 100644 > --- a/Documentation/technical/api-config.txt > +++ b/Documentation/technical/api-config.txt > @@ -77,6 +77,75 @@ To read a specific file in git-config format, use > `git_config_from_file`. This takes the same callback and data parameters > as `git_config`. > > +Querying For Specific Variables > +------------------------------- > + > +For programs wanting to query for specific variables in a non-callback > +manner, the config API provides two functions `git_config_get_value` > +and `git_config_get_value_multi`.They both read values from an internal > +cache generated previously from reading the config files. > + > +`git_config_get_value` takes two parameters, > + > +- a key string in canonical flat form for which the corresponding value What is "canonical flat form"? > + with the highest priority (i.e. value in the repo config will be > + preferred over value in user wide config for the same variable) will > + be retrieved. > + > +- a pointer to a string which will point to the retrieved value. The caller > + should not free or modify the value returned as it is owned by the cache. > + > +`git_config_get_value` returns 0 on success, or -1 for no value found. > + > +`git_config_get_value_multi` allocates and returns a `string_list` > +containing all the values for the key passed as parameter, sorted in order > +of increasing priority (Note: caller has to call `string_list_clear` on > +the returned pointer and then free it). > + > +`git_config_get_value_multi` returns NULL for no value found. > + > +`git_config_clear` is provided for resetting and invalidating the cache. > + > +`git_config_iter` gives a way to iterate over the keys in cache. Existing > +`git_config` callback function signature is used for iterating. Instead of doing prose above and then enumeration below, consistently using the enumeration-style would make the descriptions of API functions easier to find and read. Also show what parameters each function takes. E.g. `git_config_get_value(const char *key, const char **value)`:: Find the highest-priority value for the configuration variable `key`, store the pointer to it in `value` and return 0. When the configuration variable `key` is not found, return -1 without touching `value`. A few random thoughts. - Are "a value for the variable is found" and "no value for the variable is found" the only possible outcome? Should somebody (not necessarily the calling code) be notified of an error---for example, if you opened a config file successfully but later found that the file could not be parsed due to a syntax error or an I/O error, is it possible the caller of this function to tell, or are these just some special cases of "variable not found"? - The same goes for the "multi" variant but it is a bit more grave, as a function that returns an "int" can later be updated to return different negative values to signal different kinds of errors, but a function that returns a pointer to string-list can only return one kind of NULL ;-) - For the purpose of "git_config_get_value()", what is the value returned for core.filemode when the configuration file says "[core] filemode\n", i.e. when git_config() callback would get a NULL for value to signal a boolean true? - Is there a reason why you need a separate and new "config_iter"? What does it do differently from the good-old git_config() and how? It does not belong to "Querying for Specific Variables" anyway. > +The config API also provides type specific API functions which do conversion > +as well as retrieval for the queried variable, including: > + > +`git_config_get_int`:: > +Parse the retrieved value to an integer, including unit factors. Dies on > +error; otherwise, allocates and copies the integer into the dest parameter. > +Returns 0 on success, or 1 if no value was found. "allocates and copies"???? Is a caller forced to do something like this? int *val; if (!git_config_get_int("int.one", &val)) { use(*val); free(val); } I'd expect it to be more like: int val; if (!git_config_get_int("int.one", &val)) use(val); Also, is there a reason why the "not found" signal is "1" (as opposed to "-1" for the lower-level get_value() API)? > +`git_config_get_ulong`:: > +Identical to `git_config_get_int` but for unsigned longs. Obviously this is not identical but merely "Similar" to. > +`git_config_bool`:: git_config_get_bool, perhaps? > +Custom Configsets > +----------------- > + > +A `config_set` points at an ordered set of config files, each of > +which represents what was read and cached in-core from a file. This over-specifies the implementation. Judging from the list of API functions below, an implementation is possible without having to keep track of what source files were fed in what order (i.e. it can just keep a single hash-table of read values and incrementally parse the new contents given via configset-add-file into it, without even recording the origins, and forget about the source files once it finishes parsing). As was pointed out during the previous review round, a stack of <hash, filename> tuples cannot represent the configuration when config-includes are involved. Also we would eventually want to be able to also read from non-file sources (e.g. from a blob, or a caller-supplied in-core strings). For the purpose of reporting errors at the point of value being used, I have a suspicion that you would need to associate the <file,line> pair with each individual value string you will keep after configset_add_file() parses the file. The implementation of the call-chain may look like: git_configset_add_file(cs, filename): file = open filename lineno = 0 while read line from file: git_configset_parse_line(cs, filename, lineno++, line) close file git_configset_parse_line(cs, filename, lineno, line): ... this needs to implement a state machine ... that keeps track of what has been read halfway ... e.g. we may have seen "[core] crlf =\\\n" ... earlier, which is not a complete input yet, ... and now we may be looking at "auto" that lets ... us finally parse one item. if state in 'cs' and new 'line' gives us a complete input: determine key and value record <value, filename, lineno> for 'key' to cs.hash update state in 'cs' For processing "git -c section.variable=value", we probably would call git_configset_parse_line() on the_configset with filename set to "command line" or something. And then when the caller tries to actually use the value, e.g. git_configset_get_int(cs, key): look up <value, filename, lineno> for 'key' from the cs.hash if value is successfully parsed as int: return the parsed result else: error("not an int: '%s' (%s:%s)", value, filename, lineno) perhaps? > +It can be used to construct an in-memory cache for config files that > +the caller specifies. For example, This is almost good to help a reader decide if she might want to use it in her code, but we probably want to stress the fact that use of a config_set is done for a namespace separate from the main configuration system, e.g. ".gitmodules". > +--------------------------------------- > +struct config_set gm_config = CONFIG_SET_INIT; > +int b; > +/* we add config files to the config_set */ > +git_configset_add_file(&gm_config, ".gitmodules"); > +git_configset_add_file(&gm_config, ".gitmodules_alt"); > + > +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { > +/* hack hack hack */ > +} > + > +/* when we are done with the configset */ > +git_configset_clear(&gm_config); > +---------------------------------------- > + > +Configset API provides functions for the above mentioned work flow, including: > + > +`git_configset_init`:: > + > +Returns an allocated and initialized struct `config_set` pointer. "allocated"??? Is a caller forced to do this, i.e. always have the config-set on heap? struct config_set *config = git_configset_init(); ... use it ... git_configset_clear(config); I'd expect it be more like this: struct config_set config; git_configset_init(&config); ... use it... git_configset_clear(&config); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 17:00 ` Junio C Hamano @ 2014-07-02 17:09 ` Matthieu Moy 2014-07-03 17:05 ` Tanay Abhra 1 sibling, 0 replies; 17+ messages in thread From: Matthieu Moy @ 2014-07-02 17:09 UTC (permalink / raw) To: Junio C Hamano Cc: Tanay Abhra, git, Ramkumar Ramachandra, Karsten Blees, Eric Sunshine Junio C Hamano <gitster@pobox.com> writes: > Tanay Abhra <tanayabh@gmail.com> writes: > >> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt >> index 230b3a0..2c02fee 100644 >> --- a/Documentation/technical/api-config.txt >> +++ b/Documentation/technical/api-config.txt >> @@ -77,6 +77,75 @@ To read a specific file in git-config format, use >> `git_config_from_file`. This takes the same callback and data parameters >> as `git_config`. >> >> +Querying For Specific Variables >> +------------------------------- >> + >> +For programs wanting to query for specific variables in a non-callback >> +manner, the config API provides two functions `git_config_get_value` >> +and `git_config_get_value_multi`.They both read values from an internal >> +cache generated previously from reading the config files. >> + >> +`git_config_get_value` takes two parameters, >> + >> +- a key string in canonical flat form for which the corresponding value > > What is "canonical flat form"? Actually, it's defined above in the same file (was here before the patch): - the name of the parsed variable. This is in canonical "flat" form: the section, subsection, and variable segments will be separated by dots, and the section and variable segments will be all lowercase. E.g., But it might make sense to reword the doc, e.g. introduce a glossary section at the beginning. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/2] add `config_set` API for caching config files 2014-07-02 17:00 ` Junio C Hamano 2014-07-02 17:09 ` Matthieu Moy @ 2014-07-03 17:05 ` Tanay Abhra 1 sibling, 0 replies; 17+ messages in thread From: Tanay Abhra @ 2014-07-03 17:05 UTC (permalink / raw) To: Junio C Hamano Cc: git, Ramkumar Ramachandra, Matthieu Moy, Karsten Blees, Eric Sunshine On 7/2/2014 10:30 PM, Junio C Hamano wrote: > Tanay Abhra <tanayabh@gmail.com> writes: > >> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt >> index 230b3a0..2c02fee 100644 >> --- a/Documentation/technical/api-config.txt >> +++ b/Documentation/technical/api-config.txt >> @@ -77,6 +77,75 @@ To read a specific file in git-config format, use >> +`git_config_get_value` returns 0 on success, or -1 for no value found. >> + >> +`git_config_get_value_multi` allocates and returns a `string_list` >> +containing all the values for the key passed as parameter, sorted in order >> +of increasing priority (Note: caller has to call `string_list_clear` on >> +the returned pointer and then free it). >> + >> +`git_config_get_value_multi` returns NULL for no value found. >> + >> +`git_config_clear` is provided for resetting and invalidating the cache. >> + >> +`git_config_iter` gives a way to iterate over the keys in cache. Existing >> +`git_config` callback function signature is used for iterating. > > Instead of doing prose above and then enumeration below, > consistently using the enumeration-style would make the descriptions > of API functions easier to find and read. Also show what parameters > each function takes. E.g. > Noted. > > A few random thoughts. > > - Are "a value for the variable is found" and "no value for the > variable is found" the only possible outcome? Should somebody > (not necessarily the calling code) be notified of an error---for > example, if you opened a config file successfully but later found > that the file could not be parsed due to a syntax error or an I/O > error, is it possible the caller of this function to tell, or are > these just some special cases of "variable not found"? The syntactical errors when parsing the file are shown when it fails to construct the hashmap. Whenever a caller calls for a value for the first time, the file is read and parsed and if it fails during parsing it dies raising a error specifying the lineno and filename. Variable not found means that the variable is not present in the file, nothing more. Note: the hashmap code uses git_config() parsing stack so whatever error it raises in normal git_config() invocation, it would be raised here too. > - The same goes for the "multi" variant but it is a bit more grave, > as a function that returns an "int" can later be updated to > return different negative values to signal different kinds of > errors, but a function that returns a pointer to string-list can > only return one kind of NULL ;-) > Null pointer just means no variable found in the files. What other kind of errors may arise? > - For the purpose of "git_config_get_value()", what is the value > returned for core.filemode when the configuration file says > "[core] filemode\n", i.e. when git_config() callback would get a > NULL for value to signal a boolean true? NULL in value pointer with 0 return value denoting variable found. > - Is there a reason why you need a separate and new "config_iter"? > What does it do differently from the good-old git_config() and > how? It does not belong to "Querying for Specific Variables" > anyway. > You mentioned previously in the list for a analogue to git_config() which supports iterating over the keys. The link is [1] I think, gmane is not working for me currently. http://thread.gmane.org/gmane.comp.version-control.git/252567 >> +The config API also provides type specific API functions which do conversion >> +as well as retrieval for the queried variable, including: >> + >> +`git_config_get_int`:: >> +Parse the retrieved value to an integer, including unit factors. Dies on >> +error; otherwise, allocates and copies the integer into the dest parameter. >> +Returns 0 on success, or 1 if no value was found. > > "allocates and copies"???? Is a caller forced to do something like > this? > > int *val; > > if (!git_config_get_int("int.one", &val)) { > use(*val); > free(val); > } > > I'd expect it to be more like: > > int val; > if (!git_config_get_int("int.one", &val)) > use(val); > Yup, you are right, my fault. > Also, is there a reason why the "not found" signal is "1" (as > opposed to "-1" for the lower-level get_value() API)? > Many of the type specific functions return -1 for wrongful parsing like git_config_get_string and git_config_get_maybe_bool, that is why I changed the return value for such functions. >> +Custom Configsets >> +----------------- >> + >> +A `config_set` points at an ordered set of config files, each of >> +which represents what was read and cached in-core from a file. > > This over-specifies the implementation. Judging from the list of > API functions below, an implementation is possible without having to > keep track of what source files were fed in what order (i.e. it can > just keep a single hash-table of read values and incrementally parse > the new contents given via configset-add-file into it, without even > recording the origins, and forget about the source files once it > finishes parsing). > > As was pointed out during the previous review round, a stack of > <hash, filename> tuples cannot represent the configuration when > config-includes are involved. Also we would eventually want to be > able to also read from non-file sources (e.g. from a blob, or a > caller-supplied in-core strings). > > For the purpose of reporting errors at the point of value being > used, I have a suspicion that you would need to associate the > <file,line> pair with each individual value string you will keep > after configset_add_file() parses the file. The implementation of > the call-chain may look like: > > > git_configset_add_file(cs, filename): > file = open filename > lineno = 0 > while read line from file: > git_configset_parse_line(cs, filename, lineno++, line) > close file > > git_configset_parse_line(cs, filename, lineno, line): > ... this needs to implement a state machine > ... that keeps track of what has been read halfway > ... e.g. we may have seen "[core] crlf =\\\n" > ... earlier, which is not a complete input yet, > ... and now we may be looking at "auto" that lets > ... us finally parse one item. > > if state in 'cs' and new 'line' gives us a complete input: > determine key and value > record <value, filename, lineno> for 'key' to cs.hash > update state in 'cs' > > For processing "git -c section.variable=value", we probably would > call git_configset_parse_line() on the_configset with filename > set to "command line" or something. > > And then when the caller tries to actually use the value, e.g. > > git_configset_get_int(cs, key): > look up <value, filename, lineno> for 'key' from the cs.hash > if value is successfully parsed as int: > return the parsed result > else: > error("not an int: '%s' (%s:%s)", value, filename, lineno) > Okay, lets see what I can do with it. > >> +It can be used to construct an in-memory cache for config files that >> +the caller specifies. For example, > > This is almost good to help a reader decide if she might want to use > it in her code, but we probably want to stress the fact that use of > a config_set is done for a namespace separate from the main > configuration system, e.g. ".gitmodules". > >> +--------------------------------------- >> +struct config_set gm_config = CONFIG_SET_INIT; >> +int b; >> +/* we add config files to the config_set */ >> +git_configset_add_file(&gm_config, ".gitmodules"); >> +git_configset_add_file(&gm_config, ".gitmodules_alt"); >> + >> +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { >> +/* hack hack hack */ >> +} >> + >> +/* when we are done with the configset */ >> +git_configset_clear(&gm_config); >> +---------------------------------------- >> + >> +Configset API provides functions for the above mentioned work flow, including: >> + >> +`git_configset_init`:: >> + >> +Returns an allocated and initialized struct `config_set` pointer. > > "allocated"??? Is a caller forced to do this, i.e. always have the > config-set on heap? > > struct config_set *config = git_configset_init(); > ... use it ... > git_configset_clear(config); > > I'd expect it be more like this: > > struct config_set config; > > git_configset_init(&config); > ... use it... > git_configset_clear(&config); > Yup, I prefer the second one. Thanks for the review, Tanay Abhra. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] test-config: Add tests for the config_set API 2014-07-02 6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra 2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra @ 2014-07-02 6:01 ` Tanay Abhra 2014-07-02 9:29 ` Matthieu Moy 1 sibling, 1 reply; 17+ messages in thread From: Tanay Abhra @ 2014-07-02 6:01 UTC (permalink / raw) To: git Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Karsten Blees, Junio C Hamano, Eric Sunshine Expose the `config_set` C API as a set of simple commands in order to facilitate testing. Add tests for the `config_set` API as well as for `git_config_get_*()` family for the usual config files. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> --- .gitignore | 1 + Makefile | 1 + t/t1308-config-hash.sh | 163 +++++++++++++++++++++++++++++++++++++++++++++++++ test-config.c | 129 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 294 insertions(+) create mode 100755 t/t1308-config-hash.sh create mode 100644 test-config.c diff --git a/.gitignore b/.gitignore index 42294e5..eeb66e2 100644 --- a/.gitignore +++ b/.gitignore @@ -176,6 +176,7 @@ /gitweb/static/gitweb.js /gitweb/static/gitweb.min.* /test-chmtime +/test-config /test-ctype /test-date /test-delta diff --git a/Makefile b/Makefile index d503f78..e875070 100644 --- a/Makefile +++ b/Makefile @@ -548,6 +548,7 @@ X = PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS)) TEST_PROGRAMS_NEED_X += test-chmtime +TEST_PROGRAMS_NEED_X += test-config TEST_PROGRAMS_NEED_X += test-ctype TEST_PROGRAMS_NEED_X += test-date TEST_PROGRAMS_NEED_X += test-delta diff --git a/t/t1308-config-hash.sh b/t/t1308-config-hash.sh new file mode 100755 index 0000000..6c059c5 --- /dev/null +++ b/t/t1308-config-hash.sh @@ -0,0 +1,163 @@ +#!/bin/sh + +test_description='Test git config-hash API in different settings' + +. ./test-lib.sh + +test_expect_success 'clear default config' ' + rm -f .git/config +' + +cat > .git/config << EOF +[core] + penguin = very blue + Movie = BadPhysics + UPPERCASE = true + MixedCase = true + my = + foo + baz = sam +[Cores] + WhatEver = Second +[my "Foo bAr"] + hi = hello +[core] + baz = bat + baz = hask +[lamb] + chop = 65 + head = none +[goat] + legs = 4 + head = true + skin = false + nose = 1 + horns +EOF + +test_expect_success 'get value for a simple key' ' + echo "very blue" >expect && + test-config get_value core.penguin >actual && + test_cmp expect actual +' + +test_expect_success 'get value for a key with value as an empty string' ' + echo "" >expect && + test-config get_value core.my >actual && + test_cmp expect actual +' + +test_expect_success 'get value for a key with value as NULL' ' + echo "(NULL)" >expect && + test-config get_value core.foo >actual && + test_cmp expect actual +' +test_expect_success 'upper case key' ' + echo "true" >expect && + test-config get_value core.UPPERCASE >actual && + test_cmp expect actual +' +test_expect_success 'mixed case key' ' + echo "true" >expect && + test-config get_value core.MixedCase >actual && + test_cmp expect actual +' +test_expect_success 'key and value with mixed case' ' + echo "BadPhysics" >expect && + test-config get_value core.Movie >actual && + test_cmp expect actual +' + +test_expect_success 'key with case sensitive subsection' ' + echo "hello" >expect && + test-config get_value "my.Foo bAr.hi">actual && + test_cmp expect actual +' + +test_expect_success 'find value with misspelled key' ' + echo "Value not found for \"my.fOo Bar.hi\"" >expect && + test_must_fail test-config get_value "my.fOo Bar.hi">actual && + test_cmp expect actual +' + +test_expect_success 'find value with the highest priority' ' + echo hask >expect && + test-config get_value "core.baz">actual && + test_cmp expect actual +' + +test_expect_success 'find integer value for a key' ' + echo 65 >expect && + test-config get_int lamb.chop >actual && + test_cmp expect actual +' + +test_expect_success 'find integer if value is non parse-able' ' + echo 65 >expect && + test_must_fail test-config get_int lamb.head >actual && + test_must_fail test_cmp expect actual +' + +cat > expect << EOF +1 +0 +1 +1 +1 +EOF + +test_expect_success 'find bool value for the entered key' ' + test-config get_bool goat.head >>actual && + test-config get_bool goat.skin >>actual && + test-config get_bool goat.nose >>actual && + test-config get_bool goat.horns >>actual && + test-config get_bool goat.legs >>actual && + test_cmp expect actual +' + +cat > expect << EOF +sam +bat +hask +EOF + +test_expect_success 'find multiple values' ' + test-config get_value_multi "core.baz">actual && + test_cmp expect actual +' + +cat > config2 << EOF +[core] + baz = lama +[my] + new = silk +[core] + baz = ball +EOF + +test_expect_success 'find value from a configset' ' + echo silk > expect && + test-config configset_get_value 2 config2 .git/config my.new >actual && + test_cmp expect actual +' + +test_expect_success 'find value with highest priority from a configset' ' + echo hask > expect && + test-config configset_get_value 2 config2 .git/config core.baz >actual && + test_cmp expect actual +' + +cat > except << EOF +sam +bat +hask +lama +ball +EOF + +test_expect_success 'find value_list for a key from a configset' ' + test-config configset_get_value 2 config2 .git/config core.baz >actual && + test_cmp expect actual +' + +test_done diff --git a/test-config.c b/test-config.c new file mode 100644 index 0000000..32c912a --- /dev/null +++ b/test-config.c @@ -0,0 +1,129 @@ +#include "cache.h" +#include "hashmap.h" +#include "string-list.h" + +/* + * This program exposes the C API of the `config_set` mechanism + * as a set of simple commands in order to facilitate testing. + * + * Reads stdin and prints result of command to stdout: + * + * get_value -> prints the value with highest priority for the entered key + * + * get_value_multi -> prints all values for the entered key in increasing order + * of priority + * + * get_int -> print integer value for the entered key or die + * + * get_bool -> print bool value for the entered key or die + * + * configset_get_value -> first makes a config-set of files entered as arguments + * and then returns value with the highest priority for the + * entered key. + * + * configset_get_value_multi -> first makes a config-set of files entered as + * arguments and then returns value_list for the entered + * key sorted in ascending order of priority. + * + * Examples: + * + * To print the value with highest priority for key "foo.bAr Baz.rock": + * test-config get_value "foo.bAr Baz.rock" + * + */ + + +int main(int argc, char **argv) +{ + int i, no_of_files; + int ret = 0; + const char *v; + int val; + const struct string_list *strptr; + struct config_set cs = CONFIG_SET_INIT; + if (argc == 3 && !strcmp(argv[1], "get_value")) { + if (!git_config_get_value(argv[2], &v)) { + if (!v) + printf("(NULL)\n"); + else + printf("%s\n", v); + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + } else if (argc == 3 && !strcmp(argv[1], "get_value_multi")) { + strptr = git_config_get_value_multi(argv[2]); + if (strptr) { + for (i = 0; i < strptr->nr; i++) { + v = strptr->items[i].string; + if (!v) + printf("(NULL)\n"); + else + printf("%s\n", v); + } + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + } else if (argc == 3 && !strcmp(argv[1], "get_int")) { + if (!git_config_get_int(argv[2], &val)) { + printf("%d\n", val); + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + } else if (argc == 3 && !strcmp(argv[1], "get_bool")) { + if (!git_config_get_bool(argv[2], &val)) { + printf("%d\n", val); + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + } else if (!strcmp(argv[1], "configset_get_value")) { + no_of_files = git_config_int("unused", argv[2]); + for (i = 1; i <= no_of_files; i++) { + ret = git_configset_add_file(&cs, argv[i + 2]); + if (ret) + return -1; + } + if (!git_configset_get_value(&cs, argv[3 + no_of_files], &v)) { + if (!v) + printf("(NULL)\n"); + else + printf("%s\n", v); + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + + } else if (!strcmp(argv[1], "configset_get_value_multi")) { + no_of_files = git_config_int("unused", argv[2]); + for (i = 1; i <= no_of_files; i++) { + ret = git_configset_add_file(&cs, argv[i + 2]); + if (ret) + return -1; + } + strptr = git_configset_get_value_multi(&cs, argv[3 + no_of_files]); + if (strptr) { + for (i = 0; i < strptr->nr; i++) { + v = strptr->items[i].string; + if (!v) + printf("(NULL)\n"); + else + printf("%s\n", v); + } + return 0; + } else { + printf("Value not found for \"%s\"\n", argv[2]); + return -1; + } + } + + fprintf(stderr, "%s: Please check the syntax and the function name\n", argv[0]); + return 1; +} -- 1.9.0.GIT ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] test-config: Add tests for the config_set API 2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra @ 2014-07-02 9:29 ` Matthieu Moy 2014-07-02 12:04 ` Tanay Abhra 0 siblings, 1 reply; 17+ messages in thread From: Matthieu Moy @ 2014-07-02 9:29 UTC (permalink / raw) To: Tanay Abhra Cc: git, Ramkumar Ramachandra, Karsten Blees, Junio C Hamano, Eric Sunshine Tanay Abhra <tanayabh@gmail.com> writes: > +test_expect_success 'clear default config' ' > + rm -f .git/config > +' > + > +cat > .git/config << EOF t/README says: - Put all code inside test_expect_success and other assertions. Even code that isn't a test per se, but merely some setup code should be inside a test assertion. Even these cat > would better be in a test_expect_success 'initialize config'. (Not applied everywhere in Git's code essentially because some tests were written before the guideline was set and never updated). > +[core] > + penguin = very blue > + Movie = BadPhysics > + UPPERCASE = true > + MixedCase = true > + my = > + foo > + baz = sam > +[Cores] > + WhatEver = Second > +[my "Foo bAr"] > + hi = hello To really stress the "case sensitive middle part" case, you should also have other sections like [my "foo bar"] hi = lower-case [my "FOO BAR"] hi = upper-case and check that you get the right value for my.*.hi Similarly, I'd add a [CORE] and a [CoRe] section to check that their content is actually merged with [core]. > +test_expect_success 'get value for a key with value as an empty string' ' > + echo "" >expect && > + test-config get_value core.my >actual && > + test_cmp expect actual > +' > + > +test_expect_success 'get value for a key with value as NULL' ' > + echo "(NULL)" >expect && > + test-config get_value core.foo >actual && > + test_cmp expect actual > +' > +test_expect_success 'upper case key' ' Keep the style consistent, if you separate tests with a single blank line, do it everywhere. > +cat > expect << EOF See above, should be in test_expect_success. Also, >expect, not > expect. There are other instances. > +1 > +0 > +1 > +1 > +1 > +EOF > + > +test_expect_success 'find bool value for the entered key' ' > + test-config get_bool goat.head >>actual && The first one should be a single >, or you should clear actual before the test. > +int main(int argc, char **argv) > +{ > + int i, no_of_files; > + int ret = 0; > + const char *v; > + int val; > + const struct string_list *strptr; > + struct config_set cs = CONFIG_SET_INIT; > + if (argc == 3 && !strcmp(argv[1], "get_value")) { > + if (!git_config_get_value(argv[2], &v)) { > + if (!v) > + printf("(NULL)\n"); > + else > + printf("%s\n", v); > + return 0; > + } else { > + printf("Value not found for \"%s\"\n", argv[2]); > + return -1; > + } > + } else if (argc == 3 && !strcmp(argv[1], "get_value_multi")) { > + strptr = git_config_get_value_multi(argv[2]); > + if (strptr) { > + for (i = 0; i < strptr->nr; i++) { > + v = strptr->items[i].string; > + if (!v) > + printf("(NULL)\n"); > + else > + printf("%s\n", v); > + } > + return 0; > + } else { > + printf("Value not found for \"%s\"\n", argv[2]); > + return -1; > + } > + } else if (argc == 3 && !strcmp(argv[1], "get_int")) { > + if (!git_config_get_int(argv[2], &val)) { > + printf("%d\n", val); > + return 0; > + } else { > + printf("Value not found for \"%s\"\n", argv[2]); > + return -1; > + } > + } else if (argc == 3 && !strcmp(argv[1], "get_bool")) { > + if (!git_config_get_bool(argv[2], &val)) { > + printf("%d\n", val); > + return 0; > + } else { > + printf("Value not found for \"%s\"\n", argv[2]); > + return -1; > + } > + } else if (!strcmp(argv[1], "configset_get_value")) { > + no_of_files = git_config_int("unused", argv[2]); Why ask the user to give a number of files on the command-line. With a syntax like test-config configset_get_value <key> <files>... you could just use argc to iterate over argv. Here, you trust the user to provide the right value, and most likely segfault otherwise (and this is not really documented). I know this is only test code, but why not do it right anyway ;-). -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] test-config: Add tests for the config_set API 2014-07-02 9:29 ` Matthieu Moy @ 2014-07-02 12:04 ` Tanay Abhra 0 siblings, 0 replies; 17+ messages in thread From: Tanay Abhra @ 2014-07-02 12:04 UTC (permalink / raw) To: Matthieu Moy Cc: git, Ramkumar Ramachandra, Karsten Blees, Junio C Hamano, Eric Sunshine On 7/2/2014 2:59 PM, Matthieu Moy wrote: > Tanay Abhra <tanayabh@gmail.com> writes: > >> +test_expect_success 'clear default config' ' >> + rm -f .git/config >> +' >> + >> +cat > .git/config << EOF > > t/README says: > > - Put all code inside test_expect_success and other assertions. > > Even code that isn't a test per se, but merely some setup code > should be inside a test assertion. > > Even these cat > would better be in a test_expect_success 'initialize > config'. > > (Not applied everywhere in Git's code essentially because some tests > were written before the guideline was set and never updated). Sorry about that. I followed t1300-repo-config.sh which has these mistakes also. >> +[core] >> + penguin = very blue >> + Movie = BadPhysics >> + UPPERCASE = true >> + MixedCase = true >> + my = >> + foo >> + baz = sam >> +[Cores] >> + WhatEver = Second >> +[my "Foo bAr"] >> + hi = hello > > To really stress the "case sensitive middle part" case, you should also > have other sections like > > [my "foo bar"] > hi = lower-case > [my "FOO BAR"] > hi = upper-case > > and check that you get the right value for my.*.hi > > Similarly, I'd add a [CORE] and a [CoRe] section to check that their > content is actually merged with [core]. > Noted. >> +test_expect_success 'get value for a key with value as an empty string' ' >> + echo "" >expect && >> + test-config get_value core.my >actual && >> + test_cmp expect actual >> +' >> + >> +test_expect_success 'get value for a key with value as NULL' ' >> + echo "(NULL)" >expect && >> + test-config get_value core.foo >actual && >> + test_cmp expect actual >> +' >> +test_expect_success 'upper case key' ' > > Keep the style consistent, if you separate tests with a single blank > line, do it everywhere. > >> +cat > expect << EOF > > See above, should be in test_expect_success. > > Also, >expect, not > expect. > > There are other instances. > Noted. Again copied t1300-repo-config.sh style for cat. >> +1 >> +0 >> +1 >> +1 >> +1 >> +EOF >> + >> +test_expect_success 'find bool value for the entered key' ' >> + test-config get_bool goat.head >>actual && > > The first one should be a single >, or you should clear actual before > the test. > Noted. >> +int main(int argc, char **argv) >> +{ >> + int i, no_of_files; >> + int ret = 0; >> + const char *v; >> + int val; >> + const struct string_list *strptr; >> + struct config_set cs = CONFIG_SET_INIT; > > > >> + if (argc == 3 && !strcmp(argv[1], "get_value")) { >> + if (!git_config_get_value(argv[2], &v)) { >> + if (!v) >> + printf("(NULL)\n"); >> + else >> + printf("%s\n", v); >> + return 0; >> + } else { >> + printf("Value not found for \"%s\"\n", argv[2]); >> + return -1; >> + } >> + } else if (argc == 3 && !strcmp(argv[1], "get_value_multi")) { >> + strptr = git_config_get_value_multi(argv[2]); >> + if (strptr) { >> + for (i = 0; i < strptr->nr; i++) { >> + v = strptr->items[i].string; >> + if (!v) >> + printf("(NULL)\n"); >> + else >> + printf("%s\n", v); >> + } >> + return 0; >> + } else { >> + printf("Value not found for \"%s\"\n", argv[2]); >> + return -1; >> + } >> + } else if (argc == 3 && !strcmp(argv[1], "get_int")) { >> + if (!git_config_get_int(argv[2], &val)) { >> + printf("%d\n", val); >> + return 0; >> + } else { >> + printf("Value not found for \"%s\"\n", argv[2]); >> + return -1; >> + } >> + } else if (argc == 3 && !strcmp(argv[1], "get_bool")) { >> + if (!git_config_get_bool(argv[2], &val)) { >> + printf("%d\n", val); >> + return 0; >> + } else { >> + printf("Value not found for \"%s\"\n", argv[2]); >> + return -1; >> + } >> + } else if (!strcmp(argv[1], "configset_get_value")) { >> + no_of_files = git_config_int("unused", argv[2]); > > Why ask the user to give a number of files on the command-line. With a > syntax like > > test-config configset_get_value <key> <files>... > > you could just use argc to iterate over argv. Here, you trust the user > to provide the right value, and most likely segfault otherwise (and this > is not really documented). I know this is only test code, but why not do > it right anyway ;-). > Yup, your way is much better, thanks for the review. Tanay. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-07-07 17:02 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-07-02 6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra 2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra 2014-07-02 9:14 ` Matthieu Moy 2014-07-02 11:58 ` Tanay Abhra 2014-07-02 14:32 ` Matthieu Moy 2014-07-02 17:09 ` Junio C Hamano 2014-07-04 4:58 ` Tanay Abhra 2014-07-04 9:17 ` Matthieu Moy 2014-07-04 9:25 ` Tanay Abhra 2014-07-04 9:43 ` Matthieu Moy 2014-07-07 17:02 ` Junio C Hamano 2014-07-02 17:00 ` Junio C Hamano 2014-07-02 17:09 ` Matthieu Moy 2014-07-03 17:05 ` Tanay Abhra 2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra 2014-07-02 9:29 ` Matthieu Moy 2014-07-02 12:04 ` Tanay Abhra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).