git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC/PATCH v2 0/2] Git config cache & special querying api utilizing the cache
@ 2014-06-02 14:47 Tanay Abhra
  2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
  2014-06-02 14:47 ` [RFC/PATCH v2 2/2] config: Add new query functions to the api docs Tanay Abhra
  0 siblings, 2 replies; 7+ messages in thread
From: Tanay Abhra @ 2014-06-02 14:47 UTC (permalink / raw)
  To: git; +Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine

Hi,

[V2]: Improved according to the suggestions by Eric Sunshine and Torsten Bogershausen.
      Added cache invalidation when config file is changed.
      I am using git_config_set_multivar_in_file() as an update hook.

This is my first patch series for this year's GSoC. My project is
"Git Config API improvements". The link of my proposal is appended below [1].

The aim of this patch series is to generate a cache for querying values from
the config files in a non-callback manner as the current method reads and
parses the config files every time a value is queried for.

The cache is generated from hooking the update_cache function to the current
parsing and callback mechanism in config.c. It is implemented as an hashmap
using the hashmap-api with variables and its corresponding values list as
its members. The values in the list are sorted in order of increasing priority.
The cache is initialised in git_config_early() as it is the first time a `git_config`
function is called during program startup. setup_git_directory_gently() calls
git_config_early() which in turn reads every config file (local, user and
global config files).

get_value() in config.c feeds variable and values into the callback function.
Using this function as a hook, we update the cache. Also, we add two new
functions to the config-api git_config_get_string() and
git_config_get_string_multi() for querying in a non callback manner from
the cache.

I have run the tests and debug the code and it works, but I have to add a
few things,

1. Metadata about the variables and values. I have added only the file
   from each variable value pair comes in an upcoming series.

What else should I add or implement ;is my approach right? 

[1] https://drive.google.com/file/d/0B4suZ-aHqDcnSUZJRXVTTnZUN1E/edit?usp=sharing

Cheers,
Tanay Abhra.

Tanay Abhra (2):
  config: Add hashtable for config parsing & retrieval
  config: Add new query functions to the api docs

 Documentation/technical/api-config.txt |  18 +++++
 cache.h                                |   2 +
 config.c                               | 118 +++++++++++++++++++++++++++++++++
 3 files changed, 138 insertions(+)

-- 
1.9.0.GIT

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval
  2014-06-02 14:47 [RFC/PATCH v2 0/2] Git config cache & special querying api utilizing the cache Tanay Abhra
@ 2014-06-02 14:47 ` Tanay Abhra
  2014-06-02 19:33   ` Torsten Bögershausen
                     ` (2 more replies)
  2014-06-02 14:47 ` [RFC/PATCH v2 2/2] config: Add new query functions to the api docs Tanay Abhra
  1 sibling, 3 replies; 7+ messages in thread
From: Tanay Abhra @ 2014-06-02 14:47 UTC (permalink / raw)
  To: git; +Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine

Add a hash table to cache all key-value pairs read from config files
(repo specific .git/config, user wide ~/.gitconfig and the global
/etc/gitconfig). Add two external functions `git_config_get_string` and
`git_config_get_string_multi` for querying in a non-callback manner from the
hash table.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
---
 cache.h  |   2 ++
 config.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+)

diff --git a/cache.h b/cache.h
index 107ac61..2038150 100644
--- a/cache.h
+++ b/cache.h
@@ -1271,6 +1271,8 @@ extern int check_repository_format_version(const char *var, const char *value, v
 extern int git_env_bool(const char *, int);
 extern int git_config_system(void);
 extern int config_error_nonbool(const char *);
+extern const char *git_config_get_string(const char *);
+extern const struct string_list *git_config_get_string_multi(const char *);
 #if defined(__GNUC__) && ! defined(__clang__)
 #define config_error_nonbool(s) (config_error_nonbool(s), -1)
 #endif
diff --git a/config.c b/config.c
index a30cb5c..23c9403 100644
--- a/config.c
+++ b/config.c
@@ -9,6 +9,8 @@
 #include "exec_cmd.h"
 #include "strbuf.h"
 #include "quote.h"
+#include "hashmap.h"
+#include "string-list.h"
 
 struct config_source {
 	struct config_source *prev;
@@ -37,6 +39,112 @@ static struct config_source *cf;
 
 static int zlib_compression_seen;
 
+static struct hashmap config_cache;
+
+struct config_cache_entry {
+	struct hashmap_entry ent;
+	char *key;
+	struct string_list *value_list;
+};
+
+static int hashmap_is_init = 0;
+
+static int config_cache_entry_cmp_icase(const struct config_cache_entry *e1,
+				 const struct config_cache_entry *e2, const char* key)
+{
+	return strcasecmp(e1->key, key ? key : e2->key);
+}
+
+static void config_cache_init(void)
+{
+	hashmap_init(&config_cache, (hashmap_cmp_fn)config_cache_entry_cmp_icase, 0);
+}
+
+static void config_cache_free(void)
+{
+	struct config_cache_entry* entry;
+	struct hashmap_iter iter;
+	hashmap_iter_init(&config_cache, &iter);
+	while (entry = hashmap_iter_next(&iter)) {
+		free(entry->key);
+		string_list_clear(entry->value_list, 0);
+	}
+	hashmap_free(&config_cache, 1);
+}
+
+static struct config_cache_entry *config_cache_find_entry(const char *key)
+{
+	struct hashmap_entry k;
+	hashmap_entry_init(&k, strihash(key));
+	return hashmap_get(&config_cache, &k, key);
+}
+
+static struct string_list *config_cache_get_value(const char *key)
+{
+	struct config_cache_entry *e = config_cache_find_entry(key);
+	return e ? e->value_list : NULL;
+}
+
+
+static void config_cache_set_value(const char *key, const char *value)
+{
+	struct config_cache_entry *e;
+
+	e = config_cache_find_entry(key);
+	if (!e) {
+		e = malloc(sizeof(*e));
+		hashmap_entry_init(e, strihash(key));
+		e->key=xstrdup(key);
+		e->value_list = xcalloc(sizeof(struct string_list), 1);
+		e->value_list->strdup_strings = 1;
+		string_list_append(e->value_list, value);
+		hashmap_add(&config_cache, e);
+	} else {
+		if (!unsorted_string_list_has_string(e->value_list, value))
+			string_list_append(e->value_list, value);
+	}
+}
+
+static void config_cache_remove_value(const char *key, const char *value)
+{
+	struct config_cache_entry *e;
+	int i;
+
+	e = config_cache_find_entry(key);
+	if(!e)
+		return;
+	/* If value == NULL then remove all the entries for the key */
+	if(!value) {
+		string_list_clear(e->value_list, 0);
+		free(e->value_list);
+		hashmap_remove(&config_cache, e, NULL);
+		return;
+	}
+	/* All the occurances of "value" will be deleted */
+	for (i = 0; i < e->value_list->nr; i++)
+		if (!strcmp(value, e->value_list->items[i].string))
+			unsorted_string_list_delete_item(e->value_list, i, 0);
+	if(e->value_list->nr == 0) {
+		string_list_clear(e->value_list, 0);
+		free(e->value_list);
+		hashmap_remove(&config_cache, e, NULL);
+	}
+}
+
+extern const char *git_config_get_string(const char *name)
+{
+	struct string_list *values;
+	values = config_cache_get_value(name);
+	if (!values)
+		return NULL;
+	return values->items[values->nr - 1].string;
+}
+
+extern const struct string_list *git_config_get_string_multi(const char *key)
+{
+	return config_cache_get_value(key);
+}
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return fgetc(conf->u.file);
@@ -333,6 +441,7 @@ static int get_value(config_fn_t fn, void *data, struct strbuf *name)
 		if (!value)
 			return -1;
 	}
+	config_cache_set_value(name->buf, value);
 	return fn(name->buf, value, data);
 }
 
@@ -1135,6 +1244,11 @@ int git_config_early(config_fn_t fn, void *data, const char *repo_config)
 	char *xdg_config = NULL;
 	char *user_config = NULL;
 
+	if (!hashmap_is_init) {
+		config_cache_init();
+		hashmap_is_init = 1;
+	}
+
 	home_config_paths(&user_config, &xdg_config, "config");
 
 	if (git_config_system() && !access_or_die(git_etc_gitconfig(), R_OK, 0)) {
@@ -1570,6 +1684,7 @@ int git_config_set_multivar_in_file(const char *config_filename,
 		if (!store_write_section(fd, key) ||
 		    !store_write_pair(fd, key, value))
 			goto write_err_out;
+		else config_cache_set_value(key, value);
 	} else {
 		struct stat st;
 		char *contents;
@@ -1673,6 +1788,9 @@ int git_config_set_multivar_in_file(const char *config_filename,
 			}
 			if (!store_write_pair(fd, key, value))
 				goto write_err_out;
+			else config_cache_set_value(key, value);
+		} else {
+			config_cache_remove_value(key, NULL);
 		}
 
 		/* write the rest of the config */
-- 
1.9.0.GIT

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC/PATCH v2 2/2] config: Add new query functions to the api docs
  2014-06-02 14:47 [RFC/PATCH v2 0/2] Git config cache & special querying api utilizing the cache Tanay Abhra
  2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
@ 2014-06-02 14:47 ` Tanay Abhra
  2014-06-03 15:35   ` Matthieu Moy
  1 sibling, 1 reply; 7+ messages in thread
From: Tanay Abhra @ 2014-06-02 14:47 UTC (permalink / raw)
  To: git; +Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine

Add explanations for `git_config_get_string_multi` and `git_config_get_string`
which utilize the config hash table for querying in a non-callback manner for
a specific variable.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
---
 Documentation/technical/api-config.txt | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
index 230b3a0..8f1a37e 100644
--- a/Documentation/technical/api-config.txt
+++ b/Documentation/technical/api-config.txt
@@ -77,6 +77,24 @@ To read a specific file in git-config format, use
 `git_config_from_file`. This takes the same callback and data parameters
 as `git_config`.
 
+Querying For Specific Variables
+-------------------------------
+
+For programs wanting to query for specific variables in a non-callback
+manner, the config API provides two functions `git_config_get_string`
+and `git_config_get_string_multi`. They both take a single parameter,
+
+- a key string in canonical flat form for which the corresponding values
+  will be retrieved and returned.
+
+They both read values from an internal cache generated previously from
+reading the config files. `git_config_get_string` returns the value with
+the highest priority(i.e. value in the repo config will be preferred over
+value in user wide config for the same variable).
+
+`git_config_get_string_multi` returns a `string_list` containing all the
+values for that particular key, sorted in order of increasing priority.
+
 Value Parsing Helpers
 ---------------------
 
-- 
1.9.0.GIT

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval
  2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
@ 2014-06-02 19:33   ` Torsten Bögershausen
  2014-06-03  7:58   ` Jeff King
  2014-06-09  8:17   ` Eric Sunshine
  2 siblings, 0 replies; 7+ messages in thread
From: Torsten Bögershausen @ 2014-06-02 19:33 UTC (permalink / raw)
  To: Tanay Abhra, git; +Cc: Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine

On 2014-06-02 16.47, Tanay Abhra wrote:

[]
Please see 3 minor remarks inline.
> --- a/config.c
> +++ b/config.c
> @@ -9,6 +9,8 @@
>  #include "exec_cmd.h"
>  #include "strbuf.h"
>  #include "quote.h"
> +#include "hashmap.h"
> +#include "string-list.h"
>  
>  struct config_source {
>  	struct config_source *prev;
> @@ -37,6 +39,112 @@ static struct config_source *cf;
>  
>  static int zlib_compression_seen;
>  
> +static struct hashmap config_cache;
> +
> +struct config_cache_entry {
> +	struct hashmap_entry ent;
> +	char *key;
> +	struct string_list *value_list;
> +};
> +
> +static int hashmap_is_init = 0;
we don't need the " = 0", as all static data is initialized to 0 (or NULL) 
> +
> +static int config_cache_entry_cmp_icase(const struct config_cache_entry *e1,
> +				 const struct config_cache_entry *e2, const char* key)
the * should be aligned to the variable "key":
const char *key

[]

> +static void config_cache_set_value(const char *key, const char *value)
> +{
> +	struct config_cache_entry *e;
> +
> +	e = config_cache_find_entry(key);
> +	if (!e) {
> +		e = malloc(sizeof(*e));
I think we need xmalloc() here (from wrapper.c)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval
  2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
  2014-06-02 19:33   ` Torsten Bögershausen
@ 2014-06-03  7:58   ` Jeff King
  2014-06-09  8:17   ` Eric Sunshine
  2 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2014-06-03  7:58 UTC (permalink / raw)
  To: Tanay Abhra; +Cc: git, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine

On Mon, Jun 02, 2014 at 07:47:39AM -0700, Tanay Abhra wrote:

> +static int config_cache_entry_cmp_icase(const struct config_cache_entry *e1,
> +				 const struct config_cache_entry *e2, const char* key)
> +{
> +	return strcasecmp(e1->key, key ? key : e2->key);
> +}
> +
> +static void config_cache_init(void)
> +{
> +	hashmap_init(&config_cache, (hashmap_cmp_fn)config_cache_entry_cmp_icase, 0);
> +}

It looks like this is meant to treat config keys like "diff.renames" as
case-insensitive. That's OK for two-part names, but longer names with a
middle section, like "remote.Foo.url", should have their middle sections
be case-sensitive.

Also, please avoid casting function ptrs. Give your function the correct
hashmap_cmp_fn signature, and cast from "void *" internally if you need
to.

> +static void config_cache_free(void)
> +{
> +	struct config_cache_entry* entry;
> +	struct hashmap_iter iter;
> +	hashmap_iter_init(&config_cache, &iter);
> +	while (entry = hashmap_iter_next(&iter)) {

Please wrap assignment inside a loop condition with an extra (), like:

  while ((entry = hashmap_iter_next(&iter)))

This suppresses gcc's "did you mean ==" warning. If you are not
compiling with "-Wall -Werror" for development, you probably should be.

> +static struct config_cache_entry *config_cache_find_entry(const char *key)
> +{
> +	struct hashmap_entry k;
> +	hashmap_entry_init(&k, strihash(key));
> +	return hashmap_get(&config_cache, &k, key);
> +}

As above, strihash isn't quite right here. It needs a custom function
that takes into account the case-sensitivity of the middle section of
the key name.

An alternative is to only feed already-normalized names into the hash
code, and then tell the hash to be case-sensitive.

> +static void config_cache_set_value(const char *key, const char *value)
> +{
> +	struct config_cache_entry *e;
> +
> +	e = config_cache_find_entry(key);
> +	if (!e) {
> +		e = malloc(sizeof(*e));
> +		hashmap_entry_init(e, strihash(key));
> +		e->key=xstrdup(key);

Style: please keep whitespace around "=".

> +	} else {
> +		if (!unsorted_string_list_has_string(e->value_list, value))
> +			string_list_append(e->value_list, value);
> +	}

I don't think we want to suppress duplicates here. If a caller reads the
value with git_config_get_string(), then the duplicates do not matter
(we show only the one with precedence). If they use the "multi()" form,
then they _should_ see each version. It is not up to the config code to
decide that duplicate values for the same key are not interesting; only
the caller knows that.

> +static void config_cache_remove_value(const char *key, const char *value)
> +{
> +	struct config_cache_entry *e;
> +	int i;
> +
> +	e = config_cache_find_entry(key);
> +	if(!e)

Style: if (!e)

> +	/* If value == NULL then remove all the entries for the key */
> +	if(!value) {

Ditto (and elsewhere, but I'll stop noting each).

> +extern const char *git_config_get_string(const char *name)
> +{
> +	struct string_list *values;
> +	values = config_cache_get_value(name);
> +	if (!values)
> +		return NULL;
> +	return values->items[values->nr - 1].string;
> +}

The assumption here is that values->nr is never 0. I think that is true,
because we do not create the cache entry until we get a value (and
remove it if we drop the last value). However, it might be worth
documenting that with an assertion.

> @@ -333,6 +441,7 @@ static int get_value(config_fn_t fn, void *data, struct strbuf *name)
>               if (!value)
>                       return -1;
>       }
> +     config_cache_set_value(name->buf, value);
>       return fn(name->buf, value, data);
>  }

The function get_value gets called for each entry from a parsed config
file. However, it does not get called for command-line config from "git
-c". You'd need to update git_config_parse_parameter for that.

However, I wonder if this is the right place to hook into the cache at
all. The cache should be able to sit _atop_ the existing callback
system. So you could easily implement it as a separate layer, and just
lazily initialize it like:

  static int config_cache_callback(const char *key, const char *val, void *cache)
  {
	config_cache_set_value(cache, key, val);
	return 0;
  }

  static struct hashmap *get_config_cache(void)
  {
	static int initialized;
	static struct hashmap cache;
	if (!initialized) {
		hash_map_init(&cache, 
		git_config(config_cache_callback, &cache);
		initialized = 1;
	}
  }

and then teach config_cache_get_value and friends to access the cache
through "get_config_cache".

> @@ -1570,6 +1684,7 @@ int git_config_set_multivar_in_file(const char *config_filename,
>  		if (!store_write_section(fd, key) ||
>  		    !store_write_pair(fd, key, value))
>  			goto write_err_out;
> +		else config_cache_set_value(key, value);
>  	} else {
>  		struct stat st;
>  		char *contents;
> @@ -1673,6 +1788,9 @@ int git_config_set_multivar_in_file(const char *config_filename,
>  			}
>  			if (!store_write_pair(fd, key, value))
>  				goto write_err_out;
> +			else config_cache_set_value(key, value);
> +		} else {
> +			config_cache_remove_value(key, NULL);
>  		}
>  
>  		/* write the rest of the config */

These two seem suspect to me. How do we know that config_filename that
we are manipulating is even referenced by the config cache? When we
remove an entry, how do we know we are getting the entry from that file,
and not a similar entry in another file?

Stepping back, though, do we need to keep the cache up to date here with
micro-changes at all? If we have updated the config files on disk, then
the simplest thing would be to just invalidate the whole cache (and then
lazily re-read it if we actually need it again). This should be a rare
case (I suspect that "git init" and "git clone" might want this, but
that's about it).

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/PATCH v2 2/2] config: Add new query functions to the api docs
  2014-06-02 14:47 ` [RFC/PATCH v2 2/2] config: Add new query functions to the api docs Tanay Abhra
@ 2014-06-03 15:35   ` Matthieu Moy
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Moy @ 2014-06-03 15:35 UTC (permalink / raw)
  To: Tanay Abhra; +Cc: git, Ramkumar Ramachandra, Eric Sunshine

Tanay Abhra <tanayabh@gmail.com> writes:

> Add explanations for `git_config_get_string_multi` and `git_config_get_string`
> which utilize the config hash table for querying in a non-callback manner for
> a specific variable.

I'd squash this into the previous patch: the reader appreciates having
the documentation when reviewing the code itself (so that one can check
that the documented behavior matches the implementation).

> +the highest priority(i.e. value in the repo config will be preferred over

Missing space before (.

Other than that, this seems good.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval
  2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
  2014-06-02 19:33   ` Torsten Bögershausen
  2014-06-03  7:58   ` Jeff King
@ 2014-06-09  8:17   ` Eric Sunshine
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2014-06-09  8:17 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy,
	Torsten Bögershausen, Jeff King

In addition to the valuable review comments by Torsten and Peff, find
more below...

On Mon, Jun 2, 2014 at 10:47 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
> Add a hash table to cache all key-value pairs read from config files
> (repo specific .git/config, user wide ~/.gitconfig and the global
> /etc/gitconfig). Add two external functions `git_config_get_string` and
> `git_config_get_string_multi` for querying in a non-callback manner from the
> hash table.
>
> Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
> ---
> diff --git a/config.c b/config.c
> index a30cb5c..23c9403 100644
> --- a/config.c
> +++ b/config.c
> @@ -9,6 +9,8 @@
>  #include "exec_cmd.h"
>  #include "strbuf.h"
>  #include "quote.h"
> +#include "hashmap.h"
> +#include "string-list.h"
>
>  struct config_source {
>         struct config_source *prev;
> @@ -37,6 +39,112 @@ static struct config_source *cf;
>
>  static int zlib_compression_seen;
>
> +static struct hashmap config_cache;
> +
> +struct config_cache_entry {
> +       struct hashmap_entry ent;
> +       char *key;
> +       struct string_list *value_list;

Same question as in my v1 review [1]: Why is this 'struct string_list
*' rather than just 'struct string_list'?

[1]: http://git.661346.n2.nabble.com/RFC-PATCH-0-2-Git-config-cache-amp-special-querying-api-utilizing-the-cache-td7611691.html#a7611860

> +};
> +
> +static int hashmap_is_init = 0;
> +
> +static int config_cache_entry_cmp_icase(const struct config_cache_entry *e1,
> +                                const struct config_cache_entry *e2, const char* key)
> +{
> +       return strcasecmp(e1->key, key ? key : e2->key);

Are you planning on taking advantage of this extra complexity (the
'key ? key : e2->key' stuff)? At present, the lone caller _always_
provides the 'key', so 'e2->key' is never used. If not needed,
dropping it will simplify things for future readers of the code. (But,
see below for more commentary...)

> +}
> +
> +static void config_cache_init(void)
> +{
> +       hashmap_init(&config_cache, (hashmap_cmp_fn)config_cache_entry_cmp_icase, 0);
> +}
> +
> +static void config_cache_free(void)
> +{
> +       struct config_cache_entry* entry;

Style: struct config_cache_entry *entry;

> +       struct hashmap_iter iter;
> +       hashmap_iter_init(&config_cache, &iter);
> +       while (entry = hashmap_iter_next(&iter)) {
> +               free(entry->key);
> +               string_list_clear(entry->value_list, 0);
> +       }
> +       hashmap_free(&config_cache, 1);
> +}

config_cache_free() is never called.

> +static struct config_cache_entry *config_cache_find_entry(const char *key)
> +{
> +       struct hashmap_entry k;
> +       hashmap_entry_init(&k, strihash(key));
> +       return hashmap_get(&config_cache, &k, key);

This feels very dangerous. You've declared
config_cache_entry_cmp_icase() as taking a 'struct config_cache_entry
*' as its second argument, however, you are passing a 'struct
hashmap_entry *'. While it may work correctly in this special case as
you've written it, where config_cache_entry_cmp_icase() never accesses
any members of its second argument beyond 'ent', it is not
transparent. A future programmer may trust the type of the incoming
second argument and change config_cache_entry_cmp_icase() to access
elements beyond 'ent' without realizing that those elements aren't
actually present.

It seems safer to code this as in the example section of
Documentation/api-hashmap.txt:

    struct config_cache_entry k;
    hashmap_entry_init(&k, strihash(key));
    k.key = key; /* maybe cast-away const here */
    return hashmap_get(&config_cache, &k, NULL);

And modify config_cache_entry_cmp_icase() to always access its second
argument and treat the third as unused.

> +}
> +
> +static struct string_list *config_cache_get_value(const char *key)
> +{
> +       struct config_cache_entry *e = config_cache_find_entry(key);
> +       return e ? e->value_list : NULL;
> +}
> +
> +
> +static void config_cache_set_value(const char *key, const char *value)
> +{
> +       struct config_cache_entry *e;
> +
> +       e = config_cache_find_entry(key);
> +       if (!e) {
> +               e = malloc(sizeof(*e));
> +               hashmap_entry_init(e, strihash(key));
> +               e->key=xstrdup(key);
> +               e->value_list = xcalloc(sizeof(struct string_list), 1);
> +               e->value_list->strdup_strings = 1;
> +               string_list_append(e->value_list, value);
> +               hashmap_add(&config_cache, e);
> +       } else {
> +               if (!unsorted_string_list_has_string(e->value_list, value))
> +                       string_list_append(e->value_list, value);
> +       }

Also mentioned in [1]: As there is only a single 'if' statement within
this 'else' clause, you could turn this into an 'else if' and drop one
level of indentation. (Although, as Peff mentioned in his review, you
probably don't want to be folding out duplicate values anyhow, so this
'if' would go away.)

> +}
> +
> +static void config_cache_remove_value(const char *key, const char *value)
> +{
> +       struct config_cache_entry *e;
> +       int i;
> +
> +       e = config_cache_find_entry(key);
> +       if(!e)
> +               return;
> +       /* If value == NULL then remove all the entries for the key */
> +       if(!value) {
> +               string_list_clear(e->value_list, 0);
> +               free(e->value_list);
> +               hashmap_remove(&config_cache, e, NULL);

Isn't this leaking e->key?

> +               return;
> +       }
> +       /* All the occurances of "value" will be deleted */

s/occurances/occurrences/

> +       for (i = 0; i < e->value_list->nr; i++)
> +               if (!strcmp(value, e->value_list->items[i].string))
> +                       unsorted_string_list_delete_item(e->value_list, i, 0);

This loop is broken. It wants to delete all elements matching 'value',
but it will skip some. For instance, if 'value_list' contains items
["foo", "foo"], and you ask it to remove all "foo", it will return
["foo"]. This is because you increment 'i' even though the 'i'th
element has just been removed. To fix, either don't increment 'i' when
an item is removed, or reverse the order of visitation (from back to
front).

> +       if(e->value_list->nr == 0) {
> +               string_list_clear(e->value_list, 0);
> +               free(e->value_list);
> +               hashmap_remove(&config_cache, e, NULL);

Isn't this leaking e->key?

> +       }
> +}
> +
> +extern const char *git_config_get_string(const char *name)
> +{
> +       struct string_list *values;
> +       values = config_cache_get_value(name);
> +       if (!values)
> +               return NULL;
> +       return values->items[values->nr - 1].string;
> +}
> +
> +extern const struct string_list *git_config_get_string_multi(const char *key)
> +{
> +       return config_cache_get_value(key);
> +}
> +
>  static int config_file_fgetc(struct config_source *conf)
>  {
>         return fgetc(conf->u.file);
> @@ -333,6 +441,7 @@ static int get_value(config_fn_t fn, void *data, struct strbuf *name)
>                 if (!value)
>                         return -1;
>         }
> +       config_cache_set_value(name->buf, value);
>         return fn(name->buf, value, data);
>  }
>
> @@ -1135,6 +1244,11 @@ int git_config_early(config_fn_t fn, void *data, const char *repo_config)
>         char *xdg_config = NULL;
>         char *user_config = NULL;
>
> +       if (!hashmap_is_init) {
> +               config_cache_init();
> +               hashmap_is_init = 1;
> +       }
> +
>         home_config_paths(&user_config, &xdg_config, "config");
>
>         if (git_config_system() && !access_or_die(git_etc_gitconfig(), R_OK, 0)) {
> @@ -1570,6 +1684,7 @@ int git_config_set_multivar_in_file(const char *config_filename,
>                 if (!store_write_section(fd, key) ||
>                     !store_write_pair(fd, key, value))
>                         goto write_err_out;
> +               else config_cache_set_value(key, value);
>         } else {
>                 struct stat st;
>                 char *contents;
> @@ -1673,6 +1788,9 @@ int git_config_set_multivar_in_file(const char *config_filename,
>                         }
>                         if (!store_write_pair(fd, key, value))
>                                 goto write_err_out;
> +                       else config_cache_set_value(key, value);
> +               } else {
> +                       config_cache_remove_value(key, NULL);
>                 }
>
>                 /* write the rest of the config */
> --
> 1.9.0.GIT
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-06-09  8:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-02 14:47 [RFC/PATCH v2 0/2] Git config cache & special querying api utilizing the cache Tanay Abhra
2014-06-02 14:47 ` [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval Tanay Abhra
2014-06-02 19:33   ` Torsten Bögershausen
2014-06-03  7:58   ` Jeff King
2014-06-09  8:17   ` Eric Sunshine
2014-06-02 14:47 ` [RFC/PATCH v2 2/2] config: Add new query functions to the api docs Tanay Abhra
2014-06-03 15:35   ` Matthieu Moy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).