From: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
To: Yao Zhao <zhaox383@umn.edu>
Cc: mhagger@alum.mit.edu, peff@peff.net, git@vger.kernel.org
Subject: Re: [GSOC 2014]idea:Git Configuration API Improvement
Date: Thu, 20 Mar 2014 10:10:23 +0100
Message-ID: <vpqd2hh5j7k.fsf@anie.imag.fr>
In-Reply-To: <1395300220-7540-1-git-send-email-zhaox383@umn.edu> (Yao Zhao's message of "Thu, 20 Mar 2014 02:23:40 -0500")

Hi,

Yao Zhao <zhaox383@umn.edu> writes:

> First is about when to start reading configuration file to cache. My
> idea is the time user starts call command that need configuration
> information (need to read configuration file).

I'd actually load the configuration lazily, when Git first requires a
configuration variable's value. Something like

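/* load_config(), cache_is_outdated() and do_something_with_loaded_config()
 * are placeholders for code that does not exist yet. */
static int config_has_been_loaded;

void git_config(void)
{
	if (!config_has_been_loaded) {
		/* First request in this process: parse the config files. */
		load_config();
		config_has_been_loaded = 1;
	} else if (cache_is_outdated()) {
		/* The files changed since we cached them: parse again. */
		load_config();
	}
	/* Otherwise there is nothing to do, the cache is good. */
	do_something_with_loaded_config();
}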
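
To make the cache_is_outdated() idea concrete, here is a minimal sketch,
assuming a POSIX system and assuming that comparing the file's stat data
(modification time and size) is an acceptable staleness check. The struct
and function names are made up for illustration and are not existing Git
code; the "parsing" only counts lines so that the example stays short.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical per-process cache state for one config file. */
struct config_cache {
	const char *path; /* file backing the cache */
	int loaded;       /* has the file been parsed in this process? */
	time_t mtime;     /* modification time seen at the last load */
	off_t size;       /* file size seen at the last load */
	int nr_lines;     /* stand-in for the real parsed data */
	int load_count;   /* number of (re)loads, for illustration only */
};

/* Assumption: a change in mtime or size means the cache is stale. */
static int cache_is_outdated(const struct config_cache *c)
{
	struct stat st;

	if (stat(c->path, &st) < 0)
		return 1; /* cannot tell, so be conservative and reload */
	return st.st_mtime != c->mtime || st.st_size != c->size;
}

/* Stand-in for the real parser: remember stat data, count lines. */
static void load_config(struct config_cache *c)
{
	struct stat st;
	FILE *f;
	int ch;

	c->nr_lines = 0;
	c->mtime = 0;
	c->size = 0;
	if (stat(c->path, &st) == 0) {
		c->mtime = st.st_mtime;
		c->size = st.st_size;
	}
	f = fopen(c->path, "r");
	if (f) {
		while ((ch = fgetc(f)) != EOF)
			if (ch == '\n')
				c->nr_lines++;
		fclose(f);
	}
	c->loaded = 1;
	c->load_count++;
}

/* Lazy accessor: loads on first use, reloads only when stale. */
int config_cache_lines(struct config_cache *c)
{
	assert(c && c->path);
	if (!c->loaded || cache_is_outdated(c))
		load_config(c);
	return c->nr_lines;
}
```

Calling config_cache_lines() twice in a row parses the file once; the
second parse only happens after the file has changed on disk. A real
implementation would have to close the race between stat() and the read.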

> Second is about data structure. I read Peff's email listed on idea
> page. He indicated two methods and I prefer syntax tree.

Why?

(In general, explaining why you chose something is more important than
explaining what you chose.)

> I think there should be three or more syntax tree in the cache. One
> for system, one for global and one for local. If user indicate a file
> to be configuration file, add one more tree. Or maybe we can build one
> tree and tag every node to indicate where it belongs to.

A tree (an AST, or abstract syntax tree) is interesting if you need to do
source-to-source transformations on the configuration files (i.e. edit
the config files themselves).

For read-only accesses, I would find it more natural to have a data
structure that reflects the configuration variables themselves, not the
way they appear in the config file. For example, a map (hash table)
associating each config variable with its value (which may be a scalar
value or a list, depending on the variable).
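
Such a map could look roughly like the sketch below: a chained hash table
in which every key holds the list of values seen for it, in file order.
All names are hypothetical, keys are assumed to be already normalized
(e.g. "core.bare"), and the scalar view returns the last value, following
Git's "last one wins" rule for single-valued variables. Freeing the map is
omitted to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CONFIG_MAP_BUCKETS 64

/* One config variable and every value given for it, in file order. */
struct config_entry {
	char *key;
	char **values;
	size_t nr, alloc;
	struct config_entry *next; /* next entry in the same bucket */
};

struct config_map {
	struct config_entry *buckets[CONFIG_MAP_BUCKETS];
};

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned int hash_key(const char *key)
{
	unsigned int h = 5381;

	while (*key)
		h = h * 33 + (unsigned char)*key++;
	return h % CONFIG_MAP_BUCKETS;
}

static struct config_entry *config_map_find(const struct config_map *map,
					    const char *key)
{
	struct config_entry *e;

	for (e = map->buckets[hash_key(key)]; e; e = e->next)
		if (!strcmp(e->key, key))
			return e;
	return NULL;
}

/* Append one more value for `key`, creating the entry if needed. */
void config_map_add(struct config_map *map, const char *key, const char *value)
{
	struct config_entry *e;

	assert(map && key && value);
	e = config_map_find(map, key);
	if (!e) {
		unsigned int h = hash_key(key);

		e = calloc(1, sizeof(*e));
		if (!e)
			abort();
		e->key = dup_string(key);
		e->next = map->buckets[h];
		map->buckets[h] = e;
	}
	if (e->nr == e->alloc) {
		size_t new_alloc = e->alloc ? 2 * e->alloc : 4;
		char **grown = realloc(e->values, new_alloc * sizeof(*grown));

		if (!grown)
			abort();
		e->values = grown;
		e->alloc = new_alloc;
	}
	e->values[e->nr++] = dup_string(value);
}

/* Scalar view: the last value wins; NULL if the variable is unset. */
const char *config_map_get(const struct config_map *map, const char *key)
{
	const struct config_entry *e = config_map_find(map, key);

	return e ? e->values[e->nr - 1] : NULL;
}

/* List view: number of values, and access to the i-th one. */
size_t config_map_count(const struct config_map *map, const char *key)
{
	const struct config_entry *e = config_map_find(map, key);

	return e ? e->nr : 0;
}

const char *config_map_get_nth(const struct config_map *map, const char *key,
			       size_t i)
{
	const struct config_entry *e = config_map_find(map, key);

	return (e && i < e->nr) ? e->values[i] : NULL;
}
```

Keeping every value lets the same structure serve both scalar variables
and multi-valued ones such as remote.origin.fetch.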

But the really important part here is the API exposed to the user, not
the internal data structure. A map would be "more efficient" (O(1) or
O(log(n)) access), but traversing the AST for each config request would
not really hurt: this is essentially what we're doing today, except that
we currently re-parse the file each time. OTOH, the API should hide the
AST for most uses. If the user wants the value of configuration variable
"foo", the code to do that should not be much more complex than
get_value_for_config_variable("foo") (well, I'm oversimplifying a bit
here).
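
As an illustration of an API that hides the storage, here is a minimal
sketch. The internal representation is deliberately naive (a flat list in
file order, scanned on every request) to show that callers would not
notice if it were later replaced by a map or a tree. Only the name
get_value_for_config_variable() comes from the text above; everything
else is hypothetical, and the integer helper ignores the unit suffixes
(k, m, g) that Git accepts.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Internal representation (hypothetical): entries in file order. */
struct parsed_entry {
	const char *key;
	const char *value;
};

static const struct parsed_entry *entries;
static size_t nr_entries;

/* Would be called by the parser; callers of the API never see this. */
void config_set_parsed_entries(const struct parsed_entry *e, size_t nr)
{
	entries = e;
	nr_entries = nr;
}

/* Public API: string value of `key`, or NULL if it is not set. */
const char *get_value_for_config_variable(const char *key)
{
	const char *found = NULL;
	size_t i;

	assert(key);
	for (i = 0; i < nr_entries; i++)
		if (!strcmp(entries[i].key, key))
			found = entries[i].value; /* later entries override */
	return found;
}

/* Public API: plain decimal integer; 0 on success, -1 otherwise. */
int get_int_for_config_variable(const char *key, int *out)
{
	const char *v = get_value_for_config_variable(key);
	char *end;
	long n;

	if (!v || !*v)
		return -1;
	errno = 0;
	n = strtol(v, &end, 10);
	if (errno || *end || n < INT_MIN || n > INT_MAX)
		return -1;
	*out = (int)n;
	return 0;
}
```

A caller only ever writes get_value_for_config_variable("user.name") or
get_int_for_config_variable("core.abbrev", &n); how the entries are
stored stays an implementation detail.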

> Third one is about when to write back to file, I am really confused
> about it. I think one way could be when user leave git repository
> using "cd" to go back. But I am not sure if git could detect user
> calls "cd" to leave repository.

There seems to be a misunderstanding here. The point of the project is
to have a per-process cache, but Git does not normally keep any state in
memory between two invocations. IOW, when you run

  git status
  cd ../
  git log

the call to "git status" creates a process, but that process dies before
you run "cd". The call to "git log" is a different process. It can
re-use things that "git status" left on disk, but not its in-memory data
structures.
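
The difference between in-memory state and on-disk state can be seen in a
small POSIX sketch. It is only an analogy: fork() stands in for the shell
starting "git status", and the parent plays the role of whatever process
runs afterwards. The function and variable names are invented for this
example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int cached_value; /* stand-in for an in-memory config cache */

/*
 * The child fills its in-memory cache and also writes the value to
 * `path`. The parent then reports what it can still see in its own
 * memory (*from_memory) and what it can read back from disk
 * (*from_disk). Returns 0 on success, -1 on error.
 */
int run_two_process_demo(const char *path, int *from_memory, int *from_disk)
{
	pid_t pid;
	int status;
	FILE *f;

	assert(path && from_memory && from_disk);
	cached_value = 0;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: this assignment only changes the child's copy. */
		cached_value = 42;
		f = fopen(path, "w");
		if (!f)
			_exit(1);
		fprintf(f, "%d\n", cached_value);
		fclose(f);
		_exit(0); /* the child's memory is gone from here on */
	}
	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	*from_memory = cached_value; /* the parent's copy was never touched */
	*from_disk = 0;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", from_disk) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

After the call, *from_memory is still 0 while *from_disk is 42: only what
the first process wrote to disk survives its exit.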

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
