git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jens Lehmann <Jens.Lehmann@web.de>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v2] Performance optimization for detection of modified submodules
Date: Mon, 18 Jan 2010 17:44:14 -0800	[thread overview]
Message-ID: <7v1vhmg975.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <4B54C3EA.9090200@web.de> (Jens Lehmann's message of "Mon\, 18 Jan 2010 21\:26\:18 +0100")

Jens Lehmann <Jens.Lehmann@web.de> writes:

> - Changed the type of the new dirty_submodule flags to unsigned.

Why?  An unsigned used in place of a bool raises an eyebrow as it is more
common to use "int" (the most natural type on the platform).  Going
against tradition makes readers waste time wondering if there were some
other reason why the code couldn't use "int" and had to use "unsigned"
(e.g. "Hmmpphh, it looked like a mere boolean but the use of 'unsigned'
suggests there might be something deeper going on.  Is this used as a
bitfield?  Does this count and cannot go negative?").

> @@ -173,12 +174,16 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  			if (silent_on_removed)
>  				continue;
>  			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
> -				       ce->sha1, ce->name);
> +				       ce->sha1, ce->name, 0);

Removed from work tree; cannot be dirty---good.

> +		if (S_ISGITLINK(ce->ce_mode)
> +		    && (!changed || (revs->diffopt.output_format & DIFF_FORMAT_PATCH))
> +		    && is_submodule_modified(ce->name)) {
> +			changed = 1;
> +			dirty_submodule = 1;
> +		}

If the index records a submodule commit, and the commit checked out in the
submodule is different from that commit, ce_compare_gitlink() called from
ce_match_stat() would have already said "changed".  If we want "-dirty",
we need to call is_submodule_modified(), but otherwise we don't.  Looks
good.

Does DIFF_FORMAT_PATCH cover all cases where we may want a reliable value
in "dirty_submodule" here?  Should use of "--submodule=log" affect this
decision?

> @@ -188,7 +193,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  		newmode = ce_mode_from_stat(ce, st.st_mode);
>  		diff_change(&revs->diffopt, oldmode, newmode,
>  			    ce->sha1, (changed ? null_sha1 : ce->sha1),
> -			    ce->name);
> +			    ce->name, 0, dirty_submodule);

LHS is always index and RHS is work tree whose dirtiness is in
dirty_submodule; good.

> @@ -204,16 +209,18 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  static void diff_index_show_file(struct rev_info *revs,
>  				 const char *prefix,
>  				 struct cache_entry *ce,
> -				 const unsigned char *sha1, unsigned int mode)
> +				 const unsigned char *sha1, unsigned int mode,
> +				 unsigned dirty_submodule)
>  {
>  	diff_addremove(&revs->diffopt, prefix[0], mode,
> -		       sha1, ce->name);
> +		       sha1, ce->name, dirty_submodule);

Mental note to myself.  prefix[0] is either '-' (removed from the work
tree) or '+' (added to the work tree).  Added submodule could be
immediately dirty.

> @@ -251,15 +263,17 @@ static void show_new_file(struct rev_info *revs,
>  {
> ...
> -	diff_index_show_file(revs, "+", new, sha1, mode);
> +	diff_index_show_file(revs, "+", new, sha1, mode, dirty_submodule);

And this caller handles that case correctly; good.

> @@ -270,11 +284,13 @@ static int show_modified(struct rev_info *revs,
>  {
>  	unsigned int mode, oldmode;
>  	const unsigned char *sha1;
> +	unsigned dirty_submodule = 0;
>
> -	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
> +	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
> +			  &dirty_submodule, revs->diffopt.output_format) < 0) {
>  		if (report_missing)
>  			diff_index_show_file(revs, "-", old,
> -					     old->sha1, old->ce_mode);
> +					     old->sha1, old->ce_mode, 0);

Again, removed from work tree cannot be dirty; good.

> @@ -309,7 +325,7 @@ static int show_modified(struct rev_info *revs,
>  		return 0;
>
>  	diff_change(&revs->diffopt, oldmode, mode,
> -		    old->sha1, sha1, old->name);
> +		    old->sha1, sha1, old->name, 0, dirty_submodule);
>  	return 0;
>  }

Comparing between a tree entry and either an index entry or a file in the
work tree.  When cached (i.e. not looking at work tree), get_stat_data()
doesn't touch dirty_submodule, so this won't make noises about submodule
dirtyness, which is good.

> @@ -356,7 +372,7 @@ static void do_oneway_diff(struct unpack_trees_options *o,
>  	 * Something removed from the tree?
>  	 */
>  	if (!idx) {
> -		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
> +		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode, 0);

Removed from the work tree and canont be dirty; good.

> diff --git a/diff.c b/diff.c
> index 012b3d3..f130a36 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3736,7 +3736,7 @@ int diff_result_code(struct diff_options *opt, int status)
>  void diff_addremove(struct diff_options *options,
>  		    int addremove, unsigned mode,
>  		    const unsigned char *sha1,
> -		    const char *concatpath)
> +		    const char *concatpath, unsigned dirty_submodule)

A removed submodule cannot be dirty, so we can get away with only one
dirty_submodule that always refers to the RHS (i.e. "two"); I see.

> diff --git a/diffcore.h b/diffcore.h
> index 5b63458..66687c3 100644
> --- a/diffcore.h
> +++ b/diffcore.h
> @@ -42,6 +42,7 @@ struct diff_filespec {
>  #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
>  	unsigned should_free : 1; /* data should be free()'ed */
>  	unsigned should_munmap : 1; /* data should be munmap()'ed */
> +	unsigned dirty_submodule : 1;  /* For submodules: its work tree is dirty */

By the way, we might want to consolidate these bitfields into a single 

	unsigned should_free:1,
        	 should_munmap:1,
                 dirty_submodule:1;

in a separate clean-up patch, after all the dust settles.

  reply	other threads:[~2010-01-19  1:44 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-18 20:26 [PATCH v2] Performance optimization for detection of modified submodules Jens Lehmann
2010-01-19  1:44 ` Junio C Hamano [this message]
2010-01-19  8:35   ` Jens Lehmann
2010-01-19 18:09     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7v1vhmg975.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=Jens.Lehmann@web.de \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).