* diff machinery cleanup
@ 2006-08-10 8:24 Jeff King
2006-08-10 9:36 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2006-08-10 8:24 UTC (permalink / raw)
To: git
I'm trying to run two different 'diffs' in the same C program, and the
results of the second diff depend on whether or not the first diff has
run. A minimal case is below. When show_updated() is run, show_changed()
results in all files listed as "unmerged". If show_updated() is
commented out, the output for show_changed() is as expected.
-- >8 --
#include "cache.h"
#include "object.h"
#include "commit.h"
#include "diff.h"
#include "revision.h"
void show_updated() {
struct rev_info rev;
const char *argv[] = { NULL, "HEAD", NULL };
init_revisions(&rev, NULL);
setup_revisions(2, argv, &rev, NULL);
rev.diffopt.output_format = DIFF_FORMAT_NAME_STATUS;
run_diff_index(&rev, 1);
}
void show_changed() {
struct rev_info rev;
const char *argv[] = { NULL, NULL };
init_revisions(&rev, NULL);
setup_revisions(1, argv, &rev, NULL);
rev.diffopt.output_format = DIFF_FORMAT_NAME_STATUS;
run_diff_files(&rev, 0);
}
int main(int argc, char **argv) {
printf("updated:\n");
show_updated();
printf("changed:\n");
show_changed();
return 0;
}
-- >8 --
It seems clear that there's some global magic touched by the first diff
that impacts the second. I have to give up on finding it for tonight,
but I'm hoping somebody who knows more about the code will find it
obvious (or can tell me that I'm doing something else horribly wrong in
the above, or that these functions were never intended to be called
within the same program).
-Peff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: diff machinery cleanup
2006-08-10 8:24 diff machinery cleanup Jeff King
@ 2006-08-10 9:36 ` Junio C Hamano
2006-08-10 10:38 ` Jeff King
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2006-08-10 9:36 UTC (permalink / raw)
To: git
Jeff King <peff@peff.net> writes:
> It seems clear that there's some global magic touched by the first diff
> that impacts the second. I have to give up on finding it for tonight,
> but I'm hoping somebody who knows more about the code will find it
> obvious (or can tell me that I'm doing something else horribly wrong in
> the above, or that these functions were never intended to be called
> within the same program).
In general, run_diff_X are _not_ designed to run twice.
The run_diff_index() function munges the index while doing its
work (e.g. mark_merge_entries() hoists unmerged entries to stage
3 -- and worse yet creating duplicate entries for the same path
at stage 3; read_tree() reads the entries into stage 1 so that
it can be compared in-index with stage 0 entries). The other
function, run_diff_files() have the same assumption but does not
touch index if I recall correctly.
If you are working in "next" branch where Johannes's merge-recur
work introduced discard_cache(), you could fake this somehow
stashing away a copy of the original index, and once you are
done with run_diff_index(), clean the slate by calling
discard_cache() once you are done, and swap the original index
in before running run_diff_files().
To solve this cleanly without doing the index munging hack, you
would (actually, I would) need to have a new path walker that
walks index, tree and working tree in parallel, which I was
working on in the git-status/git-commit rewrite I started and
discarded a few days ago.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: diff machinery cleanup
2006-08-10 9:36 ` Junio C Hamano
@ 2006-08-10 10:38 ` Jeff King
2006-08-10 17:06 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2006-08-10 10:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Aug 10, 2006 at 02:36:49AM -0700, Junio C Hamano wrote:
> In general, run_diff_X are _not_ designed to run twice.
OK, makes sense. As you probably guessed, the reason is for the
run-status in C.
> If you are working in "next" branch where Johannes's merge-recur
> work introduced discard_cache(), you could fake this somehow
> stashing away a copy of the original index, and once you are
> done with run_diff_index(), clean the slate by calling
> discard_cache() once you are done, and swap the original index
> in before running run_diff_files().
OK, doing a discard_cache() between the call to run_diff_index and
run_diff_files seems to clear up the problem. But if I understand
correctly, are you saying that run_diff_index has munged the index on
disk, and I really need to be poking at a temporary copy? If so, why
isn't that a problem when running (e.g.) "git-diff-index; git-ls-files"?
> To solve this cleanly without doing the index munging hack, you
> would (actually, I would) need to have a new path walker that
> walks index, tree and working tree in parallel, which I was
> working on in the git-status/git-commit rewrite I started and
> discarded a few days ago.
That does sound the cleanest, and it would enable a more useful status
message, as you mentioned before. What caused you to stop working on it?
Infeasible, or simply more infeasible than you would like right now?
-Peff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: diff machinery cleanup
2006-08-10 10:38 ` Jeff King
@ 2006-08-10 17:06 ` Junio C Hamano
2006-08-10 20:10 ` Jeff King
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2006-08-10 17:06 UTC (permalink / raw)
To: Jeff King; +Cc: git
Jeff King <peff@peff.net> writes:
> OK, doing a discard_cache() between the call to run_diff_index and
> run_diff_files seems to clear up the problem. But if I understand
> correctly, are you saying that run_diff_index has munged the index on
> disk, and I really need to be poking at a temporary copy? If so, why
> isn't that a problem when running (e.g.) "git-diff-index; git-ls-files"?
No, run_diff_index munges the index in-core, and it does not
writes it out for obvious reasons.
Some of the "interrogation commands" do munge the index without
writing it out because that is the easiest/cleanest way to
implement what they do. ls-files does it to filter by paths,
for example.
> That does sound the cleanest, and it would enable a more useful status
> message, as you mentioned before. What caused you to stop working on it?
> Infeasible, or simply more infeasible than you would like right now?
Finding out somebody was already working on it?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: diff machinery cleanup
2006-08-10 17:06 ` Junio C Hamano
@ 2006-08-10 20:10 ` Jeff King
2006-08-10 21:02 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2006-08-10 20:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Aug 10, 2006 at 10:06:42AM -0700, Junio C Hamano wrote:
> No, run_diff_index munges the index in-core, and it does not
> writes it out for obvious reasons.
Ah, OK, that makes more sense. Is there any reason then that this
wouldn't work (it certainly seems to, but I don't want to be causing
invisible problems that will come back later)? You seemed to outline a
much more complex procedure in your original mail.
run_diff_index();
discard_cache();
...
run_diff_files();
I'd like to go with this simple solution at first for the C runstatus,
and then hopefully move to simultaneous diffing later.
-Peff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: diff machinery cleanup
2006-08-10 20:10 ` Jeff King
@ 2006-08-10 21:02 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2006-08-10 21:02 UTC (permalink / raw)
To: Jeff King; +Cc: git
Jeff King <peff@peff.net> writes:
> Ah, OK, that makes more sense. Is there any reason then that this
> wouldn't work (it certainly seems to, but I don't want to be causing
> invisible problems that will come back later)?
It should. I wanted to avoid re-reading the index from the
filesystem for some reason I cannot justify anymore. The
response was sent before my first caffeine for the day, perhaps
that was the reason ;-).
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2006-08-10 21:02 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-10 8:24 diff machinery cleanup Jeff King
2006-08-10 9:36 ` Junio C Hamano
2006-08-10 10:38 ` Jeff King
2006-08-10 17:06 ` Junio C Hamano
2006-08-10 20:10 ` Jeff King
2006-08-10 21:02 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).