* Re: Races on ref .lock files?
From: Bryan Turner @ 2016-12-16 23:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Andreas Krey, Git Users
In-Reply-To: <xmqqpokru6yg.fsf@gitster.mtv.corp.google.com>
Andreas,
Bitbucket Server developer here. Typically these errors on your client
are indicative of git gc --auto being triggered by git-receive-pack on
the server. Auto GC directly attached to a push in a repository with
pull requests often fails due to concurrent ref updates linked to
background pull request processing.
If you'd like to investigate more in depth, I'd encourage you to
create a ticket on support.atlassian.com so we can work with you.
Otherwise, if you just want to prevent seeing these messages, you can
either fork the relevant repository in Bitbucket Server (which
disables auto GC), or run "git config gc.auto 0" in
/opt/apps/.../repositories/68. Once auto GC is disabled, Bitbucket
Server will automatically take over managing GC for the repository
without any additional configuration required.
Note that we're working on revamping our GC handling such that auto GC
will always be disabled for all repositories and managed explicitly
within Bitbucket Server instead, so a future upgrade should
automatically prevent these messages from appearing on clients.
Best regards,
Bryan Turner
On Fri, Dec 16, 2016 at 9:20 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Andreas Krey <a.krey@gmx.de> writes:
>
>> We're occasionally seeing a lot of
>>
>> error: cannot lock ref 'stash-refs/pull-requests/18978/to': Unable to create '/opt/apps/..../repositories/68/stash-refs/pull-requests/18978/to.lock': File exists.
>>
>> from the server side with fetches as well as pushes. (Bitbucket server.)
>>
>> What I find strange is that neither the fetches nor the pushes even
>> touch these refs (but the bitbucket triggers underneath might).
>>
>> But my question is whether there are race conditions that can cause
>> such messages in regular operation - they continue with 'If no other git
>> process is currently running, this probably means a git process crashed
>> in this repository earlier.' which indicates some level of anticipation.
>
> I think (and I think you also think) these messages come from the
> Bitbucket side, not your "git push" (or "git fetch"). Not having
> seen Bitbucket's sources, I can only guess, but assuming that its
> pull-request is triggered from their Web frontend like GitHub's
> does, it is quite possible when you try to "push" into (or "fetch"
> from, for that matter) a repository, somebody is clicking a button
> to create that ref. We do not know what their "receive-pack" that
> responds to your "git push" (or "upload-pack" for "git fetch") does
> when there are locked refs. I'd naively think that unless you are
> pushing to that ref you showed an error message for, the receiving
> end shouldn't care if the ref is being written by somebody else, but
> who knows ;-) They may have their own reasons wanting to lock that
> ref that we think would be irrelevant for the operation, causing
> errors.
>
>
>
^ permalink raw reply
* Re: [RFC/PATCH] Makefile: suppress some cppcheck false-positives
From: Jeff King @ 2016-12-16 22:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Chris Packham, git, stefan.naewe, gitter.spiros
In-Reply-To: <xmqqshpnsoij.fsf@gitster.mtv.corp.google.com>
On Fri, Dec 16, 2016 at 10:43:48AM -0800, Junio C Hamano wrote:
> > diff --git a/nedmalloc.supp b/nedmalloc.supp
> [...]
> > diff --git a/regcomp.supp b/regcomp.supp
>
> Yuck for both files for multiple reasons.
>
> I do not think it is a good idea to allow these files to clutter the
> top-level of tree. How often do we expect that we may have to add
> more of these files? Every time we start borrowing code from third
> parties?
>
> What is the goal we want to achieve by running cppcheck?
>
> a. Our code must be clean but we do not bother "fixing" [*1*] the
> code we borrow from third parties and squelch output instead?
>
> b. Both our own code and third party code we borrow need to be free
> of errors and misdetections from cppcheck?
>
> c. Something else?
>
> If a. is what we aim for, perhaps a better option may be not to run
> cppcheck for the code we borrowed from third-party at all in the
> first place.
>
> If b. is our goal, we need to make sure that the false positive rate
> of cppcheck is acceptably low.
I think (b) is the goal; we'd hope that both our code and third party
code would be bug-free. I think it's a fact of life with a static
analysis tool that we're going to have to silence some false positives.
I think Chris started with the ones in compat because we are pretty sure
they won't change much, so suppressing them by line number is easy. But
we'd need to revisit this for our code, too. So just turning it off for
compat/ is only punting on the problem for a little while. :)
I do think it would be less gross if we could put these files into
compat/nedmalloc, etc. I don't know if you can specify
--suppressions-list multiple times, but certainly we could do some
pre-processing like:
find . -name '*.cppcheck' |
while read suppfile; do
dir=$(dirname $suppfile)
sed "s{^}{$dir/}" <$suppfile
done >master-suppfile
cppcheck --suppressions-file=master-suppfile
That would at least let us drop individual suppression files into their
respective directories.
I do wonder, though, if the "inline" suppressions would be less painful.
It looks like doing:
// cppcheck-suppress uninitialized
int var = var;
would work. I'm not sure if it understands non-C99 comments, though
maybe it is time for us to loosen that rule. And suppressing the false
positives that way does avoid fighting with gcc's analyzer, since we're
not changing the code.
The real question is how often we'd have to sprinkle those comments, and
how painful it would be. I see only 4 false positives that need
suppressed in our code, but 2 of them rub me the wrong way. They are due
to the tool failing to realize that die() is marked with NORETURN.
Marking some site as "no, this isn't a double-free, the other code path
would have died" feels like the wrong spot. The tool failure isn't where
we're marking, but rather 10 lines above.
-Peff
^ permalink raw reply
* Re: index-pack outside of repository?
From: Junio C Hamano @ 2016-12-16 21:55 UTC (permalink / raw)
To: Jeff King; +Cc: git
In-Reply-To: <20161216214322.xibllaw2iibhc5nv@sigill.intra.peff.net>
Jeff King <peff@peff.net> writes:
> Ah, I only checked that it did not do anything terrible like write into
> ".git". But it looks like it still looks at the git_dir value as part of
> the collision check.
>
> Here's a patch, on top of the rest of the series.
Thanks for a quick turnaround, as always.
> -- >8 --
> Subject: [PATCH] index-pack: skip collision check when not in repository
>
> You can run "git index-pack path/to/foo.pack" outside of a
> repository to generate an index file, or just to verify the
> contents. There's no point in doing a collision check, since
> we obviously do not have any objects to collide with.
>
> The current code will blindly look in .git/objects based on
> the result of setup_git_env(). That effectively gives us the
> right answer (since we won't find any objects), but it's a
> waste of time, and it conflicts with our desire to
> eventually get rid of the "fallback to .git" behavior of
> setup_git_env().
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I didn't do a test, as this doesn't really have any user-visible
> behavior change.
>
> I guess technically if you had a non-repo with
> ".git/objects/12/3456..." in it we would erroneously read it, but that's
> kind of bizarre. The interesting test is that when merged with
> jk/no-looking-at-dotgit-outside-repo-final, the test in t5300 doesn't
> die.
Yes.
>
> builtin/index-pack.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index d450a6ada2..f4b87c6c9f 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -787,13 +787,15 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
> const unsigned char *sha1)
> {
> void *new_data = NULL;
> - int collision_test_needed;
> + int collision_test_needed = 0;
>
> assert(data || obj_entry);
>
> - read_lock();
> - collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
> - read_unlock();
> + if (startup_info->have_repository) {
> + read_lock();
> + collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
> + read_unlock();
> + }
>
> if (collision_test_needed && !data) {
> read_lock();
^ permalink raw reply
* Re: "disabling bitmap writing, as some objects are not being packed"?
From: Jeff King @ 2016-12-16 21:49 UTC (permalink / raw)
To: David Turner; +Cc: Junio C Hamano, git, Nguyễn Thái Ngọc Duy
In-Reply-To: <1481924416.28176.19.camel@frank>
On Fri, Dec 16, 2016 at 04:40:16PM -0500, David Turner wrote:
> I would assume, based on the documentation, that auto gc would be doing
> an all-into-one repack:
> "If the number of packs exceeds the value of gc.autopacklimit, then
> existing packs (except those marked with a .keep file) are
> consolidated into a single pack by using the -A option of git
> repack."
>
> I don't have any settings that limit the size of packs, either. And a
> manual git repack -a -d creates only a single pack. Its loneliness
> doesn't last long, because pretty soon a new pack is created by an
> incoming push.
The interesting code is in need_to_gc():
/*
* If there are too many loose objects, but not too many
* packs, we run "repack -d -l". If there are too many packs,
* we run "repack -A -d -l". Otherwise we tell the caller
* there is no need.
*/
if (too_many_packs())
add_repack_all_option();
else if (!too_many_loose_objects())
return 0;
So if you have (say) 10 packs and 10,000 objects, we'll incrementally
pack those objects into a single new pack.
I never noticed this myself because we do not use auto-gc at GitHub at
all. We only ever do a big all-into-one repack.
> Unless this just means that some objects are being kept loose (perhaps
> because they are unreferenced)?
If they're unreferenced, they won't be part of the new pack. You might
accumulate loose objects that are ejected from previous packs, which
could trigger auto-gc to do an incremental pack (even though it wouldn't
be productive, because they're unreferenced!). You may also get them
from pushes (small pushes will be exploded into loose objects by
default).
-Peff
^ permalink raw reply
* Re: index-pack outside of repository?
From: Jeff King @ 2016-12-16 21:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqlgvfso16.fsf@gitster.mtv.corp.google.com>
On Fri, Dec 16, 2016 at 10:54:13AM -0800, Junio C Hamano wrote:
> I am tempted to suggest an intermediate step that comes before
> b1ef400eec ("setup_git_env: avoid blind fall-back to ".git"",
> 2016-10-20), which is the attached, and publish that as part of an
> official release. That way, we'll see what is broken without
> hurting people too much (unless they or their scripts care about
> extra message given to the standard error stream). I suspect that
> released Git has a slightly larger user base than what is cooked on
> 'next'.
>
> environment.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/environment.c b/environment.c
> index 0935ec696e..88f857331e 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -167,8 +167,11 @@ static void setup_git_env(void)
> const char *replace_ref_base;
>
> git_dir = getenv(GIT_DIR_ENVIRONMENT);
> - if (!git_dir)
> + if (!git_dir) {
> + if (!startup_info->have_repository)
> + warning("BUG: please report this at git@vger.kernel.org");
> git_dir = DEFAULT_GIT_DIR_ENVIRONMENT;
> + }
Yes, I think this is a nice way to ease into the change. I wish I had
thought of it when doing the original series, and we could have shipped
it in v2.11. :)
-Peff
^ permalink raw reply
* Re: index-pack outside of repository?
From: Jeff King @ 2016-12-16 21:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq37hnu50y.fsf@gitster.mtv.corp.google.com>
On Fri, Dec 16, 2016 at 10:01:49AM -0800, Junio C Hamano wrote:
> >> I think 2/3 is a good change to ensure we get a reasonable error for
> >> "index-pack --stdin", and 3/3 is a very good cleanup. Both of them
> >> of course are enabled by 1/3.
> >>
> >> We still fail "nongit git index-pack tmp.pack" with a BUG: though.
> >
> > Wait.
> >
> > This only happens with a stalled-and-to-be-discarded topic on 'pu'.
> > Please don't waste time digging it (yet).
>
> Don't wait ;-). My mistake. We can see t5300 broken with this
> change and b1ef400eec ("setup_git_env: avoid blind fall-back to
> ".git"", 2016-10-20) without anything else. We still need to
> address it.
Ah, I only checked that it did not do anything terrible like write into
".git". But it looks like it still looks at the git_dir value as part of
the collision check.
Here's a patch, on top of the rest of the series.
-- >8 --
Subject: [PATCH] index-pack: skip collision check when not in repository
You can run "git index-pack path/to/foo.pack" outside of a
repository to generate an index file, or just to verify the
contents. There's no point in doing a collision check, since
we obviously do not have any objects to collide with.
The current code will blindly look in .git/objects based on
the result of setup_git_env(). That effectively gives us the
right answer (since we won't find any objects), but it's a
waste of time, and it conflicts with our desire to
eventually get rid of the "fallback to .git" behavior of
setup_git_env().
Signed-off-by: Jeff King <peff@peff.net>
---
I didn't do a test, as this doesn't really have any user-visible
behavior change. I guess technically if you had a non-repo with
".git/objects/12/3456..." in it we would erroneously read it, but that's
kind of bizarre. The interesting test is that when merged with
jk/no-looking-at-dotgit-outside-repo-final, the test in t5300 doesn't
die.
builtin/index-pack.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index d450a6ada2..f4b87c6c9f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -787,13 +787,15 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
const unsigned char *sha1)
{
void *new_data = NULL;
- int collision_test_needed;
+ int collision_test_needed = 0;
assert(data || obj_entry);
- read_lock();
- collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
- read_unlock();
+ if (startup_info->have_repository) {
+ read_lock();
+ collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
+ read_unlock();
+ }
if (collision_test_needed && !data) {
read_lock();
--
2.11.0.348.g960a0b554
^ permalink raw reply related
* Re: [PATCH v7 0/7] recursively grep across submodules
From: Junio C Hamano @ 2016-12-16 21:42 UTC (permalink / raw)
To: Brandon Williams; +Cc: git, peff, sbeller, jonathantanmy, jacob.keller, j6t
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
Brandon Williams <bmwill@google.com> writes:
> Changes in v7:
> * Rebased on 'origin/bw/realpath-wo-chdir' in order to fix the race condition
> that occurs when verifying a submodule's gitdir.
> * Reverted is_submodule_populated() to use resolve_gitdir() now that there is
> no race condition.
> * Added !MINGW to a test in t7814 so that it won't run on windows. This is due
> to testing if colons in filenames are still handled correctly, yet windows
> doesn't allow colons in filenames.
Nice.
Will queue again to see if those on other platforms have troubles
with it. I read it through again and think the series is ready for
'next'.
Thanks.
^ permalink raw reply
* Re: "disabling bitmap writing, as some objects are not being packed"?
From: David Turner @ 2016-12-16 21:40 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git, Nguyễn Thái Ngọc Duy
In-Reply-To: <20161216213214.z3mzkp2xqnwrqkh2@sigill.intra.peff.net>
On Fri, 2016-12-16 at 16:32 -0500, Jeff King wrote:
> On Fri, Dec 16, 2016 at 01:28:00PM -0800, Junio C Hamano wrote:
>
> > > 2. I don't understand what would cause that message. That is, what bad
> > > thing am I doing that I should stop doing? I've briefly skimmed the
> > > code and commit message, but the answer isn't leaping out at me.
> >
> > Enabling bitmap generation for incremental packing that does not
> > cram everything into a single pack is triggering it, I would
> > presume. Perhaps we should ignore -b option in most of the cases
> > and enable it only for "repack -a -d -f" codepath? Or detect that
> > we are being run from "gc --auto" and automatically disable -b? I
> > have a feeling that an approach along that line is closer to the
> > real solution than tweaking report_last_gc_error() and trying to
> > deduce if we are making any progress.
>
> Ah, indeed. I was thinking in my other response that "git gc" would
> always kick off an all-into-one repack. But "gc --auto" will not in
> certain cases. And yes, in those cases you definitely would want
> --no-write-bitmap-index. I think it would be reasonable for "git repack"
> to disable bitmap-writing automatically when not doing an all-into-one
> repack.
I do not have alternates and am not using --local. Nor do I have .keep
packs.
I would assume, based on the documentation, that auto gc would be doing
an all-into-one repack:
"If the number of packs exceeds the value of gc.autopacklimit, then
existing packs (except those marked with a .keep file) are
consolidated into a single pack by using the -A option of git
repack."
I don't have any settings that limit the size of packs, either. And a
manual git repack -a -d creates only a single pack. Its loneliness
doesn't last long, because pretty soon a new pack is created by an
incoming push.
Unless this just means that some objects are being kept loose (perhaps
because they are unreferenced)?
^ permalink raw reply
* Re: "disabling bitmap writing, as some objects are not being packed"?
From: Jeff King @ 2016-12-16 21:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: David Turner, git, Nguyễn Thái Ngọc Duy
In-Reply-To: <xmqqpokrr2cf.fsf@gitster.mtv.corp.google.com>
On Fri, Dec 16, 2016 at 01:28:00PM -0800, Junio C Hamano wrote:
> > 2. I don't understand what would cause that message. That is, what bad
> > thing am I doing that I should stop doing? I've briefly skimmed the
> > code and commit message, but the answer isn't leaping out at me.
>
> Enabling bitmap generation for incremental packing that does not
> cram everything into a single pack is triggering it, I would
> presume. Perhaps we should ignore -b option in most of the cases
> and enable it only for "repack -a -d -f" codepath? Or detect that
> we are being run from "gc --auto" and automatically disable -b? I
> have a feeling that an approach along that line is closer to the
> real solution than tweaking report_last_gc_error() and trying to
> deduce if we are making any progress.
Ah, indeed. I was thinking in my other response that "git gc" would
always kick off an all-into-one repack. But "gc --auto" will not in
certain cases. And yes, in those cases you definitely would want
--no-write-bitmap-index. I think it would be reasonable for "git repack"
to disable bitmap-writing automatically when not doing an all-into-one
repack.
-Peff
^ permalink raw reply
* Re: "disabling bitmap writing, as some objects are not being packed"?
From: Junio C Hamano @ 2016-12-16 21:28 UTC (permalink / raw)
To: David Turner; +Cc: git, peff, Nguyễn Thái Ngọc Duy
In-Reply-To: <1481922331.28176.11.camel@frank>
David Turner <novalis@novalis.org> writes:
> I'm a bit confused by the message "disabling bitmap writing, as some
> objects are not being packed". I see it the my gc.log file on my git
> server.
> 1. Its presence in the gc.log file prevents future automatic garbage
> collection. This seems bad. I understand the desire to avoid making
> things worse if a past gc has run into issues. But this warning is
> non-fatal; the only consequence is that many operations get slower. But
> a lack of gc when there are too many packs causes that consequence too
> (often a much worse slowdown than would be caused by the missing
> bitmap).
>
> So I wonder if it would be better for auto gc to grep gc.log for fatal
> errors (as opposed to warnings) and only skip running if any are found.
> Alternately, we could simply put warnings into gc.log.warning and
> reserve gc.log for fatal errors. I'm not sure which would be simpler.
I am not sure if string matching is really a good idea, as I'd
assume that these messages are eligible for i18n.
329e6e8794 ("gc: save log from daemonized gc --auto and print it
next time", 2015-09-19) wanted to notice that auto-gc is not
making progress and used the presense of error messages as a cue.
In your case, I think the auto-gc _is_ making progress, reducing
number of loose objects in the repository or consolidating many
packfiles into one, and the message is only about the fact that
packing is punting and not producing a bitmap as you asked, which
is different from not making any progress. I do not think log vs
warn is a good criteria to tell them apart, either.
In any case, as the error message asks the user to do, the user
eventually wants to correct the root cause before removing the
gc.log; I am not sure report_last_gc_error() is the place to correct
this in the first place.
> 2. I don't understand what would cause that message. That is, what bad
> thing am I doing that I should stop doing? I've briefly skimmed the
> code and commit message, but the answer isn't leaping out at me.
Enabling bitmap generation for incremental packing that does not
cram everything into a single pack is triggering it, I would
presume. Perhaps we should ignore -b option in most of the cases
and enable it only for "repack -a -d -f" codepath? Or detect that
we are being run from "gc --auto" and automatically disable -b? I
have a feeling that an approach along that line is closer to the
real solution than tweaking report_last_gc_error() and trying to
deduce if we are making any progress.
^ permalink raw reply
* Re: "disabling bitmap writing, as some objects are not being packed"?
From: Jeff King @ 2016-12-16 21:27 UTC (permalink / raw)
To: David Turner; +Cc: git
In-Reply-To: <1481922331.28176.11.camel@frank>
On Fri, Dec 16, 2016 at 04:05:31PM -0500, David Turner wrote:
> 1. Its presence in the gc.log file prevents future automatic garbage
> collection. This seems bad. I understand the desire to avoid making
> things worse if a past gc has run into issues. But this warning is
> non-fatal; the only consequence is that many operations get slower. But
> a lack of gc when there are too many packs causes that consequence too
> (often a much worse slowdown than would be caused by the missing
> bitmap).
>
> So I wonder if it would be better for auto gc to grep gc.log for fatal
> errors (as opposed to warnings) and only skip running if any are found.
> Alternately, we could simply put warnings into gc.log.warning and
> reserve gc.log for fatal errors. I'm not sure which would be simpler.
Without thinking too hard on it, that seems like the appropriate
solution to me, too.
> 2. I don't understand what would cause that message. That is, what bad
> thing am I doing that I should stop doing? I've briefly skimmed the
> code and commit message, but the answer isn't leaping out at me.
Do you have alternates and are using --local? Do you have .keep packs
and have set repack.packKeptObjects to false?
There are other ways (e.g., an incremental repack), but I think those
are the likely ones to get via "git gc".
-Peff
^ permalink raw reply
* "disabling bitmap writing, as some objects are not being packed"?
From: David Turner @ 2016-12-16 21:05 UTC (permalink / raw)
To: git; +Cc: peff
I'm a bit confused by the message "disabling bitmap writing, as some
objects are not being packed". I see it the my gc.log file on my git
server.
1. Its presence in the gc.log file prevents future automatic garbage
collection. This seems bad. I understand the desire to avoid making
things worse if a past gc has run into issues. But this warning is
non-fatal; the only consequence is that many operations get slower. But
a lack of gc when there are too many packs causes that consequence too
(often a much worse slowdown than would be caused by the missing
bitmap).
So I wonder if it would be better for auto gc to grep gc.log for fatal
errors (as opposed to warnings) and only skip running if any are found.
Alternately, we could simply put warnings into gc.log.warning and
reserve gc.log for fatal errors. I'm not sure which would be simpler.
2. I don't understand what would cause that message. That is, what bad
thing am I doing that I should stop doing? I've briefly skimmed the
code and commit message, but the answer isn't leaping out at me.
^ permalink raw reply
* test failure
From: Ramsay Jones @ 2016-12-16 20:32 UTC (permalink / raw)
To: Lars Schneider; +Cc: Junio C Hamano, GIT Mailing-list
Hi Lars,
For the last two days, I've noticed t0021.15 on the 'pu' branch has been failing intermittently (well it fails with: 'make test >ptest-out', but
when run by hand, it fails only say 1-in-6, 1-in-18, etc.).
[yes, it's a bit strange; this hasn't changed in a couple of weeks!]
I don't have time to investigate further tonight and, since I had not
heard anyone else complain, I thought I should let you know.
See below for the output from a failing run. [Note: this is on Linux
Mint 18, tonight's pu branch @7c7984401].
ATB,
Ramsay Jones
$ ./t0021-conversion.sh -i -v
...
ok 14 - diff does not reuse worktree files that need cleaning
expecting success:
test_config_global filter.protocol.process "rot13-filter.pl clean smudge" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
(
cd repo &&
git init &&
echo "*.r filter=protocol" >.gitattributes &&
git add . &&
git commit -m "test commit 1" &&
git branch empty-branch &&
cp "$TEST_ROOT/test.o" test.r &&
cp "$TEST_ROOT/test2.o" test2.r &&
mkdir testsubdir &&
cp "$TEST_ROOT/test3 'sq',\$x=.o" "testsubdir/test3 'sq',\$x=.r" &&
>test4-empty.r &&
S=$(file_size test.r) &&
S2=$(file_size test2.r) &&
S3=$(file_size "testsubdir/test3 'sq',\$x=.r") &&
filter_git add . &&
cat >expected.log <<-EOF &&
START
init handshake complete
IN: clean test.r $S [OK] -- OUT: $S . [OK]
IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
test_cmp_count expected.log rot13-filter.log &&
filter_git commit -m "test commit 2" &&
cat >expected.log <<-EOF &&
START
init handshake complete
IN: clean test.r $S [OK] -- OUT: $S . [OK]
IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
IN: clean test.r $S [OK] -- OUT: $S . [OK]
IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
test_cmp_count expected.log rot13-filter.log &&
rm -f test2.r "testsubdir/test3 'sq',\$x=.r" &&
filter_git checkout --quiet --no-progress . &&
cat >expected.log <<-EOF &&
START
init handshake complete
IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
IN: smudge testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
test_cmp_exclude_clean expected.log rot13-filter.log &&
filter_git checkout --quiet --no-progress empty-branch &&
cat >expected.log <<-EOF &&
START
init handshake complete
IN: clean test.r $S [OK] -- OUT: $S . [OK]
STOP
EOF
test_cmp_exclude_clean expected.log rot13-filter.log &&
filter_git checkout --quiet --no-progress master &&
cat >expected.log <<-EOF &&
START
init handshake complete
IN: smudge test.r $S [OK] -- OUT: $S . [OK]
IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK]
IN: smudge testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
STOP
EOF
test_cmp_exclude_clean expected.log rot13-filter.log &&
test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&
test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&
test_cmp_committed_rot13 "$TEST_ROOT/test3 'sq',\$x=.o" "testsubdir/test3 'sq',\$x=.r"
)
Initialized empty Git repository in /home/ramsay/git/t/trash directory.t0021-conversion/repo/.git/
[master (root-commit) 56d459b] test commit 1
Author: A U Thor <author@example.com>
1 file changed, 1 insertion(+)
create mode 100644 .gitattributes
[master 9ea74df] test commit 2
Author: A U Thor <author@example.com>
4 files changed, 5 insertions(+)
create mode 100644 test.r
create mode 100644 test2.r
create mode 100644 test4-empty.r
create mode 100644 testsubdir/test3 'sq',$x=.r
sort: cannot read: rot13-filter.log: No such file or directory
--- expected.log 2016-12-16 20:14:29.037426091 +0000
+++ rot13-filter.log 2016-12-16 20:14:29.041426091 +0000
@@ -1,7 +0,0 @@
-x IN: clean test.r 57 [OK] -- OUT: 57 . [OK]
-x IN: clean test2.r 14 [OK] -- OUT: 14 . [OK]
-x IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
-x IN: clean testsubdir/test3 'sq',$x=.r 49 [OK] -- OUT: 49 . [OK]
- 1 START
- 1 STOP
- 1 init handshake complete
not ok 15 - required process filter should filter data
#
# test_config_global filter.protocol.process "rot13-filter.pl clean smudge" &&
# test_config_global filter.protocol.required true &&
# rm -rf repo &&
# mkdir repo &&
# (
# cd repo &&
# git init &&
#
# echo "*.r filter=protocol" >.gitattributes &&
# git add . &&
# git commit -m "test commit 1" &&
# git branch empty-branch &&
#
# cp "$TEST_ROOT/test.o" test.r &&
# cp "$TEST_ROOT/test2.o" test2.r &&
# mkdir testsubdir &&
# cp "$TEST_ROOT/test3 'sq',\$x=.o" "testsubdir/test3 'sq',\$x=.r" &&
# >test4-empty.r &&
#
# S=$(file_size test.r) &&
# S2=$(file_size test2.r) &&
# S3=$(file_size "testsubdir/test3 'sq',\$x=.r") &&
#
# filter_git add . &&
# cat >expected.log <<-EOF &&
# START
# init handshake complete
# IN: clean test.r $S [OK] -- OUT: $S . [OK]
# IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
# IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
# IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
# STOP
# EOF
# test_cmp_count expected.log rot13-filter.log &&
#
# filter_git commit -m "test commit 2" &&
# cat >expected.log <<-EOF &&
# START
# init handshake complete
# IN: clean test.r $S [OK] -- OUT: $S . [OK]
# IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
# IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
# IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
# IN: clean test.r $S [OK] -- OUT: $S . [OK]
# IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
# IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
# IN: clean testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
# STOP
# EOF
# test_cmp_count expected.log rot13-filter.log &&
#
# rm -f test2.r "testsubdir/test3 'sq',\$x=.r" &&
#
# filter_git checkout --quiet --no-progress . &&
# cat >expected.log <<-EOF &&
# START
# init handshake complete
# IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
# IN: smudge testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
# STOP
# EOF
# test_cmp_exclude_clean expected.log rot13-filter.log &&
#
# filter_git checkout --quiet --no-progress empty-branch &&
# cat >expected.log <<-EOF &&
# START
# init handshake complete
# IN: clean test.r $S [OK] -- OUT: $S . [OK]
# STOP
# EOF
# test_cmp_exclude_clean expected.log rot13-filter.log &&
#
# filter_git checkout --quiet --no-progress master &&
# cat >expected.log <<-EOF &&
# START
# init handshake complete
# IN: smudge test.r $S [OK] -- OUT: $S . [OK]
# IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
# IN: smudge test4-empty.r 0 [OK] -- OUT: 0 [OK]
# IN: smudge testsubdir/test3 'sq',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
# STOP
# EOF
# test_cmp_exclude_clean expected.log rot13-filter.log &&
#
# test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&
# test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&
# test_cmp_committed_rot13 "$TEST_ROOT/test3 'sq',\$x=.o" "testsubdir/test3 'sq',\$x=.r"
# )
#
$
^ permalink raw reply
* Re: [PATCH v2 20/34] sequencer (rebase -i): copy commit notes at end
From: Junio C Hamano @ 2016-12-16 19:38 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Kevin Daudt, Dennis Kaarsemaker
In-Reply-To: <a2ea8bd57461fe0969f3df09a0374005f12ac6c7.1481642927.git.johannes.schindelin@gmx.de>
Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> +static void flush_rewritten_pending(void) {
> + struct strbuf buf = STRBUF_INIT;
> + unsigned char newsha1[20];
> + FILE *out;
> +
> + if (strbuf_read_file(&buf, rebase_path_rewritten_pending(), 82) > 0 &&
> + !get_sha1("HEAD", newsha1) &&
> + (out = fopen(rebase_path_rewritten_list(), "a"))) {
An error in fopen() here ...
> + ...
> + }
> + strbuf_release(&buf);
> +}
> +
> +static void record_in_rewritten(struct object_id *oid,
> + enum todo_command next_command) {
> + FILE *out = fopen(rebase_path_rewritten_pending(), "a");
> +
> + if (!out)
> + return;
... and here are ignored as an insignificant error in the scripted
version, and this one does the same.
> +
> + fprintf(out, "%s\n", oid_to_hex(oid));
> + fclose(out);
> +
> + if (!is_fixup(next_command))
> + flush_rewritten_pending();
> +}
> +
> static int do_pick_commit(enum todo_command command, struct commit *commit,
> struct replay_opts *opts, int final_fixup)
> {
> @@ -1750,6 +1797,17 @@ static int is_final_fixup(struct todo_list *todo_list)
> return 1;
> }
>
> +static enum todo_command peek_command(struct todo_list *todo_list, int offset)
> +{
> + int i;
> +
> + for (i = todo_list->current + offset; i < todo_list->nr; i++)
> + if (todo_list->items[i].command != TODO_NOOP)
> + return todo_list->items[i].command;
Makes me wonder, after having commented on 07/34 regarding the fact
that in the end you would end up having three variants of no-op
(i.e. NOOP, DROP and COMMENT), what definition of a "command" this
function uses to return its result, when asked to "peek". I suspect
that this will be updated in a later patch to do "< TODO_NOOP"
instead? If so, then that answers one question in my comment on
07/34, i.e.
If a check for "is it one of the no-op commands?" appears only
here, a single liner comment may be sufficient (but necessary)
to help readers. Otherwise a single-liner helper function
(similar to is_fixup() you have) with a descriptive name would
be better than a single liner comment.
The answer is "no, it is not just there" hence the conclusion is "we
want a helper with a descriptive name".
^ permalink raw reply
* Re: [PATCH v15 08/27] bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C
From: Pranit Bauva @ 2016-12-16 19:35 UTC (permalink / raw)
To: Stephan Beyer; +Cc: Git List
In-Reply-To: <a4c7fec8-0e84-eb53-ca22-c369ce3facfa@gmx.net>
Hey Stephan,
On Thu, Nov 17, 2016 at 5:17 AM, Stephan Beyer <s-beyer@gmx.net> wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index d84ba86..c542e8b 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -123,13 +123,40 @@ static int bisect_reset(const char *commit)
>> return bisect_clean_state();
>> }
>>
>> +static int is_expected_rev(const char *expected_hex)
>> +{
>> + struct strbuf actual_hex = STRBUF_INIT;
>> + int res = 0;
>> + if (strbuf_read_file(&actual_hex, git_path_bisect_expected_rev(), 0) >= 40) {
>> + strbuf_trim(&actual_hex);
>> + res = !strcmp(actual_hex.buf, expected_hex);
>> + }
>> + strbuf_release(&actual_hex);
>> + return res;
>> +}
>
> I am not sure it does what it should.
>
> I would expect the following behavior from this function:
> - file does not exist (or is "broken") => return 0
> - actual_hex != expected_hex => return 0
> - otherwise return 1
>
> If I am not wrong, the code does the following instead:
> - file does not exist (or is "broken") => return 0
> - actual_hex != expected_hex => return 1
> - otherwise => return 0
It seems that I didn't carefully see what the shell code is (or
apparently did a mistake in understanding it ;)). I think the C
version does exactly what the shell version does. Can you confirm it
once again, please?
Regards,
Pranit Bauva
^ permalink raw reply
* Re: [PATCH v2 18/34] sequencer (rebase -i): refactor setting the reflog message
From: Junio C Hamano @ 2016-12-16 19:28 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Kevin Daudt, Dennis Kaarsemaker
In-Reply-To: <6f7ec89ac91ba1aa860a28c27ab93f60099f1ddb.1481642927.git.johannes.schindelin@gmx.de>
Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> This makes the code DRYer, with the obvious benefit that we can enhance
> the code further in a single place.
>
> We can also reuse the functionality elsewhere by calling this new
> function.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> sequencer.c | 33 ++++++++++++++++++++++++++-------
> 1 file changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 33fb36dcbe..d20efad562 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1750,6 +1750,26 @@ static int is_final_fixup(struct todo_list *todo_list)
> return 1;
> }
>
> +static const char *reflog_message(struct replay_opts *opts,
> + const char *sub_action, const char *fmt, ...)
> +{
> + va_list ap;
> + static struct strbuf buf = STRBUF_INIT;
> +
> + va_start(ap, fmt);
> + strbuf_reset(&buf);
> + strbuf_addstr(&buf, action_name(opts));
> + if (sub_action)
> + strbuf_addf(&buf, " (%s)", sub_action);
> + if (fmt) {
> + strbuf_addstr(&buf, ": ");
> + strbuf_vaddf(&buf, fmt, ap);
> + }
> + va_end(ap);
> +
> + return buf.buf;
> +}
It is unlikely for a single caller to want to format another reflog
entry after formating one but before consuming it [*1*], so this
"call this, and you can use the return value until you call it the
next time without worrying about leakage" is quite a reasonable
pattern to employ.
[Footnote]
*1* And it is unlikely that this will run in a multi-threaded
environment, either ;-)
^ permalink raw reply
* Re: [PATCH v2 11/34] sequencer (rebase -i): remove CHERRY_PICK_HEAD when no longer needed
From: Junio C Hamano @ 2016-12-16 19:13 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Kevin Daudt, Dennis Kaarsemaker
In-Reply-To: <81ba5f7ddb3a1a66e878b955094b7ae00f2cd781.1481642927.git.johannes.schindelin@gmx.de>
Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> The scripted version of the interactive rebase already does that.
Sensible. I was wondering why this wasn't there while reviewing
10/34, comparing the two (this is not a suggestion to squash this
into the previous step).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> sequencer.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 855d3ba503..abffaf3b40 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1835,8 +1835,13 @@ static int commit_staged_changes(struct replay_opts *opts)
>
> if (has_unstaged_changes(1))
> return error(_("cannot rebase: You have unstaged changes."));
> - if (!has_uncommitted_changes(0))
> + if (!has_uncommitted_changes(0)) {
> + const char *cherry_pick_head = git_path("CHERRY_PICK_HEAD");
> +
> + if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
> + return error(_("could not remove CHERRY_PICK_HEAD"));
> return 0;
> + }
>
> if (file_exists(rebase_path_amend())) {
> struct strbuf rev = STRBUF_INIT;
^ permalink raw reply
* Re: [PATCH v2 15/34] sequencer (rebase -i): leave a patch upon error
From: Junio C Hamano @ 2016-12-16 19:23 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Kevin Daudt, Dennis Kaarsemaker
In-Reply-To: <091217525a7ff71794b3544680571ce9814a297f.1481642927.git.johannes.schindelin@gmx.de>
Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> When doing an interactive rebase, we want to leave a 'patch' file for
> further inspection by the user (even if we never tried to actually apply
> that patch, since we're cherry-picking instead).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
Yup.
The other day, I was kind of surprised to see the "patch" file
produced when I tried to do "git rebase -i HEAD^^ HEAD" with the one
in current Git (not yours), marked the first one "edit", and then it
gave me control back. Obviously there was no conflict and I could
just do "git show" if I wanted to see what the original change was,
but the "patch" file was there. I personally never have looked at
the "patch" file in such a situation, and I kind of feel it is
wasteful, but I can see people expect to find one there whenever
"rebase -i" stops and gives control back. It is good that you are
preserving the behaviour.
> sequencer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/sequencer.c b/sequencer.c
> index a4e9b326ba..4361fe0e94 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1777,6 +1777,9 @@ static int pick_commits(struct todo_list *todo_list, struct replay_opts *opts)
> return error_failed_squash(item->commit, opts,
> item->arg_len, item->arg);
> }
> + else if (res && is_rebase_i(opts))
> + return res | error_with_patch(item->commit,
> + item->arg, item->arg_len, opts, res, 0);
> }
> else if (item->command == TODO_EXEC) {
> char *end_of_arg = (char *)(item->arg + item->arg_len);
^ permalink raw reply
* Re: [PATCH v2 14/34] sequencer (rebase -i): update refs after a successful rebase
From: Junio C Hamano @ 2016-12-16 19:19 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Kevin Daudt, Dennis Kaarsemaker
In-Reply-To: <596e3cf410a339c3212eea76394fe49be1c05ef8.1481642927.git.johannes.schindelin@gmx.de>
Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> An interactive rebase operates on a detached HEAD (to keep the reflog
> of the original branch relatively clean), and updates the branch only
> at the end.
>
> Now that the sequencer learns to perform interactive rebases, it also
> needs to learn the trick to update the branch before removing the
> directory containing the state of the interactive rebase.
>
> We introduce a new head_ref variable in a wider scope than necessary at
> the moment, to allow for a later patch that prints out "Successfully
> rebased and updated <ref>".
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> sequencer.c | 32 +++++++++++++++++++++++++++++++-
> 1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index a6625e765d..a4e9b326ba 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -100,6 +100,8 @@ static GIT_PATH_FUNC(rebase_path_stopped_sha, "rebase-merge/stopped-sha")
> static GIT_PATH_FUNC(rebase_path_gpg_sign_opt, "rebase-merge/gpg_sign_opt")
> static GIT_PATH_FUNC(rebase_path_orig_head, "rebase-merge/orig-head")
> static GIT_PATH_FUNC(rebase_path_verbose, "rebase-merge/verbose")
> +static GIT_PATH_FUNC(rebase_path_head_name, "rebase-merge/head-name")
> +static GIT_PATH_FUNC(rebase_path_onto, "rebase-merge/onto")
>
> static inline int is_rebase_i(const struct replay_opts *opts)
> {
> @@ -1793,12 +1795,39 @@ static int pick_commits(struct todo_list *todo_list, struct replay_opts *opts)
> }
>
> if (is_rebase_i(opts)) {
> - struct strbuf buf = STRBUF_INIT;
> + struct strbuf head_ref = STRBUF_INIT, buf = STRBUF_INIT;
>
> /* Stopped in the middle, as planned? */
> if (todo_list->current < todo_list->nr)
> return 0;
>
> + if (read_oneliner(&head_ref, rebase_path_head_name(), 0) &&
> + starts_with(head_ref.buf, "refs/")) {
> + unsigned char head[20], orig[20];
> +
> + if (get_sha1("HEAD", head))
> + return error(_("cannot read HEAD"));
> + if (!read_oneliner(&buf, rebase_path_orig_head(), 0) ||
> + get_sha1_hex(buf.buf, orig))
> + return error(_("could not read orig-head"));
> + strbuf_addf(&buf, "rebase -i (finish): %s onto ",
> + head_ref.buf);
> + if (!read_oneliner(&buf, rebase_path_onto(), 0))
> + return error(_("could not read 'onto'"));
> + if (update_ref(buf.buf, head_ref.buf, head, orig,
> + REF_NODEREF, UPDATE_REFS_MSG_ON_ERR))
> + return error(_("could not update %s"),
> + head_ref.buf);
> + strbuf_reset(&buf);
> + strbuf_addf(&buf,
> + "rebase -i (finish): returning to %s",
> + head_ref.buf);
> + if (create_symref("HEAD", head_ref.buf, buf.buf))
> + return error(_("could not update HEAD to %s"),
> + head_ref.buf);
All of the above return error() calls leak head_ref.buf; in addition
some leak buf.buf, too.
> + strbuf_reset(&buf);
> + }
> +
> if (opts->verbose) {
> const char *argv[] = {
> "diff-tree", "--stat", NULL, NULL
> @@ -1813,6 +1842,7 @@ static int pick_commits(struct todo_list *todo_list, struct replay_opts *opts)
> strbuf_reset(&buf);
> }
> strbuf_release(&buf);
> + strbuf_release(&head_ref);
> }
>
> /*
^ permalink raw reply
* Re: [PATCH 1/1] mingw: intercept isatty() to handle /dev/null as Git expects it
From: Junio C Hamano @ 2016-12-16 19:05 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Johannes Schindelin, git, Pranit Bauva
In-Reply-To: <5977e71d-da58-7cb0-bc69-343bb3a1341d@kdbg.org>
Johannes Sixt <j6t@kdbg.org> writes:
> Am 16.12.2016 um 19:08 schrieb Junio C Hamano:
>> Sorry for not having waited for you to chime in before agreeing to
>> fast-track this one, and now this is in 'master'.
>
> No reason to be sorry, things happen... Dscho's request for
> fast-tracking was very reasonable, and the patch looked "obviously
> correct".
Oh, I do agree it was reasonable and looked obviously correct. And
I suspect that it did not make anything worse, either, so there is
not much harm done, other than that I now have to remember not to
merge it down without further fixes to 'maint' and declare the NUL
thing fixed ;-)
> Unfortunately, I'm away from my Windows box over the weekend. It will
> have to wait until Monday before I can test this idea.
Thanks.
^ permalink raw reply
* [PATCH v7 7/7] grep: search history of moved submodules
From: Brandon Williams @ 2016-12-16 19:03 UTC (permalink / raw)
To: git
Cc: peff, sbeller, jonathantanmy, gitster, jacob.keller, j6t,
Brandon Williams
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
If a submodule was renamed at any point since it's inception then if you
were to try and grep on a commit prior to the submodule being moved, you
wouldn't be able to find a working directory for the submodule since the
path in the past is different from the current path.
This patch teaches grep to find the .git directory for a submodule in
the parents .git/modules/ directory in the event the path to the
submodule in the commit that is being searched differs from the state of
the currently checked out commit. If found, the child process that is
spawned to grep the submodule will chdir into its gitdir instead of a
working directory.
In order to override the explicit setting of submodule child process's
gitdir environment variable (which was introduced in '10f5c526')
`GIT_DIR_ENVIORMENT` needs to be pushed onto child process's env_array.
This allows the searching of history from a submodule's gitdir, rather
than from a working directory.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
builtin/grep.c | 20 +++++++++++++++++--
t/t7814-grep-recurse-submodules.sh | 41 ++++++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+), 2 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index 5918a26..2c727ef 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -547,6 +547,7 @@ static int grep_submodule_launch(struct grep_opt *opt,
name = gs->name;
prepare_submodule_repo_env(&cp.env_array);
+ argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
/* Add super prefix */
argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
@@ -615,8 +616,23 @@ static int grep_submodule(struct grep_opt *opt, const unsigned char *sha1,
{
if (!is_submodule_initialized(path))
return 0;
- if (!is_submodule_populated(path))
- return 0;
+ if (!is_submodule_populated(path)) {
+ /*
+ * If searching history, check for the presense of the
+ * submodule's gitdir before skipping the submodule.
+ */
+ if (sha1) {
+ const struct submodule *sub =
+ submodule_from_path(null_sha1, path);
+ if (sub)
+ path = git_path("modules/%s", sub->name);
+
+ if (!(is_directory(path) && is_git_directory(path)))
+ return 0;
+ } else {
+ return 0;
+ }
+ }
#ifndef NO_PTHREADS
if (num_threads) {
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index d5fc316..67247a0 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -186,6 +186,47 @@ test_expect_success !MINGW 'grep recurse submodule colon in name' '
test_cmp expect actual
'
+test_expect_success 'grep history with moved submoules' '
+ git init parent &&
+ test_when_finished "rm -rf parent" &&
+ echo "foobar" >parent/file &&
+ git -C parent add file &&
+ git -C parent commit -m "add file" &&
+
+ git init sub &&
+ test_when_finished "rm -rf sub" &&
+ echo "foobar" >sub/file &&
+ git -C sub add file &&
+ git -C sub commit -m "add file" &&
+
+ git -C parent submodule add ../sub dir/sub &&
+ git -C parent commit -m "add submodule" &&
+
+ cat >expect <<-\EOF &&
+ dir/sub/file:foobar
+ file:foobar
+ EOF
+ git -C parent grep -e "foobar" --recurse-submodules >actual &&
+ test_cmp expect actual &&
+
+ git -C parent mv dir/sub sub-moved &&
+ git -C parent commit -m "moved submodule" &&
+
+ cat >expect <<-\EOF &&
+ file:foobar
+ sub-moved/file:foobar
+ EOF
+ git -C parent grep -e "foobar" --recurse-submodules >actual &&
+ test_cmp expect actual &&
+
+ cat >expect <<-\EOF &&
+ HEAD^:dir/sub/file:foobar
+ HEAD^:file:foobar
+ EOF
+ git -C parent grep -e "foobar" --recurse-submodules HEAD^ >actual &&
+ test_cmp expect actual
+'
+
test_incompatible_with_recurse_submodules ()
{
test_expect_success "--recurse-submodules and $1 are incompatible" "
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
* [PATCH v7 4/7] grep: add submodules as a grep source type
From: Brandon Williams @ 2016-12-16 19:03 UTC (permalink / raw)
To: git
Cc: peff, sbeller, jonathantanmy, gitster, jacob.keller, j6t,
Brandon Williams
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
Add `GREP_SOURCE_SUBMODULE` as a grep_source type and cases for this new
type in the various switch statements in grep.c.
When initializing a grep_source with type `GREP_SOURCE_SUBMODULE` the
identifier can either be NULL (to indicate that the working tree will be
used) or a SHA1 (the REV of the submodule to be grep'd). If the
identifier is a SHA1 then we want to fall through to the
`GREP_SOURCE_SHA1` case to handle the copying of the SHA1.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
grep.c | 16 +++++++++++++++-
grep.h | 1 +
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/grep.c b/grep.c
index 1194d35..0dbdc1d 100644
--- a/grep.c
+++ b/grep.c
@@ -1735,12 +1735,23 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
case GREP_SOURCE_FILE:
gs->identifier = xstrdup(identifier);
break;
+ case GREP_SOURCE_SUBMODULE:
+ if (!identifier) {
+ gs->identifier = NULL;
+ break;
+ }
+ /*
+ * FALL THROUGH
+ * If the identifier is non-NULL (in the submodule case) it
+ * will be a SHA1 that needs to be copied.
+ */
case GREP_SOURCE_SHA1:
gs->identifier = xmalloc(20);
hashcpy(gs->identifier, identifier);
break;
case GREP_SOURCE_BUF:
gs->identifier = NULL;
+ break;
}
}
@@ -1760,6 +1771,7 @@ void grep_source_clear_data(struct grep_source *gs)
switch (gs->type) {
case GREP_SOURCE_FILE:
case GREP_SOURCE_SHA1:
+ case GREP_SOURCE_SUBMODULE:
free(gs->buf);
gs->buf = NULL;
gs->size = 0;
@@ -1831,8 +1843,10 @@ static int grep_source_load(struct grep_source *gs)
return grep_source_load_sha1(gs);
case GREP_SOURCE_BUF:
return gs->buf ? 0 : -1;
+ case GREP_SOURCE_SUBMODULE:
+ break;
}
- die("BUG: invalid grep_source type");
+ die("BUG: invalid grep_source type to load");
}
void grep_source_load_driver(struct grep_source *gs)
diff --git a/grep.h b/grep.h
index 5856a23..267534c 100644
--- a/grep.h
+++ b/grep.h
@@ -161,6 +161,7 @@ struct grep_source {
GREP_SOURCE_SHA1,
GREP_SOURCE_FILE,
GREP_SOURCE_BUF,
+ GREP_SOURCE_SUBMODULE,
} type;
void *identifier;
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
* [PATCH v7 6/7] grep: enable recurse-submodules to work on <tree> objects
From: Brandon Williams @ 2016-12-16 19:03 UTC (permalink / raw)
To: git
Cc: peff, sbeller, jonathantanmy, gitster, jacob.keller, j6t,
Brandon Williams
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
Teach grep to recursively search in submodules when provided with a
<tree> object. This allows grep to search a submodule based on the state
of the submodule that is present in a commit of the super project.
When grep is provided with a <tree> object, the name of the object is
prefixed to all output. In order to provide uniformity of output
between the parent and child processes the option `--parent-basename`
has been added so that the child can preface all of it's output with the
name of the parent's object instead of the name of the commit SHA1 of
the submodule. This changes output from the command
`git grep -e. -l --recurse-submodules HEAD`
from:
HEAD:file
<commit sha1 of submodule>:sub/file
to:
HEAD:file
HEAD:sub/file
Signed-off-by: Brandon Williams <bmwill@google.com>
---
Documentation/git-grep.txt | 13 ++++-
builtin/grep.c | 76 ++++++++++++++++++++++++---
t/t7814-grep-recurse-submodules.sh | 103 ++++++++++++++++++++++++++++++++++++-
tree-walk.c | 28 ++++++++++
4 files changed, 211 insertions(+), 9 deletions(-)
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 17aa1ba..71f32f3 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -26,7 +26,7 @@ SYNOPSIS
[--threads <num>]
[-f <file>] [-e] <pattern>
[--and|--or|--not|(|)|-e <pattern>...]
- [--recurse-submodules]
+ [--recurse-submodules] [--parent-basename <basename>]
[ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | <tree>...]
[--] [<pathspec>...]
@@ -91,7 +91,16 @@ OPTIONS
--recurse-submodules::
Recursively search in each submodule that has been initialized and
- checked out in the repository.
+ checked out in the repository. When used in combination with the
+ <tree> option the prefix of all submodule output will be the name of
+ the parent project's <tree> object.
+
+--parent-basename <basename>::
+ For internal use only. In order to produce uniform output with the
+ --recurse-submodules option, this option can be used to provide the
+ basename of a parent's <tree> object to a submodule so the submodule
+ can prefix its output with the parent's name rather than the SHA1 of
+ the submodule.
-a::
--text::
diff --git a/builtin/grep.c b/builtin/grep.c
index dca0be6..5918a26 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -19,6 +19,7 @@
#include "dir.h"
#include "pathspec.h"
#include "submodule.h"
+#include "submodule-config.h"
static char const * const grep_usage[] = {
N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
@@ -28,6 +29,7 @@ static char const * const grep_usage[] = {
static const char *super_prefix;
static int recurse_submodules;
static struct argv_array submodule_options = ARGV_ARRAY_INIT;
+static const char *parent_basename;
static int grep_submodule_launch(struct grep_opt *opt,
const struct grep_source *gs);
@@ -534,19 +536,53 @@ static int grep_submodule_launch(struct grep_opt *opt,
{
struct child_process cp = CHILD_PROCESS_INIT;
int status, i;
+ const char *end_of_base;
+ const char *name;
struct work_item *w = opt->output_priv;
+ end_of_base = strchr(gs->name, ':');
+ if (gs->identifier && end_of_base)
+ name = end_of_base + 1;
+ else
+ name = gs->name;
+
prepare_submodule_repo_env(&cp.env_array);
/* Add super prefix */
argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
super_prefix ? super_prefix : "",
- gs->name);
+ name);
argv_array_push(&cp.args, "grep");
+ /*
+ * Add basename of parent project
+ * When performing grep on a tree object the filename is prefixed
+ * with the object's name: 'tree-name:filename'. In order to
+ * provide uniformity of output we want to pass the name of the
+ * parent project's object name to the submodule so the submodule can
+ * prefix its output with the parent's name and not its own SHA1.
+ */
+ if (gs->identifier && end_of_base)
+ argv_array_pushf(&cp.args, "--parent-basename=%.*s",
+ (int) (end_of_base - gs->name),
+ gs->name);
+
/* Add options */
- for (i = 0; i < submodule_options.argc; i++)
+ for (i = 0; i < submodule_options.argc; i++) {
+ /*
+ * If there is a tree identifier for the submodule, add the
+ * rev after adding the submodule options but before the
+ * pathspecs. To do this we listen for the '--' and insert the
+ * sha1 before pushing the '--' onto the child process argv
+ * array.
+ */
+ if (gs->identifier &&
+ !strcmp("--", submodule_options.argv[i])) {
+ argv_array_push(&cp.args, sha1_to_hex(gs->identifier));
+ }
+
argv_array_push(&cp.args, submodule_options.argv[i]);
+ }
cp.git_cmd = 1;
cp.dir = gs->path;
@@ -673,12 +709,22 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
enum interesting match = entry_not_interesting;
struct name_entry entry;
int old_baselen = base->len;
+ struct strbuf name = STRBUF_INIT;
+ int name_base_len = 0;
+ if (super_prefix) {
+ strbuf_addstr(&name, super_prefix);
+ name_base_len = name.len;
+ }
while (tree_entry(tree, &entry)) {
int te_len = tree_entry_len(&entry);
if (match != all_entries_interesting) {
- match = tree_entry_interesting(&entry, base, tn_len, pathspec);
+ strbuf_addstr(&name, base->buf + tn_len);
+ match = tree_entry_interesting(&entry, &name,
+ 0, pathspec);
+ strbuf_setlen(&name, name_base_len);
+
if (match == all_entries_not_interesting)
break;
if (match == entry_not_interesting)
@@ -690,8 +736,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
if (S_ISREG(entry.mode)) {
hit |= grep_sha1(opt, entry.oid->hash, base->buf, tn_len,
check_attr ? base->buf + tn_len : NULL);
- }
- else if (S_ISDIR(entry.mode)) {
+ } else if (S_ISDIR(entry.mode)) {
enum object_type type;
struct tree_desc sub;
void *data;
@@ -707,12 +752,18 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
check_attr);
free(data);
+ } else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
+ hit |= grep_submodule(opt, entry.oid->hash, base->buf,
+ base->buf + tn_len);
}
+
strbuf_setlen(base, old_baselen);
if (hit && opt->status_only)
break;
}
+
+ strbuf_release(&name);
return hit;
}
@@ -736,6 +787,10 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
if (!data)
die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
+ /* Use parent's name as base when recursing submodules */
+ if (recurse_submodules && parent_basename)
+ name = parent_basename;
+
len = name ? strlen(name) : 0;
strbuf_init(&base, PATH_MAX + len + 1);
if (len) {
@@ -762,6 +817,12 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
for (i = 0; i < nr; i++) {
struct object *real_obj;
real_obj = deref_tag(list->objects[i].item, NULL, 0);
+
+ /* load the gitmodules file for this rev */
+ if (recurse_submodules) {
+ submodule_free();
+ gitmodules_config_sha1(real_obj->oid.hash);
+ }
if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
hit = 1;
if (opt->status_only)
@@ -902,6 +963,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
N_("ignore files specified via '.gitignore'"), 1),
OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
N_("recursivley search in each submodule")),
+ OPT_STRING(0, "parent-basename", &parent_basename,
+ N_("basename"),
+ N_("prepend parent project's basename to output")),
OPT_GROUP(""),
OPT_BOOL('v', "invert-match", &opt.invert,
N_("show non-matching lines")),
@@ -1154,7 +1218,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
}
}
- if (recurse_submodules && (!use_index || untracked || list.nr))
+ if (recurse_submodules && (!use_index || untracked))
die(_("option not supported with --recurse-submodules."));
if (!show_in_pager && !opt.status_only)
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 1019125..d5fc316 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -84,6 +84,108 @@ test_expect_success 'grep and multiple patterns' '
test_cmp expect actual
'
+test_expect_success 'basic grep tree' '
+ cat >expect <<-\EOF &&
+ HEAD:a:foobar
+ HEAD:b/b:bar
+ HEAD:submodule/a:foobar
+ HEAD:submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree HEAD^' '
+ cat >expect <<-\EOF &&
+ HEAD^:a:foobar
+ HEAD^:b/b:bar
+ HEAD^:submodule/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD^ >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree HEAD^^' '
+ cat >expect <<-\EOF &&
+ HEAD^^:a:foobar
+ HEAD^^:b/b:bar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD^^ >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree and pathspecs' '
+ cat >expect <<-\EOF &&
+ HEAD:submodule/a:foobar
+ HEAD:submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD -- submodule >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree and pathspecs' '
+ cat >expect <<-\EOF &&
+ HEAD:submodule/a:foobar
+ HEAD:submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD -- "submodule*a" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree and more pathspecs' '
+ cat >expect <<-\EOF &&
+ HEAD:submodule/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD -- "submodul?/a" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep tree and more pathspecs' '
+ cat >expect <<-\EOF &&
+ HEAD:submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules HEAD -- "submodul*/sub/a" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success !MINGW 'grep recurse submodule colon in name' '
+ git init parent &&
+ test_when_finished "rm -rf parent" &&
+ echo "foobar" >"parent/fi:le" &&
+ git -C parent add "fi:le" &&
+ git -C parent commit -m "add fi:le" &&
+
+ git init "su:b" &&
+ test_when_finished "rm -rf su:b" &&
+ echo "foobar" >"su:b/fi:le" &&
+ git -C "su:b" add "fi:le" &&
+ git -C "su:b" commit -m "add fi:le" &&
+
+ git -C parent submodule add "../su:b" "su:b" &&
+ git -C parent commit -m "add submodule" &&
+
+ cat >expect <<-\EOF &&
+ fi:le:foobar
+ su:b/fi:le:foobar
+ EOF
+ git -C parent grep -e "foobar" --recurse-submodules >actual &&
+ test_cmp expect actual &&
+
+ cat >expect <<-\EOF &&
+ HEAD:fi:le:foobar
+ HEAD:su:b/fi:le:foobar
+ EOF
+ git -C parent grep -e "foobar" --recurse-submodules HEAD >actual &&
+ test_cmp expect actual
+'
+
test_incompatible_with_recurse_submodules ()
{
test_expect_success "--recurse-submodules and $1 are incompatible" "
@@ -94,6 +196,5 @@ test_incompatible_with_recurse_submodules ()
test_incompatible_with_recurse_submodules --untracked
test_incompatible_with_recurse_submodules --no-index
-test_incompatible_with_recurse_submodules HEAD
test_done
diff --git a/tree-walk.c b/tree-walk.c
index 828f435..ff77605 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -1004,6 +1004,19 @@ static enum interesting do_match(const struct name_entry *entry,
*/
if (ps->recursive && S_ISDIR(entry->mode))
return entry_interesting;
+
+ /*
+ * When matching against submodules with
+ * wildcard characters, ensure that the entry
+ * at least matches up to the first wild
+ * character. More accurate matching can then
+ * be performed in the submodule itself.
+ */
+ if (ps->recursive && S_ISGITLINK(entry->mode) &&
+ !ps_strncmp(item, match + baselen,
+ entry->path,
+ item->nowildcard_len - baselen))
+ return entry_interesting;
}
continue;
@@ -1040,6 +1053,21 @@ static enum interesting do_match(const struct name_entry *entry,
strbuf_setlen(base, base_offset + baselen);
return entry_interesting;
}
+
+ /*
+ * When matching against submodules with
+ * wildcard characters, ensure that the entry
+ * at least matches up to the first wild
+ * character. More accurate matching can then
+ * be performed in the submodule itself.
+ */
+ if (ps->recursive && S_ISGITLINK(entry->mode) &&
+ !ps_strncmp(item, match, base->buf + base_offset,
+ item->nowildcard_len)) {
+ strbuf_setlen(base, base_offset + baselen);
+ return entry_interesting;
+ }
+
strbuf_setlen(base, base_offset + baselen);
/*
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
* [PATCH v7 5/7] grep: optionally recurse into submodules
From: Brandon Williams @ 2016-12-16 19:03 UTC (permalink / raw)
To: git
Cc: peff, sbeller, jonathantanmy, gitster, jacob.keller, j6t,
Brandon Williams
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
Allow grep to recognize submodules and recursively search for patterns in
each submodule. This is done by forking off a process to recursively
call grep on each submodule. The top level --super-prefix option is
used to pass a path to the submodule which can in turn be used to
prepend to output or in pathspec matching logic.
Recursion only occurs for submodules which have been initialized and
checked out by the parent project. If a submodule hasn't been
initialized and checked out it is simply skipped.
In order to support the existing multi-threading infrastructure in grep,
output from each child process is captured in a strbuf so that it can be
later printed to the console in an ordered fashion.
To limit the number of theads that are created, each child process has
half the number of threads as its parents (minimum of 1), otherwise we
potentailly have a fork-bomb.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
Documentation/git-grep.txt | 5 +
builtin/grep.c | 300 ++++++++++++++++++++++++++++++++++---
git.c | 2 +-
t/t7814-grep-recurse-submodules.sh | 99 ++++++++++++
4 files changed, 386 insertions(+), 20 deletions(-)
create mode 100755 t/t7814-grep-recurse-submodules.sh
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 0ecea6e..17aa1ba 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -26,6 +26,7 @@ SYNOPSIS
[--threads <num>]
[-f <file>] [-e] <pattern>
[--and|--or|--not|(|)|-e <pattern>...]
+ [--recurse-submodules]
[ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | <tree>...]
[--] [<pathspec>...]
@@ -88,6 +89,10 @@ OPTIONS
mechanism. Only useful when searching files in the current directory
with `--no-index`.
+--recurse-submodules::
+ Recursively search in each submodule that has been initialized and
+ checked out in the repository.
+
-a::
--text::
Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index 8887b6a..dca0be6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -18,12 +18,20 @@
#include "quote.h"
#include "dir.h"
#include "pathspec.h"
+#include "submodule.h"
static char const * const grep_usage[] = {
N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
NULL
};
+static const char *super_prefix;
+static int recurse_submodules;
+static struct argv_array submodule_options = ARGV_ARRAY_INIT;
+
+static int grep_submodule_launch(struct grep_opt *opt,
+ const struct grep_source *gs);
+
#define GREP_NUM_THREADS_DEFAULT 8
static int num_threads;
@@ -174,7 +182,10 @@ static void *run(void *arg)
break;
opt->output_priv = w;
- hit |= grep_source(opt, &w->source);
+ if (w->source.type == GREP_SOURCE_SUBMODULE)
+ hit |= grep_submodule_launch(opt, &w->source);
+ else
+ hit |= grep_source(opt, &w->source);
grep_source_clear_data(&w->source);
work_done(w);
}
@@ -300,6 +311,10 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1,
if (opt->relative && opt->prefix_length) {
quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
strbuf_insert(&pathbuf, 0, filename, tree_name_len);
+ } else if (super_prefix) {
+ strbuf_add(&pathbuf, filename, tree_name_len);
+ strbuf_addstr(&pathbuf, super_prefix);
+ strbuf_addstr(&pathbuf, filename + tree_name_len);
} else {
strbuf_addstr(&pathbuf, filename);
}
@@ -328,10 +343,13 @@ static int grep_file(struct grep_opt *opt, const char *filename)
{
struct strbuf buf = STRBUF_INIT;
- if (opt->relative && opt->prefix_length)
+ if (opt->relative && opt->prefix_length) {
quote_path_relative(filename, opt->prefix, &buf);
- else
+ } else {
+ if (super_prefix)
+ strbuf_addstr(&buf, super_prefix);
strbuf_addstr(&buf, filename);
+ }
#ifndef NO_PTHREADS
if (num_threads) {
@@ -378,31 +396,260 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
exit(status);
}
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int cached)
+static void compile_submodule_options(const struct grep_opt *opt,
+ const struct pathspec *pathspec,
+ int cached, int untracked,
+ int opt_exclude, int use_index,
+ int pattern_type_arg)
+{
+ struct grep_pat *pattern;
+ int i;
+
+ if (recurse_submodules)
+ argv_array_push(&submodule_options, "--recurse-submodules");
+
+ if (cached)
+ argv_array_push(&submodule_options, "--cached");
+ if (!use_index)
+ argv_array_push(&submodule_options, "--no-index");
+ if (untracked)
+ argv_array_push(&submodule_options, "--untracked");
+ if (opt_exclude > 0)
+ argv_array_push(&submodule_options, "--exclude-standard");
+
+ if (opt->invert)
+ argv_array_push(&submodule_options, "-v");
+ if (opt->ignore_case)
+ argv_array_push(&submodule_options, "-i");
+ if (opt->word_regexp)
+ argv_array_push(&submodule_options, "-w");
+ switch (opt->binary) {
+ case GREP_BINARY_NOMATCH:
+ argv_array_push(&submodule_options, "-I");
+ break;
+ case GREP_BINARY_TEXT:
+ argv_array_push(&submodule_options, "-a");
+ break;
+ default:
+ break;
+ }
+ if (opt->allow_textconv)
+ argv_array_push(&submodule_options, "--textconv");
+ if (opt->max_depth != -1)
+ argv_array_pushf(&submodule_options, "--max-depth=%d",
+ opt->max_depth);
+ if (opt->linenum)
+ argv_array_push(&submodule_options, "-n");
+ if (!opt->pathname)
+ argv_array_push(&submodule_options, "-h");
+ if (!opt->relative)
+ argv_array_push(&submodule_options, "--full-name");
+ if (opt->name_only)
+ argv_array_push(&submodule_options, "-l");
+ if (opt->unmatch_name_only)
+ argv_array_push(&submodule_options, "-L");
+ if (opt->null_following_name)
+ argv_array_push(&submodule_options, "-z");
+ if (opt->count)
+ argv_array_push(&submodule_options, "-c");
+ if (opt->file_break)
+ argv_array_push(&submodule_options, "--break");
+ if (opt->heading)
+ argv_array_push(&submodule_options, "--heading");
+ if (opt->pre_context)
+ argv_array_pushf(&submodule_options, "--before-context=%d",
+ opt->pre_context);
+ if (opt->post_context)
+ argv_array_pushf(&submodule_options, "--after-context=%d",
+ opt->post_context);
+ if (opt->funcname)
+ argv_array_push(&submodule_options, "-p");
+ if (opt->funcbody)
+ argv_array_push(&submodule_options, "-W");
+ if (opt->all_match)
+ argv_array_push(&submodule_options, "--all-match");
+ if (opt->debug)
+ argv_array_push(&submodule_options, "--debug");
+ if (opt->status_only)
+ argv_array_push(&submodule_options, "-q");
+
+ switch (pattern_type_arg) {
+ case GREP_PATTERN_TYPE_BRE:
+ argv_array_push(&submodule_options, "-G");
+ break;
+ case GREP_PATTERN_TYPE_ERE:
+ argv_array_push(&submodule_options, "-E");
+ break;
+ case GREP_PATTERN_TYPE_FIXED:
+ argv_array_push(&submodule_options, "-F");
+ break;
+ case GREP_PATTERN_TYPE_PCRE:
+ argv_array_push(&submodule_options, "-P");
+ break;
+ case GREP_PATTERN_TYPE_UNSPECIFIED:
+ break;
+ }
+
+ for (pattern = opt->pattern_list; pattern != NULL;
+ pattern = pattern->next) {
+ switch (pattern->token) {
+ case GREP_PATTERN:
+ argv_array_pushf(&submodule_options, "-e%s",
+ pattern->pattern);
+ break;
+ case GREP_AND:
+ case GREP_OPEN_PAREN:
+ case GREP_CLOSE_PAREN:
+ case GREP_NOT:
+ case GREP_OR:
+ argv_array_push(&submodule_options, pattern->pattern);
+ break;
+ /* BODY and HEAD are not used by git-grep */
+ case GREP_PATTERN_BODY:
+ case GREP_PATTERN_HEAD:
+ break;
+ }
+ }
+
+ /*
+ * Limit number of threads for child process to use.
+ * This is to prevent potential fork-bomb behavior of git-grep as each
+ * submodule process has its own thread pool.
+ */
+ argv_array_pushf(&submodule_options, "--threads=%d",
+ (num_threads + 1) / 2);
+
+ /* Add Pathspecs */
+ argv_array_push(&submodule_options, "--");
+ for (i = 0; i < pathspec->nr; i++)
+ argv_array_push(&submodule_options,
+ pathspec->items[i].original);
+}
+
+/*
+ * Launch child process to grep contents of a submodule
+ */
+static int grep_submodule_launch(struct grep_opt *opt,
+ const struct grep_source *gs)
+{
+ struct child_process cp = CHILD_PROCESS_INIT;
+ int status, i;
+ struct work_item *w = opt->output_priv;
+
+ prepare_submodule_repo_env(&cp.env_array);
+
+ /* Add super prefix */
+ argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
+ super_prefix ? super_prefix : "",
+ gs->name);
+ argv_array_push(&cp.args, "grep");
+
+ /* Add options */
+ for (i = 0; i < submodule_options.argc; i++)
+ argv_array_push(&cp.args, submodule_options.argv[i]);
+
+ cp.git_cmd = 1;
+ cp.dir = gs->path;
+
+ /*
+ * Capture output to output buffer and check the return code from the
+ * child process. A '0' indicates a hit, a '1' indicates no hit and
+ * anything else is an error.
+ */
+ status = capture_command(&cp, &w->out, 0);
+ if (status && (status != 1)) {
+ /* flush the buffer */
+ write_or_die(1, w->out.buf, w->out.len);
+ die("process for submodule '%s' failed with exit code: %d",
+ gs->name, status);
+ }
+
+ /* invert the return code to make a hit equal to 1 */
+ return !status;
+}
+
+/*
+ * Prep grep structures for a submodule grep
+ * sha1: the sha1 of the submodule or NULL if using the working tree
+ * filename: name of the submodule including tree name of parent
+ * path: location of the submodule
+ */
+static int grep_submodule(struct grep_opt *opt, const unsigned char *sha1,
+ const char *filename, const char *path)
+{
+ if (!is_submodule_initialized(path))
+ return 0;
+ if (!is_submodule_populated(path))
+ return 0;
+
+#ifndef NO_PTHREADS
+ if (num_threads) {
+ add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, sha1);
+ return 0;
+ } else
+#endif
+ {
+ struct work_item w;
+ int hit;
+
+ grep_source_init(&w.source, GREP_SOURCE_SUBMODULE,
+ filename, path, sha1);
+ strbuf_init(&w.out, 0);
+ opt->output_priv = &w;
+ hit = grep_submodule_launch(opt, &w.source);
+
+ write_or_die(1, w.out.buf, w.out.len);
+
+ grep_source_clear(&w.source);
+ strbuf_release(&w.out);
+ return hit;
+ }
+}
+
+static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
+ int cached)
{
int hit = 0;
int nr;
+ struct strbuf name = STRBUF_INIT;
+ int name_base_len = 0;
+ if (super_prefix) {
+ name_base_len = strlen(super_prefix);
+ strbuf_addstr(&name, super_prefix);
+ }
+
read_cache();
for (nr = 0; nr < active_nr; nr++) {
const struct cache_entry *ce = active_cache[nr];
- if (!S_ISREG(ce->ce_mode))
- continue;
- if (!ce_path_match(ce, pathspec, NULL))
+ strbuf_setlen(&name, name_base_len);
+ strbuf_addstr(&name, ce->name);
+
+ if (S_ISREG(ce->ce_mode) &&
+ match_pathspec(pathspec, name.buf, name.len, 0, NULL,
+ S_ISDIR(ce->ce_mode) ||
+ S_ISGITLINK(ce->ce_mode))) {
+ /*
+ * If CE_VALID is on, we assume worktree file and its
+ * cache entry are identical, even if worktree file has
+ * been modified, so use cache version instead
+ */
+ if (cached || (ce->ce_flags & CE_VALID) ||
+ ce_skip_worktree(ce)) {
+ if (ce_stage(ce) || ce_intent_to_add(ce))
+ continue;
+ hit |= grep_sha1(opt, ce->oid.hash, ce->name,
+ 0, ce->name);
+ } else {
+ hit |= grep_file(opt, ce->name);
+ }
+ } else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
+ submodule_path_match(pathspec, name.buf, NULL)) {
+ hit |= grep_submodule(opt, NULL, ce->name, ce->name);
+ } else {
continue;
- /*
- * If CE_VALID is on, we assume worktree file and its cache entry
- * are identical, even if worktree file has been modified, so use
- * cache version instead
- */
- if (cached || (ce->ce_flags & CE_VALID) || ce_skip_worktree(ce)) {
- if (ce_stage(ce) || ce_intent_to_add(ce))
- continue;
- hit |= grep_sha1(opt, ce->oid.hash, ce->name, 0,
- ce->name);
}
- else
- hit |= grep_file(opt, ce->name);
+
if (ce_stage(ce)) {
do {
nr++;
@@ -413,6 +660,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int
if (hit && opt->status_only)
break;
}
+
+ strbuf_release(&name);
return hit;
}
@@ -651,6 +900,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
N_("search in both tracked and untracked files")),
OPT_SET_INT(0, "exclude-standard", &opt_exclude,
N_("ignore files specified via '.gitignore'"), 1),
+ OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
+ N_("recursivley search in each submodule")),
OPT_GROUP(""),
OPT_BOOL('v', "invert-match", &opt.invert,
N_("show non-matching lines")),
@@ -755,6 +1006,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
init_grep_defaults();
git_config(grep_cmd_config, NULL);
grep_init(&opt, prefix);
+ super_prefix = get_super_prefix();
/*
* If there is no -- then the paths must exist in the working
@@ -872,6 +1124,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
pathspec.max_depth = opt.max_depth;
pathspec.recursive = 1;
+ if (recurse_submodules) {
+ gitmodules_config();
+ compile_submodule_options(&opt, &pathspec, cached, untracked,
+ opt_exclude, use_index,
+ pattern_type_arg);
+ }
+
if (show_in_pager && (cached || list.nr))
die(_("--open-files-in-pager only works on the worktree"));
@@ -895,6 +1154,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
}
}
+ if (recurse_submodules && (!use_index || untracked || list.nr))
+ die(_("option not supported with --recurse-submodules."));
+
if (!show_in_pager && !opt.status_only)
setup_pager();
diff --git a/git.c b/git.c
index dce529f..c95d3e3 100644
--- a/git.c
+++ b/git.c
@@ -434,7 +434,7 @@ static struct cmd_struct commands[] = {
{ "fsck-objects", cmd_fsck, RUN_SETUP },
{ "gc", cmd_gc, RUN_SETUP },
{ "get-tar-commit-id", cmd_get_tar_commit_id },
- { "grep", cmd_grep, RUN_SETUP_GENTLY },
+ { "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
{ "hash-object", cmd_hash_object },
{ "help", cmd_help },
{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
new file mode 100755
index 0000000..1019125
--- /dev/null
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -0,0 +1,99 @@
+#!/bin/sh
+
+test_description='Test grep recurse-submodules feature
+
+This test verifies the recurse-submodules feature correctly greps across
+submodules.
+'
+
+. ./test-lib.sh
+
+test_expect_success 'setup directory structure and submodule' '
+ echo "foobar" >a &&
+ mkdir b &&
+ echo "bar" >b/b &&
+ git add a b &&
+ git commit -m "add a and b" &&
+ git init submodule &&
+ echo "foobar" >submodule/a &&
+ git -C submodule add a &&
+ git -C submodule commit -m "add a" &&
+ git submodule add ./submodule &&
+ git commit -m "added submodule"
+'
+
+test_expect_success 'grep correctly finds patterns in a submodule' '
+ cat >expect <<-\EOF &&
+ a:foobar
+ b/b:bar
+ submodule/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep and basic pathspecs' '
+ cat >expect <<-\EOF &&
+ submodule/a:foobar
+ EOF
+
+ git grep -e. --recurse-submodules -- submodule >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep and nested submodules' '
+ git init submodule/sub &&
+ echo "foobar" >submodule/sub/a &&
+ git -C submodule/sub add a &&
+ git -C submodule/sub commit -m "add a" &&
+ git -C submodule submodule add ./sub &&
+ git -C submodule add sub &&
+ git -C submodule commit -m "added sub" &&
+ git add submodule &&
+ git commit -m "updated submodule" &&
+
+ cat >expect <<-\EOF &&
+ a:foobar
+ b/b:bar
+ submodule/a:foobar
+ submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --recurse-submodules >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep and multiple patterns' '
+ cat >expect <<-\EOF &&
+ a:foobar
+ submodule/a:foobar
+ submodule/sub/a:foobar
+ EOF
+
+ git grep -e "bar" --and -e "foo" --recurse-submodules >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'grep and multiple patterns' '
+ cat >expect <<-\EOF &&
+ b/b:bar
+ EOF
+
+ git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual &&
+ test_cmp expect actual
+'
+
+test_incompatible_with_recurse_submodules ()
+{
+ test_expect_success "--recurse-submodules and $1 are incompatible" "
+ test_must_fail git grep -e. --recurse-submodules $1 2>actual &&
+ test_i18ngrep 'not supported with --recurse-submodules' actual
+ "
+}
+
+test_incompatible_with_recurse_submodules --untracked
+test_incompatible_with_recurse_submodules --no-index
+test_incompatible_with_recurse_submodules HEAD
+
+test_done
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
* [PATCH v7 2/7] submodules: add helper to determine if a submodule is initialized
From: Brandon Williams @ 2016-12-16 19:03 UTC (permalink / raw)
To: git
Cc: peff, sbeller, jonathantanmy, gitster, jacob.keller, j6t,
Brandon Williams
In-Reply-To: <1481915002-162130-1-git-send-email-bmwill@google.com>
Add the `is_submodule_initialized()` helper function to submodules.c.
`is_submodule_initialized()` performs a check to determine if the
submodule at the given path has been initialized.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
submodule.c | 23 +++++++++++++++++++++++
submodule.h | 1 +
2 files changed, 24 insertions(+)
diff --git a/submodule.c b/submodule.c
index ee3198d..edffaa1 100644
--- a/submodule.c
+++ b/submodule.c
@@ -199,6 +199,29 @@ void gitmodules_config(void)
}
/*
+ * Determine if a submodule has been initialized at a given 'path'
+ */
+int is_submodule_initialized(const char *path)
+{
+ int ret = 0;
+ const struct submodule *module = NULL;
+
+ module = submodule_from_path(null_sha1, path);
+
+ if (module) {
+ char *key = xstrfmt("submodule.%s.url", module->name);
+ char *value = NULL;
+
+ ret = !git_config_get_string(key, &value);
+
+ free(value);
+ free(key);
+ }
+
+ return ret;
+}
+
+/*
* Determine if a submodule has been populated at a given 'path'
*/
int is_submodule_populated(const char *path)
diff --git a/submodule.h b/submodule.h
index c4af505..6ec5f2f 100644
--- a/submodule.h
+++ b/submodule.h
@@ -37,6 +37,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
const char *path);
int submodule_config(const char *var, const char *value, void *cb);
void gitmodules_config(void);
+extern int is_submodule_initialized(const char *path);
extern int is_submodule_populated(const char *path);
int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst);
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox