* Re: [PATCH] mailmap: update brandon williams's email address
From: Stefan Beller @ 2018-12-07 22:11 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: bwilliamseng, git, bwilliams.eng
In-Reply-To: <20181207214013.GA73340@google.com>
On Fri, Dec 7, 2018 at 1:40 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Brandon Williams wrote:
>
> > Signed-off-by: Brandon Williams <bwilliams.eng@gmail.com>
> > ---
> > .mailmap | 1 +
> > 1 file changed, 1 insertion(+)
>
> I can confirm that this is indeed the same person.
What would be more of interest is why we'd be interested in this patch
as there is no commit/patch sent by Brandon with this email in gits history.
Is that so you get cc'd on your private address and can follow
things you worked on without being subscribed to the mailing list?
(I'd be interested to see the use case in the commit message;)
Thanks,
Stefan
^ permalink raw reply
* Re: RFE: git-patch-id should handle patches without leading "diff"
From: Jonathan Nieder @ 2018-12-07 22:01 UTC (permalink / raw)
To: Konstantin Ryabitsev; +Cc: git, Ævar Arnfjörð Bjarmason
In-Reply-To: <20181207181942.GA6411@pure.paranoia.local>
Hi,
Konstantin Ryabitsev wrote:
> Every now and again I come across a patch sent to LKML without a leading
> "diff a/foo b/foo" -- usually produced by quilt. E.g.:
>
> https://lore.kernel.org/lkml/20181125185004.151077005@linutronix.de/
>
> I am guessing quilt does not bother including the leading "diff a/foo
> b/foo" because it's redundant with the next two lines, however this
> remains a valid patch recognized by git-am.
>
> If you pipe that patch via git-patch-id, it produces nothing, but if I
> put in the leading "diff", like so:
>
> diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>
> then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".
Interesting. As Ævar mentioned, the relevant code is
/* Ignore commit comments */
if (!patchlen && !starts_with(line, "diff "))
continue;
which is trying to handle a case where a line that is special to the
parser appears before the diff begins.
The patch-id appears to only care about the diff text, so it should be
able to handle this. So if we have a better heuristic for where the
diff starts, it would be good to use it.
"git apply" uses apply.c::find_header, which is more permissive.
Maybe it would be possible to unify these somehow. (I haven't looked
closely enough to tell how painful that would be.)
Thanks and hope that helps,
Jonathan
^ permalink raw reply
* Re: [wishlist] git submodule update --reset-hard
From: Stefan Beller @ 2018-12-07 21:55 UTC (permalink / raw)
To: Yaroslav Halchenko; +Cc: git
In-Reply-To: <20181207012256.GR4633@hopa.kiewit.dartmouth.edu>
On Thu, Dec 6, 2018 at 5:23 PM Yaroslav Halchenko <yoh@onerussian.com> wrote:
> > There was a proposal to "re-attach HEAD" in the submodule, i.e.
> > if the branch branch points at the same commit, we don't need
> > a detached HEAD, but could go with the branch instead.
>
> if I got the idea right, if we are talking about any branch, it
> would also non-deterministic since who knows what left over branch(es)
> point to that commit. Not sure if I would have used that ;)
I would think we'd rather want to have it deterministic, i.e. something like
1) record branch name of the submodule
2) update submodules HEAD to to superprojects gitlink
3) if recorded branch (1) matches the sha1 of detached HEAD,
have HEAD point to the branch instead.
You notice a small inefficiency here as we write HEAD twice, so it
could be reworded as:
1) compare superprojects gitlink with the submodules branch
2a) if equal, set submodules HEAD to branch
2b) if unequal set HEAD to gitlink value, resulting in detached HEAD
Note that this idea of reattaching reflects the idea (a) below.
> > a) "stay on submodule branch (i.e. HEAD still points at $branch), and
> > reset --hard" such that the submodule has a clean index and at that $branch
> > or
> > b) "stay on submodule branch (i.e. HEAD still points at $branch), but $branch is
> > set to the gitlink from the superproject, and then a reset --hard
> > will have the worktree set to it as well.
> NB "gitlink" -- just now discovered the thing for me. Thought it would be
> called a subproject echoing what git diff/log -p shows for submodule commits.
The terminology is messy:
The internal representation in Gits object model is a "gitlink" entry in a tree
object. Once we have a .gitmodules entry, we call it submodule.
The term 'subproject' is a historic artifact and will likely not be changed
in the diff output (or format-patch), because these diffs can be applied using
git-am for example. That makes the diff output effectively a transport
protocol, and changing protocols is hard if you have no versioning in them.
More in https://git-scm.com/docs/gitsubmodules (a rather recent new write
of a man page, going into concepts).
> > > right -- I meant the local changes and indeed reset --recurse-submodules
> > > indeed seems to recurse nicely. Then the undesired effect remaining only
> > > the detached HEAD
>
> > For that we may want to revive discussions in
> > https://public-inbox.org/git/20170501180058.8063-5-sbeller@google.com/
>
> well, isn't that one requires a branch to be specified in .gitmodules?
Ah good point.
> > git reset --hard --recursive=hard,keep-branch PREVIOUSPOINT
>
> 'keep-branch' (given aforementioned keeping the specified in .gitmodules
> branch) might be confusing. Also what if a submodule already in a
> detached HEAD? IMHO --recursive=hard, and just saying that it would do
> "reset --hard", is imho sufficient. (that is why I like pure
> --reset hard since it doesn't care and neither does anything to the
> branch)
For that we might want to first do the
git submodule update --reset-hard
which runs reset --hard inside the submodule, no matter which
branch the submodule is on (if any) and resets to the given
superproject sha1.
See git-submodule.sh in git.git[1] in cmd_update.
We'd need to add a command line flag (`--reset-hard`
would be the obvious choice?) which would set the `update`
variable, which then is evaluated to what needs to be done in
the submodule, which in that case would be the hard reset.
https://github.com/git/git/blob/master/git-submodule.sh#L606
Once that is done we'd want to add a test case, presumably
in t/t7406-submodule-update.sh
> > > I would have asked for
>
> > > git revert --recursive <commit>...
> > > git rebase --recursive [-i] ...
>
> > > which I also frequently desire (could elaborate on the use cases etc).
>
> > These would be nice to have. It would be nice if you'd elaborate on the
> > use cases for future reference in the mailing list archive. :-)
>
> ok, will try to do so ;-) In summary: they are just a logical extension
> of git support for submodules for anyone actively working with
> submodules to keep entire tree in sync. Then quite often the need for
> reverting a specific commit (which also has changes reflected in
> submodules) arises. The same with rebase, especially to trim away some
> no longer desired changes reflected in submodules.
>
> the initial "git submodule update --reset-hard" is pretty much a
> crude workaround for some of those cases, so I would just go earlier in
> the history, and redo some things, whenever I could just drop or revert
> some selected set of commits.
That makes sense.
Do you want to give the implementation a try for the --reset-hard switch?
> ah... so it is only submodule command which has --recursive, and the
> rest have --recurse-submodules when talking about recursing into
> submodules?
I don't think we were that cautious in development as it was done by
different people at different times. There is also just `--submodule` for
the diff family, for reference:
https://public-inbox.org/git/20180905225828.17782-1-sbeller@google.com/
^ permalink raw reply
* Re: [PATCH v2 1/3] git clone <url> C:\cygwin\home\USER\repo' is working (again)
From: Johannes Schindelin @ 2018-12-07 21:53 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git, svnpenn
In-Reply-To: <20181207170456.8994-1-tboegi@web.de>
Hi Torsten,
On Fri, 7 Dec 2018, tboegi@web.de wrote:
> compat/mingw-cygwin.c | 28 ++++++++++++++++++++++++++++
> compat/mingw-cygwin.h | 20 ++++++++++++++++++++
Please use compat/win32/path.c (or .../path-utils.c) instead. This is not
so much about MINGW or Cygwin or MSys or MSYS2 or Visual C++, but about
Windows.
Thanks,
Johannes
^ permalink raw reply
* Re: Retrieving a file in git that was deleted and committed
From: biswaranjan panda @ 2018-12-07 21:50 UTC (permalink / raw)
To: Bryan Turner; +Cc: peff, git
In-Reply-To: <CAGyf7-EkyGOi02fqMcCPBzj-=wpsH4zCgvP5VhOoKMdG+wfoLw@mail.gmail.com>
On Thu, Dec 6, 2018 at 11:26 PM Jeff King <peff@peff.net> wrote:
>>
>> On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
>>
> >> Thanks! Strangely git log --follow does work.
>>
>> I suspect it would work even without --follow. When you limit a log
>> traversal with a pathspec, like:
>>
>> git log foo
>>
>> that is not about following some continuous stream of content, but
>> rather just applying that pathspec to the diff of each commit, and
>> pruning ones where it did not change. So even if there are gaps where
>> the file did not exist, we continue to apply the pathspec to the older
>> commits.
> Ah, of course. Thanks for the clarification, Jeff. And my > apologies to
> Biswaranjan Panda for the incorrect information.
Thanks Jeff and Bryan! However, I am curious that if there were a way
to tell git blame to skip a commit (the one which added the file again
and maybe the one which deleted it originally) while it walks back
through history, then it should just get back the
entire history right ?
On Thu, Dec 6, 2018 at 11:37 PM Bryan Turner <bturner@atlassian.com> wrote:
>
> On Thu, Dec 6, 2018 at 11:26 PM Jeff King <peff@peff.net> wrote:
> >
> > On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
> >
> > > Thanks! Strangely git log --follow does work.
> >
> > I suspect it would work even without --follow. When you limit a log
> > traversal with a pathspec, like:
> >
> > git log foo
> >
> > that is not about following some continuous stream of content, but
> > rather just applying that pathspec to the diff of each commit, and
> > pruning ones where it did not change. So even if there are gaps where
> > the file did not exist, we continue to apply the pathspec to the older
> > commits.
>
> Ah, of course. Thanks for the clarification, Jeff. And my apologies to
> Biswaranjan Panda for the incorrect information.
>
> >
> > Tools like git-blame will _not_ work, though, as they really are trying
> > to track the content as they walk back through history. And Once all of
> > the content seems to appear from nowhere in your new commit, that seems
> > like a dead end.
> >
> > In theory there could be some machine-readable annotation in the commit
> > object (or in a note created after the fact) to say "even though 'foo'
> > is a new file here, it came from $commit:foo". And then git-blame could
> > keep following the content there. But such a feature does not yet exist.
> >
> > -Peff
--
Thanks,
-Biswa
^ permalink raw reply
* [PATCH on master v2] revision: use commit graph in get_reference()
From: Jonathan Tan @ 2018-12-07 21:50 UTC (permalink / raw)
To: git; +Cc: Jonathan Tan, stolee, peff, gitster
In-Reply-To: <20181204224238.50966-1-jonathantanmy@google.com>
When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:
git rev-list --objects --stdin --not --all --quiet <(list of objects)
If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.
Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.
Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This patch is now on master.
v2 makes use of the optimization Stolee describes in [1], except that I
have arranged the functions slightly differently. In particular, I
didn't want to add even more ways to obtain objects, so I let
parse_commit_in_graph() be able to take in either a commit shell or an
OID, and did not create the parse_probably_commit() function he
suggested. But I'm not really attached to this design choice, and can
change it if requested.
[1] https://public-inbox.org/git/aa0cd481-c135-47aa-2a69-e3dc71661caa@gmail.com/
---
commit-graph.c | 38 ++++++++++++++++++++++++++++----------
commit-graph.h | 12 ++++++++----
commit.c | 2 +-
revision.c | 5 ++++-
t/helper/test-repository.c | 4 ++--
5 files changed, 43 insertions(+), 18 deletions(-)
diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..a571b523b7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@ void close_commit_graph(struct repository *r)
r->objects->commit_graph = NULL;
}
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+ uint32_t *pos)
{
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
g->chunk_oid_lookup, g->hash_len, pos);
@@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
}
}
-static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+ struct commit_graph *g,
+ struct commit *shell,
+ const struct object_id *oid)
{
uint32_t pos;
- if (item->object.parsed)
- return 1;
+ if (shell && shell->object.parsed)
+ return shell;
- if (find_commit_in_graph(item, g, &pos))
- return fill_commit_in_graph(item, g, pos);
+ if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+ pos = shell->graph_pos;
+ } else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
+ /* bsearch_graph sets pos */
+ } else {
+ return NULL;
+ }
- return 0;
+ if (!shell) {
+ shell = lookup_commit(r, oid);
+ if (!shell)
+ return NULL;
+ }
+
+ fill_commit_in_graph(shell, g, pos);
+ return shell;
}
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+ const struct object_id *oid)
{
if (!prepare_commit_graph(r))
return 0;
- return parse_commit_in_graph_one(r->objects->commit_graph, item);
+ return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+ oid);
}
void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
}
graph_commit = lookup_commit(r, &cur_oid);
- if (!parse_commit_in_graph_one(g, graph_commit))
+ if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
graph_report("failed to parse %s from commit-graph",
oid_to_hex(&cur_oid));
}
diff --git a/commit-graph.h b/commit-graph.h
index 9db40b4d3a..8b7b5985dc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -13,16 +13,20 @@ struct commit;
char *get_commit_graph_filename(const char *obj_dir);
/*
- * Given a commit struct, try to fill the commit struct info, including:
+ * If the given commit (identified by shell->object.oid or oid) is in the
+ * commit graph, returns a commit struct (reusing shell if it is not NULL)
+ * including the following info:
* 1. tree object
* 2. date
* 3. parents.
*
- * Returns 1 if and only if the commit was found in the packed graph.
+ * If not, returns NULL. See parse_commit_buffer() for the fallback after this
+ * call.
*
- * See parse_commit_buffer() for the fallback after this call.
+ * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
*/
-int parse_commit_in_graph(struct repository *r, struct commit *item);
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+ const struct object_id *oid);
/*
* It is possible that we loaded commit contents from the commit buffer,
diff --git a/commit.c b/commit.c
index d13a7bc374..88eb580c5a 100644
--- a/commit.c
+++ b/commit.c
@@ -456,7 +456,7 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
return -1;
if (item->object.parsed)
return 0;
- if (use_commit_graph && parse_commit_in_graph(the_repository, item))
+ if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
return 0;
buffer = read_object_file(&item->object.oid, &type, &size);
if (!buffer)
diff --git a/revision.c b/revision.c
index 13e0519c02..05fddb5880 100644
--- a/revision.c
+++ b/revision.c
@@ -213,7 +213,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
{
struct object *object;
- object = parse_object(revs->repo, oid);
+ object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
+ if (!object)
+ object = parse_object(revs->repo, oid);
+
if (!object) {
if (revs->ignore_missing)
return object;
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index 6a84a53efb..63b928a883 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -22,7 +22,7 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
c = lookup_commit(&r, commit_oid);
- if (!parse_commit_in_graph(&r, c))
+ if (!parse_commit_in_graph(&r, c, NULL))
die("Couldn't parse commit");
printf("%"PRItime, c->date);
@@ -52,7 +52,7 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
* get_commit_tree_in_graph does not automatically parse the commit, so
* parse it first.
*/
- if (!parse_commit_in_graph(&r, c))
+ if (!parse_commit_in_graph(&r, c, NULL))
die("Couldn't parse commit");
tree = get_commit_tree_in_graph(&r, c);
if (!tree)
--
2.19.0.271.gfe8321ec05.dirty
^ permalink raw reply related
* Re: [PATCH] mailmap: update brandon williams's email address
From: Jonathan Nieder @ 2018-12-07 21:40 UTC (permalink / raw)
To: Brandon Williams; +Cc: git, Brandon Williams
In-Reply-To: <20181207205621.49961-1-bwilliams.eng@gmail.com>
Brandon Williams wrote:
> Signed-off-by: Brandon Williams <bwilliams.eng@gmail.com>
> ---
> .mailmap | 1 +
> 1 file changed, 1 insertion(+)
I can confirm that this is indeed the same person.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Welcome back!
^ permalink raw reply
* [PATCH] mailmap: update brandon williams's email address
From: Brandon Williams @ 2018-12-07 20:56 UTC (permalink / raw)
To: git; +Cc: Brandon Williams
Signed-off-by: Brandon Williams <bwilliams.eng@gmail.com>
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
diff --git a/.mailmap b/.mailmap
index eb7b5fc7b..247a3deb7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -27,6 +27,7 @@ Ben Walton <bdwalton@gmail.com> <bwalton@artsci.utoronto.ca>
Benoit Sigoure <tsunanet@gmail.com> <tsuna@lrde.epita.fr>
Bernt Hansen <bernt@norang.ca> <bernt@alumni.uwaterloo.ca>
Brandon Casey <drafnel@gmail.com> <casey@nrlssc.navy.mil>
+Brandon Williams <bwilliams.eng@gmail.com> <bmwill@google.com>
brian m. carlson <sandals@crustytoothpaste.net>
brian m. carlson <sandals@crustytoothpaste.net> <sandals@crustytoothpaste.ath.cx>
Bryan Larsen <bryan@larsen.st> <bryan.larsen@gmail.com>
--
2.19.1
^ permalink raw reply related
* Re: RFE: git-patch-id should handle patches without leading "diff"
From: Ævar Arnfjörð Bjarmason @ 2018-12-07 19:25 UTC (permalink / raw)
To: Konstantin Ryabitsev; +Cc: git
In-Reply-To: <20181207181942.GA6411@pure.paranoia.local>
On Fri, Dec 07 2018, Konstantin Ryabitsev wrote:
> Hi, all:
>
> Every now and again I come across a patch sent to LKML without a leading
> "diff a/foo b/foo" -- usually produced by quilt. E.g.:
>
> https://lore.kernel.org/lkml/20181125185004.151077005@linutronix.de/
>
> I am guessing quilt does not bother including the leading "diff a/foo
> b/foo" because it's redundant with the next two lines, however this
> remains a valid patch recognized by git-am.
>
> If you pipe that patch via git-patch-id, it produces nothing, but if I
> put in the leading "diff", like so:
>
> diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>
> then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".
>
> Can we please teach git-patch-id to work without the leading diff a/foo
> b/foo, same as git-am?
>
> Best,
> -K
The state machine is sensitive there being a "diff" line, then "index"
etc.
diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 970d0d30b4..b99e4455fd 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -97,7 +97,9 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result,
}
/* Ignore commit comments */
- if (!patchlen && !starts_with(line, "diff "))
+ if (!patchlen && starts_with(line, "--- a/"))
+ ;
+ else if (!patchlen && !starts_with(line, "diff "))
continue;
/* Parsing diff header? */
This would make it produce a patch-id for that input, however note that
I've done "--- a/" there, with just "--- " (which is legit) we'd get
confused and start earlier before the diffstat.
So if you're interested in having this I leave it to you to run with
this & write tests for it, but more convincingly run it on the git &
LKML archives and see that the output is the same (or just extra in case
where we now find patches) with --stable etc.
^ permalink raw reply related
* [PATCH] l10n: de.po: fix two messages
From: Ralf Thielow @ 2018-12-07 18:52 UTC (permalink / raw)
To: git, Phillip Szelat; +Cc: Matthias Rüster, Ralf Thielow
Reported-by: Phillip Szelat <phillip.szelat@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
---
Hi Phillip,
Good catches. Thanks for the review!
Ralf
po/de.po | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/po/de.po b/po/de.po
index eb213d742..d5113434a 100644
--- a/po/de.po
+++ b/po/de.po
@@ -3421,7 +3421,7 @@ msgstr "Fehler bei Vorbereitung der Packdatei aus multi-pack-index."
#: midx.c:407
#, c-format
msgid "failed to add packfile '%s'"
-msgstr "Fehler beim Hinzufügen von Packdatei'%s'."
+msgstr "Fehler beim Hinzufügen von Packdatei '%s'."
#: midx.c:413
#, c-format
@@ -4559,7 +4559,7 @@ msgstr "Öffnen von /dev/null fehlgeschlagen"
#: run-command.c:1229
#, c-format
msgid "cannot create async thread: %s"
-msgstr "Kann Thread für async nicht erzeugen: %s"
+msgstr "Konnte Thread für async nicht erzeugen: %s"
#: run-command.c:1293
#, c-format
--
2.20.0.rc2.411.g8f28e744c2
^ permalink raw reply related
* Re: [PATCH] git-rebase.txt: update note about directory rename detection and am
From: Elijah Newren @ 2018-12-07 18:46 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <66479695-d17e-c9cb-4cb7-8c74e3855032@kdbg.org>
On Fri, Dec 7, 2018 at 9:51 AM Johannes Sixt <j6t@kdbg.org> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> In commit 6aba117d5cf7 ("am: avoid directory rename detection when
> calling recursive merge machinery", 2018-08-29), the git-rebase manpage
> probably should have also been updated to note the stronger
> incompatibility between git-am and directory rename detection. Update
> it now.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
> Documentation/git-rebase.txt | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 41631df6e4..7bea21e8e3 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -569,8 +569,9 @@ it to keep commits that started empty.
> Directory rename detection
> ~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> -The merge and interactive backends work fine with
> -directory rename detection. The am backend sometimes does not.
> +Directory rename heuristics are enabled in the merge and interactive
> +backends. Due to the lack of accurate tree information, directory
> +rename detection is disabled in the am backend.
>
> include::merge-strategies.txt[]
I was intending to send this out the past couple days, was just kinda
busy. Thanks for handling it for me.
^ permalink raw reply
* RFE: git-patch-id should handle patches without leading "diff"
From: Konstantin Ryabitsev @ 2018-12-07 18:19 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 737 bytes --]
Hi, all:
Every now and again I come across a patch sent to LKML without a leading
"diff a/foo b/foo" -- usually produced by quilt. E.g.:
https://lore.kernel.org/lkml/20181125185004.151077005@linutronix.de/
I am guessing quilt does not bother including the leading "diff a/foo
b/foo" because it's redundant with the next two lines, however this
remains a valid patch recognized by git-am.
If you pipe that patch via git-patch-id, it produces nothing, but if I
put in the leading "diff", like so:
diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".
Can we please teach git-patch-id to work without the leading diff a/foo
b/foo, same as git-am?
Best,
-K
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply
* [PATCH] git-rebase.txt: update note about directory rename detection and am
From: Johannes Sixt @ 2018-12-07 17:51 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Elijah Newren, Git Mailing List
In-Reply-To: <xmqqmupjbnhq.fsf@gitster-ct.c.googlers.com>
From: Elijah Newren <newren@gmail.com>
In commit 6aba117d5cf7 ("am: avoid directory rename detection when
calling recursive merge machinery", 2018-08-29), the git-rebase manpage
probably should have also been updated to note the stronger
incompatibility between git-am and directory rename detection. Update
it now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
Documentation/git-rebase.txt | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 41631df6e4..7bea21e8e3 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -569,8 +569,9 @@ it to keep commits that started empty.
Directory rename detection
~~~~~~~~~~~~~~~~~~~~~~~~~~
-The merge and interactive backends work fine with
-directory rename detection. The am backend sometimes does not.
+Directory rename heuristics are enabled in the merge and interactive
+backends. Due to the lack of accurate tree information, directory
+rename detection is disabled in the am backend.
include::merge-strategies.txt[]
--
2.19.1.1133.g2dd3d172d2
^ permalink raw reply related
* [PATCH v2 3/3] Refactor mingw_cygwin_offset_1st_component()
From: tboegi @ 2018-12-07 17:05 UTC (permalink / raw)
To: git, svnpenn; +Cc: Torsten Bögershausen
In-Reply-To: <5bf18396.1c69fb81.20780.2b1d@mx.google.com>
From: Torsten Bögershausen <tboegi@web.de>
The Windows version of offset_1st_component() needs to hande 3 cases:
- The path is an UNC path, starting with "//" or "\\\\".
Skip the servername and the name of the share.
- The path is a DOS drive, starting with e.g. "X:"
The driver letter and the ':' must be skipped
- The path is pointing to a subdirectory somewhere in the path and the
directory seperator needs to be skipped ('/' or '\\').
Refactor the code to make it easier to read.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
compat/mingw-cygwin.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
index 5552c3ac20..c379a72775 100644
--- a/compat/mingw-cygwin.c
+++ b/compat/mingw-cygwin.c
@@ -10,10 +10,8 @@ size_t mingw_cygwin_skip_dos_drive_prefix(char **path)
size_t mingw_cygwin_offset_1st_component(const char *path)
{
char *pos = (char *)path;
-
- /* unc paths */
- if (!skip_dos_drive_prefix(&pos) &&
- is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+ if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+ /* unc path */
/* skip server name */
pos = strpbrk(pos + 2, "\\/");
if (!pos)
@@ -22,7 +20,8 @@ size_t mingw_cygwin_offset_1st_component(const char *path)
do {
pos++;
} while (*pos && !is_dir_sep(*pos));
+ } else {
+ skip_dos_drive_prefix(&pos);
}
-
return pos + is_dir_sep(*pos) - path;
}
--
2.19.0.271.gfe8321ec05
^ permalink raw reply related
* [PATCH v2 1/3] git clone <url> C:\cygwin\home\USER\repo' is working (again)
From: tboegi @ 2018-12-07 17:04 UTC (permalink / raw)
To: git, svnpenn; +Cc: Torsten Bögershausen
In-Reply-To: <5bf18396.1c69fb81.20780.2b1d@mx.google.com>
From: Torsten Bögershausen <tboegi@web.de>
A regression for cygwin users was introduced with commit 05b458c,
"real_path: resolve symlinks by hand".
In the the commit message we read:
The current implementation of real_path uses chdir() in order to resolve
symlinks. Unfortunately this isn't thread-safe as chdir() affects a
process as a whole...
The old (and non-thread-save) OS calls chdir()/pwd() had been
replaced by a string operation.
The cygwin layer "knows" that "C:\cygwin" is an absolute path,
but the new string operation does not.
"git clone <url> C:\cygwin\home\USER\repo" fails like this:
fatal: Invalid path '/home/USER/repo/C:\cygwin\home\USER\repo'
The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]
Instead of duplicating the code, it is extracted into compat/mingw-cygwin.[ch]
Some need for refactoring and cleanup came up in the review, they are adressed
in a seperate commit.
Reported-By: Steven Penny <svnpenn@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
compat/cygwin.c | 19 -------------------
compat/cygwin.h | 2 --
compat/mingw-cygwin.c | 28 ++++++++++++++++++++++++++++
compat/mingw-cygwin.h | 20 ++++++++++++++++++++
compat/mingw.c | 29 +----------------------------
compat/mingw.h | 20 --------------------
config.mak.uname | 4 ++--
git-compat-util.h | 3 ++-
8 files changed, 53 insertions(+), 72 deletions(-)
delete mode 100644 compat/cygwin.c
delete mode 100644 compat/cygwin.h
create mode 100644 compat/mingw-cygwin.c
create mode 100644 compat/mingw-cygwin.h
diff --git a/compat/cygwin.c b/compat/cygwin.c
deleted file mode 100644
index b9862d606d..0000000000
--- a/compat/cygwin.c
+++ /dev/null
@@ -1,19 +0,0 @@
-#include "../git-compat-util.h"
-#include "../cache.h"
-
-int cygwin_offset_1st_component(const char *path)
-{
- const char *pos = path;
- /* unc paths */
- if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
- /* skip server name */
- pos = strchr(pos + 2, '/');
- if (!pos)
- return 0; /* Error: malformed unc path */
-
- do {
- pos++;
- } while (*pos && pos[0] != '/');
- }
- return pos + is_dir_sep(*pos) - path;
-}
diff --git a/compat/cygwin.h b/compat/cygwin.h
deleted file mode 100644
index 8e52de4644..0000000000
--- a/compat/cygwin.h
+++ /dev/null
@@ -1,2 +0,0 @@
-int cygwin_offset_1st_component(const char *path);
-#define offset_1st_component cygwin_offset_1st_component
diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
new file mode 100644
index 0000000000..c63d7acb9c
--- /dev/null
+++ b/compat/mingw-cygwin.c
@@ -0,0 +1,28 @@
+#include "../git-compat-util.h"
+
+int mingw_cygwin_skip_dos_drive_prefix(char **path)
+{
+ int ret = has_dos_drive_prefix(*path);
+ *path += ret;
+ return ret;
+}
+
+int mingw_cygwin_offset_1st_component(const char *path)
+{
+ char *pos = (char *)path;
+
+ /* unc paths */
+ if (!skip_dos_drive_prefix(&pos) &&
+ is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+ /* skip server name */
+ pos = strpbrk(pos + 2, "\\/");
+ if (!pos)
+ return 0; /* Error: malformed unc path */
+
+ do {
+ pos++;
+ } while (*pos && !is_dir_sep(*pos));
+ }
+
+ return pos + is_dir_sep(*pos) - path;
+}
diff --git a/compat/mingw-cygwin.h b/compat/mingw-cygwin.h
new file mode 100644
index 0000000000..66ccc909ae
--- /dev/null
+++ b/compat/mingw-cygwin.h
@@ -0,0 +1,20 @@
+#define has_dos_drive_prefix(path) \
+ (isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
+int mingw_cygwin_skip_dos_drive_prefix(char **path);
+#define skip_dos_drive_prefix mingw_cygwin_skip_dos_drive_prefix
+static inline int mingw_cygwin_is_dir_sep(int c)
+{
+ return c == '/' || c == '\\';
+}
+#define is_dir_sep mingw_cygwin_is_dir_sep
+static inline char *mingw_cygwin_find_last_dir_sep(const char *path)
+{
+ char *ret = NULL;
+ for (; *path; ++path)
+ if (is_dir_sep(*path))
+ ret = (char *)path;
+ return ret;
+}
+#define find_last_dir_sep mingw_cygwin_find_last_dir_sep
+int mingw_cygwin_offset_1st_component(const char *path);
+#define offset_1st_component mingw_cygwin_offset_1st_component
diff --git a/compat/mingw.c b/compat/mingw.c
index 34b3880b29..038e96af9d 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -350,7 +350,7 @@ static inline int needs_hiding(const char *path)
return 0;
/* We cannot use basename(), as it would remove trailing slashes */
- mingw_skip_dos_drive_prefix((char **)&path);
+ mingw_cygwin_skip_dos_drive_prefix((char **)&path);
if (!*path)
return 0;
@@ -2275,33 +2275,6 @@ pid_t waitpid(pid_t pid, int *status, int options)
return -1;
}
-int mingw_skip_dos_drive_prefix(char **path)
-{
- int ret = has_dos_drive_prefix(*path);
- *path += ret;
- return ret;
-}
-
-int mingw_offset_1st_component(const char *path)
-{
- char *pos = (char *)path;
-
- /* unc paths */
- if (!skip_dos_drive_prefix(&pos) &&
- is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
- /* skip server name */
- pos = strpbrk(pos + 2, "\\/");
- if (!pos)
- return 0; /* Error: malformed unc path */
-
- do {
- pos++;
- } while (*pos && !is_dir_sep(*pos));
- }
-
- return pos + is_dir_sep(*pos) - path;
-}
-
int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
{
int upos = 0, wpos = 0;
diff --git a/compat/mingw.h b/compat/mingw.h
index 8c24ddaa3e..30d9fb3e36 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -443,32 +443,12 @@ HANDLE winansi_get_osfhandle(int fd);
* git specific compatibility
*/
-#define has_dos_drive_prefix(path) \
- (isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
-int mingw_skip_dos_drive_prefix(char **path);
-#define skip_dos_drive_prefix mingw_skip_dos_drive_prefix
-static inline int mingw_is_dir_sep(int c)
-{
- return c == '/' || c == '\\';
-}
-#define is_dir_sep mingw_is_dir_sep
-static inline char *mingw_find_last_dir_sep(const char *path)
-{
- char *ret = NULL;
- for (; *path; ++path)
- if (is_dir_sep(*path))
- ret = (char *)path;
- return ret;
-}
static inline void convert_slashes(char *path)
{
for (; *path; path++)
if (*path == '\\')
*path = '/';
}
-#define find_last_dir_sep mingw_find_last_dir_sep
-int mingw_offset_1st_component(const char *path);
-#define offset_1st_component mingw_offset_1st_component
#define PATH_SEP ';'
extern char *mingw_query_user_email(void);
#define query_user_email mingw_query_user_email
diff --git a/config.mak.uname b/config.mak.uname
index 3ee7da0e23..9346f67922 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -187,7 +187,7 @@ ifeq ($(uname_O),Cygwin)
UNRELIABLE_FSTAT = UnfortunatelyYes
OBJECT_CREATION_USES_RENAMES = UnfortunatelyNeedsTo
MMAP_PREVENTS_DELETE = UnfortunatelyYes
- COMPAT_OBJS += compat/cygwin.o
+ COMPAT_OBJS += compat/mingw-cygwin.o
FREAD_READS_DIRECTORIES = UnfortunatelyYes
endif
ifeq ($(uname_S),FreeBSD)
@@ -526,7 +526,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
DEFAULT_HELP_FORMAT = html
COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32
COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
- COMPAT_OBJS += compat/mingw.o compat/winansi.o \
+ COMPAT_OBJS += compat/mingw.o compat/mingw-cygwin.o compat/winansi.o \
compat/win32/pthread.o compat/win32/syslog.o \
compat/win32/dirent.o
BASIC_CFLAGS += -DWIN32 -DPROTECT_NTFS_DEFAULT=1
diff --git a/git-compat-util.h b/git-compat-util.h
index 09b0102cae..7ece969b22 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -193,10 +193,11 @@
#endif
#if defined(__CYGWIN__)
-#include "compat/cygwin.h"
+#include "compat/mingw-cygwin.h"
#endif
#if defined(__MINGW32__)
/* pull in Windows compatibility stuff */
+#include "compat/mingw-cygwin.h"
#include "compat/mingw.h"
#elif defined(_MSC_VER)
#include "compat/msvc.h"
--
2.19.0.271.gfe8321ec05
^ permalink raw reply related
* [PATCH v2 2/3] offset_1st_component(), dos_drive_prefix() return size_t
From: tboegi @ 2018-12-07 17:04 UTC (permalink / raw)
To: git, svnpenn; +Cc: Torsten Bögershausen
In-Reply-To: <5bf18396.1c69fb81.20780.2b1d@mx.google.com>
From: Torsten Bögershausen <tboegi@web.de>
Change the return value for offset_1st_component(),
has_dos_drive_prefix() and skip_dos_drive_prefix from int into size_t,
which is the natural type for length of data in memory.
While at it, remove possible "parameter not used" warnings in for the
non-Windows builds in git-compat-util.h
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
abspath.c | 2 +-
compat/mingw-cygwin.c | 6 +++---
compat/mingw-cygwin.h | 4 ++--
git-compat-util.h | 8 +++++---
setup.c | 4 ++--
5 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/abspath.c b/abspath.c
index 9857985329..12055a1d8f 100644
--- a/abspath.c
+++ b/abspath.c
@@ -51,7 +51,7 @@ static void get_next_component(struct strbuf *next, struct strbuf *remaining)
/* copies root part from remaining to resolved, canonicalizing it on the way */
static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
{
- int offset = offset_1st_component(remaining->buf);
+ size_t offset = offset_1st_component(remaining->buf);
strbuf_reset(resolved);
strbuf_add(resolved, remaining->buf, offset);
diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
index c63d7acb9c..5552c3ac20 100644
--- a/compat/mingw-cygwin.c
+++ b/compat/mingw-cygwin.c
@@ -1,13 +1,13 @@
#include "../git-compat-util.h"
-int mingw_cygwin_skip_dos_drive_prefix(char **path)
+size_t mingw_cygwin_skip_dos_drive_prefix(char **path)
{
- int ret = has_dos_drive_prefix(*path);
+ size_t ret = has_dos_drive_prefix(*path);
*path += ret;
return ret;
}
-int mingw_cygwin_offset_1st_component(const char *path)
+size_t mingw_cygwin_offset_1st_component(const char *path)
{
char *pos = (char *)path;
diff --git a/compat/mingw-cygwin.h b/compat/mingw-cygwin.h
index 66ccc909ae..0e8a0c9074 100644
--- a/compat/mingw-cygwin.h
+++ b/compat/mingw-cygwin.h
@@ -1,6 +1,6 @@
#define has_dos_drive_prefix(path) \
(isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
-int mingw_cygwin_skip_dos_drive_prefix(char **path);
+size_t mingw_cygwin_skip_dos_drive_prefix(char **path);
#define skip_dos_drive_prefix mingw_cygwin_skip_dos_drive_prefix
static inline int mingw_cygwin_is_dir_sep(int c)
{
@@ -16,5 +16,5 @@ static inline char *mingw_cygwin_find_last_dir_sep(const char *path)
return ret;
}
#define find_last_dir_sep mingw_cygwin_find_last_dir_sep
-int mingw_cygwin_offset_1st_component(const char *path);
+size_t mingw_cygwin_offset_1st_component(const char *path);
#define offset_1st_component mingw_cygwin_offset_1st_component
diff --git a/git-compat-util.h b/git-compat-util.h
index 7ece969b22..65eaaf0d50 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -355,16 +355,18 @@ static inline int noop_core_config(const char *var, const char *value, void *cb)
#endif
#ifndef has_dos_drive_prefix
-static inline int git_has_dos_drive_prefix(const char *path)
+static inline size_t git_has_dos_drive_prefix(const char *path)
{
+ (void)path;
return 0;
}
#define has_dos_drive_prefix git_has_dos_drive_prefix
#endif
#ifndef skip_dos_drive_prefix
-static inline int git_skip_dos_drive_prefix(char **path)
+static inline size_t git_skip_dos_drive_prefix(char **path)
{
+ (void)path;
return 0;
}
#define skip_dos_drive_prefix git_skip_dos_drive_prefix
@@ -379,7 +381,7 @@ static inline int git_is_dir_sep(int c)
#endif
#ifndef offset_1st_component
-static inline int git_offset_1st_component(const char *path)
+static inline size_t git_offset_1st_component(const char *path)
{
return is_dir_sep(path[0]);
}
diff --git a/setup.c b/setup.c
index 1be5037f12..538bc1ff99 100644
--- a/setup.c
+++ b/setup.c
@@ -29,7 +29,7 @@ static int abspath_part_inside_repo(char *path)
size_t len;
size_t wtlen;
char *path0;
- int off;
+ size_t off;
const char *work_tree = get_git_work_tree();
if (!work_tree)
@@ -800,7 +800,7 @@ static const char *setup_bare_git_dir(struct strbuf *cwd, int offset,
struct repository_format *repo_fmt,
int *nongit_ok)
{
- int root_len;
+ size_t root_len;
if (check_repository_format_gently(".", repo_fmt, nongit_ok))
return NULL;
--
2.19.0.271.gfe8321ec05
^ permalink raw reply related
* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
From: Derrick Stolee @ 2018-12-07 13:49 UTC (permalink / raw)
To: Jonathan Tan, sbeller; +Cc: git
In-Reply-To: <20181206233626.144072-1-jonathantanmy@google.com>
On 12/6/2018 6:36 PM, Jonathan Tan wrote:
>> AFAICT oid_object_info doesn't take advantage of the commit graph,
>> but just looks up the object header, which is still less than completely
>> parsing it. Then lookup_commit is overly strict, as it may return
>> NULL as when there still is a type mismatch (I don't think a mismatch
>> could happen here, as both rely on just the object store, and not the
>> commit graph.), so this would be just defensive programming for
>> the sake of it. I dunno.
>>
>> struct commit *c;
>>
>> if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
>> (c = lookup_commit(revs->repo, oid)) &&
>> !repo_parse_commit(revs->repo, c))
>> object = (struct object *) c;
>> else
>> object = parse_object(revs->repo, oid);
> I like this way better - I'll do it in the next version.
If we do _not_ have a commit-graph or if the commit-graph does not have
that commit, this will have the same performance problem, right?
Should we instead create a direct dependence on the commit-graph, and try
to parse the oid from the graph directly? If it succeeds, then we learn
that the object is a commit, in addition to all of the parsing work. This
means we could avoid oid_object_info() loading data if we succeed. We
would fall back to parse_object() if it fails.
I was thinking this should be a simple API call to parse_commit_in_graph(),
but that requires a struct commit filled with an oid, which is not the
best idea if we don't actually know it is a commit yet.
The approach I recommend would then be more detailed:
1. Modify find_commit_in_graph() to take a struct object_id instead of a
struct commit. This helps find the integer position in the graph. That
position can be used in fill_commit_in_graph() to load the commit
contents. Keep find_commit_in_graph() static as it should not be a
public function.
2. Create a public function with prototype
struct commit *try_parse_commit_from_graph(struct repository *r, struct
object_id *oid)
that returns a commit struct fully parsed if and only if the repository
has that oid. It can call find_commit_in_graph(), then
lookup_commit() and
fill_commit_in_graph() to create the commit and parse the data.
3. In replace of the snippet above, do:
struct commit *c;
if ((c = try_parse_commit_from_graph(revs->repo, oid))
object = (struct object *)c;
else
object = parse_object(revs->repo, oid);
A similar pattern _could_ be used in parse_object(), but I don't recommend
doing this pattern unless we have a reasonable suspicion that we are going
to parse commits more often than other objects. (It adds an O(log(#
commits))
binary search to each object.)
A final thought: consider making this "try the commit graph first, but fall
back to parse_object()" a library function with a name like
struct object *parse_probably_commit(struct repository *r, struct
object_id *oid)
so other paths that are parsing a lot of commits (but also maybe tags) could
use the logic.
Thanks!
-Stolee
^ permalink raw reply
* Re: [PATCH v2 2/3] commit-graph: fix buffer read-overflow
From: Derrick Stolee @ 2018-12-07 13:33 UTC (permalink / raw)
To: Josh Steadmon, git, gitster, avarab
In-Reply-To: <af45c2337fbe2a59ac95aff3ce90a69d8c30416f.1544127439.git.steadmon@google.com>
On 12/6/2018 3:20 PM, Josh Steadmon wrote:
> +
> +# usage: corrupt_and_zero_graph_then_verify <corrupt_position> <data> <zero_position> <string>
> +# Manipulates the commit-graph file at <corrupt_position> by inserting the data,
> +# then zeros the file starting at <zero_position>. Finally, runs
> +# 'git commit-graph verify' and places the output in the file 'err'. Tests 'err'
> +# for the given string.
> +corrupt_and_zero_graph_then_verify() {
This method is very similar to to 'corrupt_graph_and_verify()', the only
difference being the zero_pos, which zeroes the graph.
Could it instead be a modification of corrupt_graph_and_verify() where
$4 is interpreted as zero_pos, and if it is blank we don't do the
truncation?
> +test_expect_success 'detect truncated graph' '
> + corrupt_and_zero_graph_then_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \
> + $GRAPH_CHUNK_LOOKUP_OFFSET "chunk lookup table entry missing"
> +'
> +
Thanks for this! I think it's valuable to keep explicit tests around
that were discovered from your fuzz tests. Specifically, I can repeat
the test when I get around to the next file format.
Thanks,
-Stolee
^ permalink raw reply
* Re: Get "responsible" .gitignore file / rule
From: Victor Toni @ 2018-12-07 13:17 UTC (permalink / raw)
To: sunshine; +Cc: git
In-Reply-To: <CAPig+cRvJWhnm22Xyb9J1wz2sKLycRKbAg299QfTZu5Tq8F07g@mail.gmail.com>
Am Fr., 7. Dez. 2018 um 13:45 Uhr schrieb Eric Sunshine
<sunshine@sunshineco.com>:
>
> On Fri, Dec 7, 2018 at 7:36 AM Victor Toni <victor.toni@gmail.com> wrote:
> > I'm wondering if there is any way to show which rules (ideally with
> > the .gitignore file they are coming from) are causing a specific file
> > to get ignored so I could easily fix the .gitignore file?
>
> Perhaps the "git check-ignore" command would help.
Thanks for the tip!
Works like a charm (had to use the --verbose option though, without,
it does not give much feedback)
^ permalink raw reply
* Re: Get "responsible" .gitignore file / rule
From: Eric Sunshine @ 2018-12-07 12:44 UTC (permalink / raw)
To: victor.toni; +Cc: Git List
In-Reply-To: <CAG0OSgeTqFYGqqOOBF0TbKpsWb70Bv_wQ6-b4Ke=LTg6Z0OUMg@mail.gmail.com>
On Fri, Dec 7, 2018 at 7:36 AM Victor Toni <victor.toni@gmail.com> wrote:
> I'm wondering if there is any way to show which rules (ideally with
> the .gitignore file they are coming from) are causing a specific file
> to get ignored so I could easily fix the .gitignore file?
Perhaps the "git check-ignore" command would help.
^ permalink raw reply
* Get "responsible" .gitignore file / rule
From: Victor Toni @ 2018-12-07 12:35 UTC (permalink / raw)
To: git
In a rather complex setup with deep directory structure it happens
every now and then, that files get ignored when trying to add them.
As these files are _not_ shown in `git status` but in `git status
--ignored` so I guess the culprit is some misconfigured `.gitignore`.
Trying to ad the specific file gives a:
$ git add ignored/file/name
The following paths are ignored by one of your .gitignore files:
ignored/file/name
Use -f if you really want to add them.
Using -v doen't add any verbosity. I'm using git 2.19.1.windows.1 if
this matters
I'm wondering if there is any way to show which rules (ideally with
the .gitignore file they are coming from) are causing a specific file
to get ignored so I could easily fix the .gitignore file?
^ permalink raw reply
* Re: [PATCH v2 2/3] commit-graph: fix buffer read-overflow
From: Jeff King @ 2018-12-07 9:07 UTC (permalink / raw)
To: Josh Steadmon; +Cc: git, gitster, stolee, avarab
In-Reply-To: <af45c2337fbe2a59ac95aff3ce90a69d8c30416f.1544127439.git.steadmon@google.com>
On Thu, Dec 06, 2018 at 12:20:54PM -0800, Josh Steadmon wrote:
> diff --git a/commit-graph.c b/commit-graph.c
> index 07dd410f3c..224a5f161e 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -165,10 +165,20 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
> last_chunk_offset = 8;
> chunk_lookup = data + 8;
> for (i = 0; i < graph->num_chunks; i++) {
> - uint32_t chunk_id = get_be32(chunk_lookup + 0);
> - uint64_t chunk_offset = get_be64(chunk_lookup + 4);
> + uint32_t chunk_id;
> + uint64_t chunk_offset;
> int chunk_repeated = 0;
>
> + if (chunk_lookup + GRAPH_CHUNKLOOKUP_WIDTH >
> + data + graph_size) {
> + error(_("chunk lookup table entry missing; graph file may be incomplete"));
> + free(graph);
> + return NULL;
> + }
Is it possible to overflow the addition here? E.g., if I'm on a 32-bit
system and the truncated chunk appears right at the 4GB limit, in which
case we wrap back around? I guess that's pretty implausible, since it
would mean that the mmap is bumping up against the end of the address
space. I didn't check, but I wouldn't be surprised if sane operating
systems avoid allocating those addresses.
But I think you could write this as:
if (data + graph_size - chunk_lookup < GRAPH_CHUNKLOOKUP_WIDTH)
to avoid overflow (we know that "data + graph_size" is sane because
that's our mmap, and chunk_lookup is somewhere between "data" and "data
+ graph_size", so the result is between 0 and graph_size).
I dunno. I think I've convinced myself it's a non-issue here, but it may
be good to get in the habit of writing these sorts of offset checks in
an overflow-proof order.
-Peff
^ permalink raw reply
* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
From: Jeff King @ 2018-12-07 8:53 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, stolee
In-Reply-To: <20181206235446.147173-1-jonathantanmy@google.com>
On Thu, Dec 06, 2018 at 03:54:46PM -0800, Jonathan Tan wrote:
> This makes sense - I thought I shouldn't mention the commit graph in the
> code since it seems like a layering violation, but I felt the need to
> mention commit graph in a comment, so maybe the need to mention commit
> graph in the code is there too. Subsequently, maybe the lookup-for-type
> could be replaced by a lookup-in-commit-graph (maybe by using
> parse_commit_in_graph() directly), which should be at least slightly
> faster.
That makes more sense to me. If we don't have a commit graph at all,
it's a quick noop. If we do, we might binary search in the list of
commits for a non-commit. But that's strictly faster than finding the
object's type (which involves a binary search of a larger list, followed
by actually accessing the type info).
> > In general, it would be nice if we had a more incremental API
> > for accessing objects: open, get metadata, then read the data. That
> > would make these kinds of optimizations "free".
>
> Would this be assuming that to read the data, you would (1) first need to
> read the metadata, and (2) there would be no redundancy in reading the
> two? It seems to me that for loose objects, you would want to perform
> all your reads at once, since any read requires opening the file, and
> for commit graphs, you just want to read what you want, since the
> metadata and the data are in separate places.
By metadata here, I don't mean the commit-graph data, but just the
object type and size. So I'm imagining an interface more like:
- object_open() locates the object, and stores either the pack
file/offset or a descriptor to a loose path in an opaque handle
struct
- object_size() and object_type() on that handle would do what you
expect. For loose objects, these would parse the header (the
equivalent of unpack_sha1_header()). For packed ones, they'd use the
object header in the pack (and chase down the delta bits as needed).
- object_contents() would return the full content
- object_read() could sequentially read a subset of the file (this
could replace the streaming interface we currently have)
We have most of the low-level bits for this already, if you poke into
what object_info_extended() is doing. We just don't have them packaged
in an interface which can persist across multiple calls.
With an interface like that, parse_object()'s large-blob check could be
something like the patch below.
But your case here is a bit more interesting. If we have a commit graph,
then we can avoid opening (or even finding!) the on-disk object at all.
So I actually think it makes sense to just check the commit-graph first,
as discussed above.
---
diff --git a/object.c b/object.c
index e54160550c..afce58c0bc 100644
--- a/object.c
+++ b/object.c
@@ -254,23 +254,31 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
const struct object_id *repl = lookup_replace_object(r, oid);
void *buffer;
struct object *obj;
+ struct object_handle oh;
obj = lookup_object(r, oid->hash);
if (obj && obj->parsed)
return obj;
- if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
- (!obj && has_object_file(oid) &&
- oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
- if (check_object_signature(repl, NULL, 0, NULL) < 0) {
+ if (object_open(&oh, oid) < 0)
+ return NULL; /* missing object */
+
+ if (object_type(&oh) == OBJ_BLOB) {
+ /* this will call object_read() on 4k chunks */
+ if (check_object_signature_stream(&oh, oid)) {
error(_("sha1 mismatch %s"), oid_to_hex(oid));
return NULL;
}
+ object_close(&oh); /* we don't care about contents */
parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
return lookup_object(r, oid->hash);
}
- buffer = read_object_file(oid, &type, &size);
+ type = object_type(&oh);
+ size = object_size(&oh);
+ buffer = object_contents(&oh);
+ object_close(&oh);
+
if (buffer) {
if (check_object_signature(repl, buffer, size, type_name(type)) < 0) {
free(buffer);
^ permalink raw reply related
* Re: Retrieving a file in git that was deleted and committed
From: Bryan Turner @ 2018-12-07 7:37 UTC (permalink / raw)
To: Jeff King; +Cc: biswaranjan.nitrkl, Git Users
In-Reply-To: <20181207072004.GA32603@sigill.intra.peff.net>
On Thu, Dec 6, 2018 at 11:26 PM Jeff King <peff@peff.net> wrote:
>
> On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
>
> > Thanks! Strangely git log --follow does work.
>
> I suspect it would work even without --follow. When you limit a log
> traversal with a pathspec, like:
>
> git log foo
>
> that is not about following some continuous stream of content, but
> rather just applying that pathspec to the diff of each commit, and
> pruning ones where it did not change. So even if there are gaps where
> the file did not exist, we continue to apply the pathspec to the older
> commits.
Ah, of course. Thanks for the clarification, Jeff. And my apologies to
Biswaranjan Panda for the incorrect information.
>
> Tools like git-blame will _not_ work, though, as they really are trying
> to track the content as they walk back through history. And Once all of
> the content seems to appear from nowhere in your new commit, that seems
> like a dead end.
>
> In theory there could be some machine-readable annotation in the commit
> object (or in a note created after the fact) to say "even though 'foo'
> is a new file here, it came from $commit:foo". And then git-blame could
> keep following the content there. But such a feature does not yet exist.
>
> -Peff
^ permalink raw reply
* Re: Retrieving a file in git that was deleted and committed
From: Jeff King @ 2018-12-07 7:20 UTC (permalink / raw)
To: biswaranjan panda; +Cc: bturner, git
In-Reply-To: <CADHAf1Y8or_frf=Ecu-82z-jo06NKe7oqo1cxtsZsOxhKKxjAw@mail.gmail.com>
On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
> Thanks! Strangely git log --follow does work.
I suspect it would work even without --follow. When you limit a log
traversal with a pathspec, like:
git log foo
that is not about following some continuous stream of content, but
rather just applying that pathspec to the diff of each commit, and
pruning ones where it did not change. So even if there are gaps where
the file did not exist, we continue to apply the pathspec to the older
commits.
Tools like git-blame will _not_ work, though, as they really are trying
to track the content as they walk back through history. And Once all of
the content seems to appear from nowhere in your new commit, that seems
like a dead end.
In theory there could be some machine-readable annotation in the commit
object (or in a note created after the fact) to say "even though 'foo'
is a new file here, it came from $commit:foo". And then git-blame could
keep following the content there. But such a feature does not yet exist.
-Peff
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox