* Re: am fails to apply patches for files with CRLF lineendings
From: Sverre Rabbelier @ 2009-12-15 6:33 UTC
To: Brandon Casey; +Cc: git
In-Reply-To: <ee63ef30912141809k27bc73edp20abddd5e9c7c063@mail.gmail.com>
Heya,
On Tue, Dec 15, 2009 at 03:09, Brandon Casey <drafnel@gmail.com> wrote:
> Forwarding to the list. The original was bounced since gmail sent a
> multipart mime version with html. Seems we can't disable html
> composing in the gmail settings anymore (I thought we used to be able
> to).
You can, it remembers when you click "« Plain Text", at least, it does
for me :P.
--
Cheers,
Sverre Rabbelier
* Re: Giving command line parameter to textconv command?
From: Junio C Hamano @ 2009-12-15 5:56 UTC
To: Nanako Shiraishi; +Cc: git, Jeff King
In-Reply-To: <20091215121110.6117@nanako3.lavabit.com>
Nanako Shiraishi <nanako3@lavabit.com> writes:
> I experimented with other variables (eg. smudge and clean) and
> they honor their command line arguments. If textconv is the only
> setting that doesn't, the change may be easier to justify.
Yes, as you found out, convert.c::apply_filter() is aware of the command
line arguments.
Let's try to do a bit more work to make the coverage complete. After
scanning "git grep -e start_async -e run_command" output, here is what I
came up with:
- editor.c::launch_editor() that allows a custom editor named via
GIT_EDITOR does seem to honor your command line arguments.
- pager.c::setup_pager() is used for GIT_PAGER and it does honor your
command line arguments.
- ll-merge.c::ll_ext_merge() that is used to handle custom merge drivers
lets the user specify command line via templating to replace %O %A %B
and naturally it needs to be aware of the command line arguments.
- diff.c::run_external_diff() that runs GIT_EXTERNAL_DIFF defines that
the command has to take 7 parameters in a fixed order, and is not
designed to permute its arguments like ll_ext_merge() does, but these
days people don't use it directly (they use it indirectly via
"difftool" wrapper), so it probably is not an issue.
- merge-index.c::merge_entry() also defines a strict order and semantics
to its parameters, but similar to GIT_EXTERNAL_DIFF, it is not
something you would throw a ready-made program (like an editor or a
pager) at and expect it to work, so it wouldn't be an issue either.
Hooks do not even take arbitrary command line arguments, so we don't have
to worry about them.
So it does look like textconv is the only odd man out.
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Eric Paris @ 2009-12-15 4:26 UTC
To: Nicolas Pitre; +Cc: Jeff King, git
In-Reply-To: <alpine.LFD.2.00.0912142245240.23173@xanadu.home>
On Mon, 2009-12-14 at 22:50 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Jeff King wrote:
>
> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> >
> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > to
> > >
> > > http://people.redhat.com/~eparis/git-tar/
> > >
> > > But it's going to take a couple hours.
> >
> > Holy cow. Almost 150 packs, and that's not even everything. The tarball
> > is missing a bunch of objects, because it points to your kernel-1 as an
> > alternate. So I suspect we would need that, as well, to recreate.
>
> Hmmm... Rebasing repositories mixed with alternates... I wonder if the
> infinite loop might not actually be due to a delta cycle, especially if
> the alternate is also rebasing.
>
> So having the alternate, too, would certainly be interesting.
The alternate repo is slowly pushing up to that same location. That
tar is 855838982, so just a tad bit smaller.
-Eric
* Re: am fails to apply patches for files with CRLF lineendings
From: Brandon Casey @ 2009-12-15 3:59 UTC
To: Junio C Hamano; +Cc: Brandon Casey, Björn Steinbrink, jk, git
In-Reply-To: <7vfx7d7zpp.fsf@alter.siamese.dyndns.org>
On Mon, Dec 14, 2009 at 8:12 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Brandon Casey <drafnel@gmail.com> writes:
>
>>> It actually is the norm to use LF as the line terminator in the body text
>>> in saved messages (and trailing CR as a true part of the payload), and
>>> "am" traditionally used that definition. It is meant to read from "mbox"
>>> format to begin with.
>>
>> But isn't each email in the mbox file supposed to be RFC-2822 formatted
>> anyway?
>
> If you are talking about the same "mbox" I was talking about, which is
> what I see when I peek "/var/mail/junio", then the answer is no.
Yes, that is what I was talking about, but I did not know whether the
individual mails which are separated by "From user@host ..." were
supposed to be in RFC-2822 format or not.
> Their lines are terminated with a LF, and if you insert CR at the end of the
> line it would appear as true payload.
How do you insert CR at the end of the line? Can you use mutt or
something like it to send a mail which contains a CR? I have tried,
and I have not been able to do so. I have tried mutt, mailx, and
sendmail. For sendmail, I of course constructed the email headers by
hand and piped it through sendmail. The CRLF in my tests were
converted to LF by the time they reached /var/mail/casey.
> DOSsy boxes can have C:\mail\user
> or whatever that has DOS text, of course, so there is no "supposed to be".
>
> Having said that, it does not matter an iota in the real world if somebody
> declares on _this list_ that it is a bug that Thunderbird spits out CRLF text
> in response to "Save As..." on platforms where LF is the natural line
> terminator [*1*].
I'm not sure it is a bug, just a change in behavior.
> Whether it is a bug or not, we still need to help
> people with such a program without breaking others.
I agree.
> I saw "peeking the line ending of the first line" as suggested as a
> solution, and my gut feeling, without thinking too much about it, is
> that it is likely to be the right thing to do, especially if we do
> both the check and the necessary conversion in either mailinfo or even
> in mailsplit.
Yes, I think it will work as a work-around; unfortunately, I cannot
work on implementing this at the moment. I think the better solution,
if it is not too costly, is to detect the presence of CR and produce a
binary patch that can be sent through email and applied by git-am.
-brandon
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Nicolas Pitre @ 2009-12-15 3:58 UTC
To: Junio C Hamano; +Cc: Eric Paris, Jeff King, git
In-Reply-To: <7vk4wpax99.fsf@alter.siamese.dyndns.org>
On Mon, 14 Dec 2009, Junio C Hamano wrote:
> Nicolas Pitre <nico@fluxnic.net> writes:
>
> > On Mon, 14 Dec 2009, Eric Paris wrote:
> >
> >> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> >> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> >> >
> >> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> >> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> >> > >
> >> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> >> > > to
> >> > >
> >> > > http://people.redhat.com/~eparis/git-tar/
> >> >
> >> > Wowzers, that's big. Can you send just what's in .git?
> >>
> >> So I zipped up just .git 1.2G. I did a make clean and zipped up the
> >> whole repo 1.3G.
> >>
> >> Just started pushing the 1.3G file.
> >>
> >> Maybe having a .git directory that large is the problem?
> >
> > Shouldn't be, unless your repo is really badly packed.
> >
> > What's the output of 'git count-objects -v' ?
>
> Didn't somebody say that the trace hints at an infinite loop, not "slow
> because of bad packing"?
Maybe. But I was curious about the size too, which turns out to be
due to really bad packing. Of course bad packing shouldn't affect the
correctness of the repository.
Nicolas
* Re: [PATCH 03/23] Introduce "skip-worktree" bit in index, teach Git to get/set this bit
From: Nguyen Thai Ngoc Duy @ 2009-12-15 3:51 UTC
To: Greg Price; +Cc: git
In-Reply-To: <20091214230619.GA30538@dr-wily.mit.edu>
2009/12/15 Greg Price <price@ksplice.com>:
> I confess I can't tell how the skip-worktree bit does differ from
> assume-unchanged. Is its 'goal' different only in that you have a
> different motivation for introducing it, or does it actually have a
> different effect -- and what is that different effect?
On the fun side, you could use both bits in the same worktree, to
narrow your worktree and have some assume-unchanged files.
Another difference is that with assume-unchanged bit, you make a
promise to Git that those assume-unchanged files are "good", so Git does
not have to care for them. If somehow you violate the promise, Git can
harm your files in the worktree.
--
Duy
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Nicolas Pitre @ 2009-12-15 3:50 UTC
To: Jeff King; +Cc: Eric Paris, git
In-Reply-To: <20091215023918.GA14689@coredump.intra.peff.net>
On Mon, 14 Dec 2009, Jeff King wrote:
> On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
>
> > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > to
> >
> > http://people.redhat.com/~eparis/git-tar/
> >
> > But it's going to take a couple hours.
>
> Holy cow. Almost 150 packs, and that's not even everything. The tarball
> is missing a bunch of objects, because it points to your kernel-1 as an
> alternate. So I suspect we would need that, as well, to recreate.
Hmmm... Rebasing repositories mixed with alternates... I wonder if the
infinite loop might not actually be due to a delta cycle, especially if
the alternate is also rebasing.
So having the alternate, too, would certainly be interesting.
Nicolas
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Nicolas Pitre @ 2009-12-15 3:44 UTC
To: Eric Paris; +Cc: Jeff King, git
In-Reply-To: <1260843111.9379.86.camel@localhost>
On Mon, 14 Dec 2009, Eric Paris wrote:
> On Mon, 2009-12-14 at 19:26 -0500, Nicolas Pitre wrote:
> > On Mon, 14 Dec 2009, Eric Paris wrote:
> >
> > > Maybe having a .git directory that large is the problem?
> >
> > Shouldn't be, unless your repo is really badly packed.
> >
> > What's the output of 'git count-objects -v' ?
>
> count: 87065
> size: 866744
> in-pack: 1203497
> packs: 148
> size-pack: 976474
So basically 87K loose objects occupying 846 MB and 1.2M packed objects
occupying 954 MB across 148 packs. That's a horrible repository
layout which would definitely gain by being repacked.
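
As a check on the arithmetic: 'git count-objects -v' reports "size" and
"size-pack" in KiB, so the two figures come from dividing by 1024 and
rounding. A minimal sketch, where kib_to_mib is a made-up helper name
and not anything git provides:

```shell
#!/bin/sh
# kib_to_mib is a hypothetical helper: convert KiB to MiB, rounding
# to the nearest whole MiB.
kib_to_mib() { echo $(( ($1 + 512) / 1024 )); }

echo "loose:  $(kib_to_mib 866744) MiB"    # the "size" field
echo "packed: $(kib_to_mib 976474) MiB"    # the "size-pack" field
```

This prints 846 and 954, the numbers used in the reply.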
> I noticed just blindly poking at sizes in my .git/objects/pack that the
> largest pack is a lot larger than the second and third largest....
That's expected.
> And all told there is almost 1G of data in .git/objects/pack
>
> If the answer really is that I just have too much data and it can't be
> handled,
Nope. git should handle that kind of data set perfectly fine. And once
repacked, you should end up with a single pack containing everything and
the total size of your .git/objects directory will probably shrink by
50% or more.
But to be able to repack, your 'git reflog' needs to work correctly, and
the problem is unlikely to be related to the repository size.
Nicolas
* Re: Giving command line parameter to textconv command?
From: Nanako Shiraishi @ 2009-12-15 3:11 UTC
To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <7vvdg9ceud.fsf@alter.siamese.dyndns.org>
Quoting Junio C Hamano <gitster@pobox.com>
> Nanako Shiraishi <nanako3@lavabit.com> writes:
>
>> I need this extra script because setting 'nkf -w' for
>> textconv like this
>>
>> [diff "eucjp"]
>> textconv = nkf -w
>>
>> gives an error.
>>
>> % diff --git a/hello.txt b/hello.txt
>> index 696acd7..f07aa1a 100644
>> error: cannot run nkf -w: No such file or directory
>> error: error running textconv command 'nkf -w'
>> fatal: unable to read files to diff
>>
>> Could you fix textconv so that it can be given parameters?
>
> The change to do so looks like this; it has a few side effects:
>
> - If somebody else were relying on the fact that 'nkf -w' names the
> entire command, it now will run 'nkf' command with '-w' as an argument,
> and it will break such a set-up. IOW, a command that has IFS white
> space in its path will now need to be quoted to protect it from the shell.
>
> You can see the fallout from this in the damage made to t/ hierarchy in
> the attached patch.
>
> - You can now use $HOME and other environment variables your shell
> expands when defining your textconv command.
>
> Overall I think it is a good direction to go, but we need to be careful
> about how we transition the existing repositories that use the old
> semantics.
>
> We might need to introduce diff.*.xtextconv or something.
I experimented with other variables (eg. smudge and clean) and
they honor their command line arguments. If textconv is the only
setting that doesn't, the change may be easier to justify.
By the way, there should be a better description of the filters
in the gitattributes documentation, similar to how [diff "name"]
sections in the .git/config file are described.
-- >8 --
Subject: Illustrate "filter" attribute with an example
The example was taken from aa4ed402c9721170fde2e9e43c3825562070e65e
(Add 'filter' attribute and external filter driver definition).
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
---
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 1f472ce..5a45e51 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -197,6 +197,25 @@ intent is that if someone unsets the filter driver definition,
or does not have the appropriate filter program, the project
should still be usable.
+For example, in .gitattributes, you would assign the `filter`
+attribute for paths.
+
+------------------------
+*.c filter=indent
+------------------------
+
+Then you would define a "filter.indent.clean" and "filter.indent.smudge"
+configuration in your .git/config to specify a pair of commands to
+modify the contents of C programs when the source files are checked
+in ("clean" is run) and checked out (no change is made because the
+command is "cat").
+
+------------------------
+[filter "indent"]
+ clean = indent
+ smudge = cat
+------------------------
+
Interaction between checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Jeff King @ 2009-12-15 2:39 UTC
To: Eric Paris; +Cc: git
In-Reply-To: <1260825629.9379.56.camel@localhost>
On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> to
>
> http://people.redhat.com/~eparis/git-tar/
>
> But it's going to take a couple hours.
Holy cow. Almost 150 packs, and that's not even everything. The tarball
is missing a bunch of objects, because it points to your kernel-1 as an
alternate. So I suspect we would need that, as well, to recreate.
-Peff
* Re: am fails to apply patches for files with CRLF lineendings
From: Junio C Hamano @ 2009-12-15 2:12 UTC
To: Brandon Casey; +Cc: Brandon Casey, Björn Steinbrink, jk, git
In-Reply-To: <ee63ef30912141650ie05baf4kab8505adf160c62e@mail.gmail.com>
Brandon Casey <drafnel@gmail.com> writes:
>> It actually is the norm to use LF as the line terminator in the body text
>> in saved messages (and trailing CR as a true part of the payload), and
>> "am" traditionally used that definition. It is meant to read from "mbox"
>> format to begin with.
>
> But isn't each email in the mbox file supposed to be RFC-2822 formatted
> anyway?
If you are talking about the same "mbox" I was talking about, which is
what I see when I peek "/var/mail/junio", then the answer is no. Their
lines are terminated with a LF, and if you insert CR at the end of the
line it would appear as true payload. DOSsy boxes can have C:\mail\user
or whatever that has DOS text, of course, so there is no "supposed to be".
Having said that, it does not matter an iota in the real world if somebody
declares on _this list_ that it is a bug that Thunderbird spits out CRLF text
in response to "Save As..." on platforms where LF is the natural line
terminator [*1*]. Whether it is a bug or not, we still need to help
people with such a program without breaking others.
I saw "peeking the line ending of the first line" as suggested as a
solution, and my gut feeling, without thinking too much about it, is
that it is likely to be the right thing to do, especially if we do
both the check and the necessary conversion in either mailinfo or even
in mailsplit.
[Footnote]
*1* It is a different matter if it was done on _their_ mailing list, and
it would even be better if such a discussion on _their_ mailing list
resulted in a fix over there.
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Eric Paris @ 2009-12-15 2:11 UTC
To: Nicolas Pitre; +Cc: Jeff King, git
In-Reply-To: <alpine.LFD.2.00.0912141924030.23173@xanadu.home>
On Mon, 2009-12-14 at 19:26 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Eric Paris wrote:
>
> > On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> > > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> > >
> > > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> > > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> > > >
> > > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > > to
> > > >
> > > > http://people.redhat.com/~eparis/git-tar/
> > >
> > > Wowzers, that's big. Can you send just what's in .git?
> >
> > So I zipped up just .git 1.2G. I did a make clean and zipped up the
> > whole repo 1.3G.
> >
> > Just started pushing the 1.3G file.
> >
> > Maybe having a .git directory that large is the problem?
>
> Shouldn't be, unless your repo is really badly packed.
>
> What's the output of 'git count-objects -v' ?
count: 87065
size: 866744
in-pack: 1203497
packs: 148
size-pack: 976474
prune-packable: 1611
garbage: 0
It's not home movies :) . It's a kernel tree with about 5
'upstream' trees that are remotes, which I update daily. One of the
remotes constantly rebases every day starting with Linus' tree and
pulling in about 150+ branches of work from others all of which might
rebase. I have (needlessly) the tags he keeps of that repo every day.
I daily rebase my work on top of that constantly rebasing tree
(linux-next) using stgit.
I noticed just blindly poking at sizes in my .git/objects/pack that the
largest pack is a lot larger than the second and third largest....
-r--r--r-- 1 paris paris 108031039 Feb 12 2009 pack-71a9c0f08c76b8ffd1cf0a14d7cfe991fbc9db80.pack
-r--r--r-- 1 paris paris 32670479 Apr 7 2009 pack-5c8333301012d9b70d70648b287cf540afcc63ed.pack
-r--r--r-- 1 paris paris 26728958 Dec 30 2008 pack-fb8ceb5a33d9881fe771860c6006f55f73ecdf65.pack
And all told there is almost 1G of data in .git/objects/pack
If the answer really is that I just have too much data and it can't be
handled, I'm fine exporting my patches, getting some clean trees, and
starting over till I get into this situation again, but if it really is a
problem/bug that can be solved, the full tar ball of my repo is at
http://people.redhat.com/~eparis/git-tar/
-Eric
* Fwd: am fails to apply patches for files with CRLF lineendings
From: Brandon Casey @ 2009-12-15 2:09 UTC
To: git
In-Reply-To: <ee63ef30912141650ie05baf4kab8505adf160c62e@mail.gmail.com>
Forwarding to the list. The original was bounced since gmail sent a
multipart mime version with html. Seems we can't disable html
composing in the gmail settings anymore (I thought we used to be able
to).
---------- Forwarded message ----------
From: Brandon Casey <drafnel@gmail.com>
Date: Mon, Dec 14, 2009 at 6:50 PM
Subject: Re: am fails to apply patches for files with CRLF lineendings
To: Junio C Hamano <gitster@pobox.com>
Cc: Brandon Casey <brandon.casey.ctr@nrlssc.navy.mil>, Björn
Steinbrink <B.Steinbrink@gmx.de>, jk@silentcow.com,
git@vger.kernel.org
On Mon, Dec 14, 2009 at 5:22 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Brandon Casey <brandon.casey.ctr@nrlssc.navy.mil> writes:
>
> > My understanding of the problem is that rfc2822 dictates that...
>
> I think the fundamental problem is that what MUA uses as the internal
> storage format doesn't necessarily have to even be RFC-2822, which only
> specifies what should be on-the-wire.
If CRLF is what is on-the-wire, how can the MUA tell whether the
original was also CRLF or whether it was only LF? My assumption was
that the MUA cannot tell, and that things worked for most people
because those people who wanted LF terminated output were on platforms
that used LF termination and their MUA produced output using the
native line termination. Things broke recently for some people since
thunderbird devels decided to start saving emails with CRLF
termination on linux.
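
The ambiguity can be shown with a small sketch. Assume a transport that
rewrites every line ending to CRLF, as RFC-2822 asks for on the wire;
an LF file and a CRLF file then become indistinguishable. The helper
to_wire is hypothetical and is not part of git or of any MUA:

```shell
#!/bin/sh
# to_wire FILE: hypothetical model of on-the-wire normalization.
# Afterwards every line ends in exactly one CR before the LF.
cr=$(printf '\r')
to_wire() { sed "s/$cr*\$/$cr/" "$1"; }

dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'one\ntwo\n'     >"$dir/lf.txt"      # original used LF
printf 'one\r\ntwo\r\n' >"$dir/crlf.txt"    # original used CRLF

# Compare checksums of the two normalized streams.
if [ "$(to_wire "$dir/lf.txt" | cksum)" = "$(to_wire "$dir/crlf.txt" | cksum)" ]
then
	echo "identical on the wire"
fi
```

Once both inputs map to the same bytes, nothing on the receiving side
can recover which one was sent.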
>
> The blamed commit took things too
> far.
>
> It actually is the norm to use LF as the line terminator in the body text
> in saved messages (and trailing CR as a true part of the payload), and
> "am" traditionally used that definition. It is meant to read from "mbox"
> format to begin with.
But isn't each email in the mbox file supposed to be RFC-2822
formatted anyway? If so, then my reading of RFC-2822 says that there
should only be CRLF everywhere and no bare CR or bare LF. But maybe
everyone has just been ignoring that part of RFC-2822? I'm not an
email expert, so I really don't know.
>
> Before the blamed commit, "am" took what was given literally, and it
> treated the trailing CR as part of the payload in a text file, each of
> whose lines is LF terminated. This meant that if you sent a patch and your
> MUA didn't corrupt it, or more importantly if you ran format-patch yourself to
> produce a patch on content with CRLF line endings and fed it to am without
> any e-mail involved, your CRLF would have been preserved. So in that
> sense, unlike what you said in your message, the blamed commit didn't
> decide that the line termination must be LF. It decided that the line
> termination does not matter, which is a lot worse.
I think it is more correct to say that the line termination in an
email is ambiguous. CRLF does not necessarily mean that the original
had CRLF line termination if RFC-2822 is followed explicitly.
> As long as the use of CR is an internal storage matter and "Save As..."
> doesn't add extra CR that wasn't in the original contents, I wouldn't say
> that such a MUA is broken. In the use case that led to the blamed commit,
> the user is choosing to read directly from the internal storage of MUA,
> bypassing its "Save As..." interface meant to be used to externalize the
> messages,
No, we're using "Save As..." in thunderbird, and it saves with CRLF
line endings. I don't really care for thunderbird and its proclivity
for munging my emails and _not_ doing-the-right-thing in my opinion.
Maybe someone can suggest a mail client that can use imap and provide
the nice sorting of emails into folders like thunderbird does.
-brandon
* Re: am fails to apply patches for files with CRLF lineendings
From: Björn Steinbrink @ 2009-12-15 1:25 UTC
To: Brandon Casey; +Cc: Junio C Hamano, Brandon Casey, jk, git
In-Reply-To: <ee63ef30912141650ie05baf4kab8505adf160c62e@mail.gmail.com>
On 2009.12.14 18:50:44 -0600, Brandon Casey wrote:
> I think it is more correct to say that the line termination in an email is
> ambiguous. CRLF does not necessarily mean that the original had CRLF line
> termination if RFC-2822 is followed explicitly.
Right. And checking, after sending a patch containing CRs with mutt, it
lost those CRs. Even the local copy saved directly by mutt, which didn't
leave my box, lacks the CRs. So it seems basically impossible to send
patches to CRLF files inline.
RFC-822 still allowed bare CRs/LFs :-/
So the commit didn't break anything for mails conforming to RFC-2822;
those won't work for files with CRs being patched anyway. But it still
breaks the raw format-patch generated patches, so even attaching them to
the actual email as a workaround won't do.
That makes a "use the first line to decide whether or not to strip CRs"
approach look like a good idea. Real mails are broken anyway, and the
format-patch output has LF on the first line, so mailsplit wouldn't mess
it up... Unless git on windows produces CRLF format-patch output...
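
The first-line heuristic could look roughly like this. It is only a
sketch of the idea, not what mailsplit or mailinfo actually do, and
strip_cr_by_first_line is a hypothetical name:

```shell
#!/bin/sh
# strip_cr_by_first_line FILE: hypothetical sketch of the heuristic.
# If the first line ends in CRLF, assume the whole file was converted
# and strip one trailing CR from every line. Otherwise pass the file
# through untouched, so CRs that are real payload survive.
cr=$(printf '\r')
strip_cr_by_first_line() {
	case $(head -n 1 "$1") in
	*"$cr") sed "s/$cr\$//" "$1" ;;
	*)      cat "$1" ;;
	esac
}
```

A file saved with CRLF throughout comes out with LF endings, while raw
format-patch output (LF on the first line, CRLF only inside the patch
body) is left alone. The caveat about format-patch output on Windows
applies to this sketch as well.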
Björn
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Junio C Hamano @ 2009-12-15 0:36 UTC
To: Nicolas Pitre; +Cc: Eric Paris, Jeff King, git
In-Reply-To: <alpine.LFD.2.00.0912141924030.23173@xanadu.home>
Nicolas Pitre <nico@fluxnic.net> writes:
> On Mon, 14 Dec 2009, Eric Paris wrote:
>
>> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
>> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
>> >
>> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
>> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
>> > >
>> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
>> > > to
>> > >
>> > > http://people.redhat.com/~eparis/git-tar/
>> >
>> > Wowzers, that's big. Can you send just what's in .git?
>>
>> So I zipped up just .git 1.2G. I did a make clean and zipped up the
>> whole repo 1.3G.
>>
>> Just started pushing the 1.3G file.
>>
>> Maybe having a .git directory that large is the problem?
>
> Shouldn't be, unless your repo is really badly packed.
>
> What's the output of 'git count-objects -v' ?
Didn't somebody say that the trace hints at an infinite loop, not "slow
because of bad packing"?
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Nicolas Pitre @ 2009-12-15 0:29 UTC
To: Sverre Rabbelier; +Cc: Eric Paris, Jeff King, git
In-Reply-To: <fabb9a1e0912141403hb728974sc50b9e8dbb08925d@mail.gmail.com>
On Mon, 14 Dec 2009, Sverre Rabbelier wrote:
> Heya,
>
> On Mon, Dec 14, 2009 at 22:56, Eric Paris <eparis@redhat.com> wrote:
> > Just started pushing the 1.3G file.
> >
> > Maybe having a .git directory that large is the problem?
>
> What did you say this repository contained again? Your home videos?
> Ah, well that explains ;).
That would explain the size, but not the reflog CPU time.
Nicolas
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Nicolas Pitre @ 2009-12-15 0:26 UTC
To: Eric Paris; +Cc: Jeff King, git
In-Reply-To: <1260827790.9379.59.camel@localhost>
On Mon, 14 Dec 2009, Eric Paris wrote:
> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> >
> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> > >
> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > to
> > >
> > > http://people.redhat.com/~eparis/git-tar/
> >
> > Wowzers, that's big. Can you send just what's in .git?
>
> So I zipped up just .git 1.2G. I did a make clean and zipped up the
> whole repo 1.3G.
>
> Just started pushing the 1.3G file.
>
> Maybe having a .git directory that large is the problem?
Shouldn't be, unless your repo is really badly packed.
What's the output of 'git count-objects -v' ?
Nicolas
* Re: [PATCH] help.autocorrect: do not run a command if the command given is junk
From: Junio C Hamano @ 2009-12-14 23:39 UTC
To: Johannes Sixt
Cc: Junio C Hamano, Johannes Schindelin, Git Mailing List,
Alex Riesen
In-Reply-To: <200912142255.36949.j.sixt@viscovery.net>
Johannes Sixt <j.sixt@viscovery.net> writes:
> On Montag, 14. Dezember 2009, Junio C Hamano wrote:
>> In the meantime, I think squashing the following in would help us keep the
>> two magic numbers in sync.
>
> I do not think that keeping the numbers in sync is necessary. For example, the
> similarity requirement for commands that run automatically could be stricter
> than for the list of suggestions. Then it would be possible that a unique
> best candidate is not good enough to be run automatically; there would only
> be a list of suggestions.
Well thought out. Would you want to reroll a patch with two symbolic
constants then?
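
The two-constant idea might be sketched as follows. The constant names
and values are made up for illustration, and the sketch assumes a score
where lower means more similar; none of this is taken from help.c:

```shell
#!/bin/sh
# Hypothetical names and values; a lower score means a closer match.
AUTORUN_MAX_SCORE=3     # strict: good enough to run automatically
SUGGEST_MAX_SCORE=6     # looser: good enough to list as a suggestion

# classify SCORE N_BEST
#   SCORE   score of the best candidate
#   N_BEST  how many candidates share that best score
classify() {
	if [ "$1" -le "$AUTORUN_MAX_SCORE" ] && [ "$2" -eq 1 ]; then
		echo run
	elif [ "$1" -le "$SUGGEST_MAX_SCORE" ]; then
		echo suggest
	else
		echo junk
	fi
}

classify 2 1
classify 5 1
classify 9 1
```

With these values a unique best candidate scoring 5 is only suggested,
never run, which is the case described in the quoted paragraph.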
* Re: Giving command line parameter to textconv command?
From: Junio C Hamano @ 2009-12-14 23:31 UTC
To: Nanako Shiraishi; +Cc: git, Jeff King
In-Reply-To: <20091215071735.6117@nanako3.lavabit.com>
Nanako Shiraishi <nanako3@lavabit.com> writes:
> I need this extra script because setting 'nkf -w' for
> textconv like this
>
> [diff "eucjp"]
> textconv = nkf -w
>
> gives an error.
>
> % diff --git a/hello.txt b/hello.txt
> index 696acd7..f07aa1a 100644
> error: cannot run nkf -w: No such file or directory
> error: error running textconv command 'nkf -w'
> fatal: unable to read files to diff
>
> Could you fix textconv so that it can be given parameters?
The change to do so looks like this; it has a few side effects:
- If somebody else were relying on the fact that 'nkf -w' names the
entire command, it now will run 'nkf' command with '-w' as an argument,
and it will break such a set-up. IOW, a command that has IFS white
space in its path will now need to be quoted to protect it from the shell.
You can see the fallout from this in the damage made to t/ hierarchy in
the attached patch.
- You can now use $HOME and other environment variables your shell
expands when defining your textconv command.
Overall I think it is a good direction to go, but we need to be careful
about how we transition the existing repositories that use the old
semantics.
We might need to introduce diff.*.xtextconv or something.
diff.c | 9 ++++-----
t/t4030-diff-textconv.sh | 2 +-
t/t4031-diff-rewrite-binary.sh | 2 +-
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/diff.c b/diff.c
index 08bbd3e..64a1486 100644
--- a/diff.c
+++ b/diff.c
@@ -3760,15 +3760,14 @@ static char *run_textconv(const char *pgm, struct diff_filespec *spec,
 		size_t *outsize)
 {
 	struct diff_tempfile *temp;
-	const char *argv[3];
-	const char **arg = argv;
+	const char *argv[4] = { "sh", "-c", NULL, NULL };
 	struct child_process child;
 	struct strbuf buf = STRBUF_INIT;
+	struct strbuf cmd = STRBUF_INIT;
 
 	temp = prepare_temp_file(spec->path, spec);
-	*arg++ = pgm;
-	*arg++ = temp->name;
-	*arg = NULL;
+	strbuf_addf(&cmd, "%s %s", pgm, temp->name);
+	argv[2] = strbuf_detach(&cmd, NULL);
 
 	memset(&child, 0, sizeof(child));
 	child.argv = argv;
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index a3f0897..3468f77 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -48,7 +48,7 @@ test_expect_success 'file is considered binary by plumbing' '
test_expect_success 'setup textconv filters' '
echo file diff=foo >.gitattributes &&
- git config diff.foo.textconv "$PWD"/hexdump &&
+ git config diff.foo.textconv \""$PWD"/hexdump\" &&
git config diff.fail.textconv false
'
diff --git a/t/t4031-diff-rewrite-binary.sh b/t/t4031-diff-rewrite-binary.sh
index a894c60..e6cb30e 100755
--- a/t/t4031-diff-rewrite-binary.sh
+++ b/t/t4031-diff-rewrite-binary.sh
@@ -54,7 +54,7 @@ chmod +x dump
test_expect_success 'setup textconv' '
echo file diff=foo >.gitattributes &&
- git config diff.foo.textconv "$PWD"/dump
+ git config diff.foo.textconv \""$PWD"/dump\"
'
test_expect_success 'rewrite diff respects textconv' '
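
The behaviour change in the diff.c hunk can be reproduced outside git.
The following is a minimal sketch and not git's code: run_old and
run_new are hypothetical helper names, and 'sed -n 1p' merely stands in
for a textconv command that takes an option.

```shell
#!/bin/sh
# Sketch only: run_old and run_new are hypothetical stand-ins for the
# pre-patch and post-patch ways run_textconv() invokes the command.
textconv='sed -n 1p'            # plays the role of diff.<name>.textconv
tmpfile=$(mktemp)               # plays the role of the temporary file
trap 'rm -f "$tmpfile"' EXIT
printf 'hello\nworld\n' >"$tmpfile"

# Old: the whole string is taken as the name of a single program.
run_old() { "$textconv" "$tmpfile"; }

# New: the string is handed to "sh -c", so the shell splits it into a
# command plus arguments (and expands variables such as $HOME).
run_new() { sh -c "$textconv $tmpfile"; }

run_old 2>/dev/null || echo "old style: cannot run '$textconv'"
run_new
```

Because the command line is now parsed by the shell, a program path
containing whitespace has to be quoted, which is the reason for the
extra \" in the two test scripts.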
* Re: am fails to apply patches for files with CRLF lineendings
From: Junio C Hamano @ 2009-12-14 23:22 UTC
To: Brandon Casey; +Cc: Björn Steinbrink, jk, git, Brandon Casey
In-Reply-To: <tCQlJn153g8Oa6Z9HKe6xOUQJdcf2PCIVthlTrLgYE-wJ5jFyXVXWw@cipher.nrlssc.navy.mil>
Brandon Casey <brandon.casey.ctr@nrlssc.navy.mil> writes:
> My understanding of the problem is that rfc2822 dictates that...
I think the fundamental problem is that what an MUA uses as its internal
storage format doesn't even have to be RFC 2822, which only specifies
what should be on the wire. The blamed commit took things too far.
It actually is the norm to use LF as the line terminator in the body text
in saved messages (and trailing CR as a true part of the payload), and
"am" traditionally used that definition. It is meant to read from "mbox"
format to begin with.
Before the blamed commit, "am" took what was given literally, and it
treated the trailing CR as part of the payload in a text file, each of
whose lines is LF terminated. This meant that if you sent a patch and your
MUA didn't corrupt it, or more importantly if you ran format-patch yourself
to produce a patch on content with CRLF line endings and fed it to am
without any e-mail involved, your CRLF would have been preserved. So in that
sense, unlike what you said in your message, the blamed commit didn't
decide that the line termination must be LF. It decided that the line
termination does not matter, which is a lot worse.
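To illustrate the distinction, here is a minimal sketch (not the actual
mailsplit code; both function names are made up for this example). The
first function treats LF as the only terminator and keeps a trailing CR as
payload; the second drops a trailing CR, after which CRLF and LF input can
no longer be told apart.

```python
def lines_lf_terminated(data: bytes):
    """Split on LF only; a trailing CR remains part of each line's payload."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final LF
    return lines


def lines_crlf_insensitive(data: bytes):
    """Split on LF, then strip one trailing CR from each line."""
    return [
        line[:-1] if line.endswith(b"\r") else line
        for line in lines_lf_terminated(data)
    ]


if __name__ == "__main__":
    crlf_patch = b"+hello\r\n+world\r\n"
    lf_patch = b"+hello\n+world\n"
    # CR survives as payload when only LF terminates a line.
    print(lines_lf_terminated(crlf_patch))
    # After stripping, the two inputs look identical.
    print(lines_crlf_insensitive(crlf_patch) == lines_crlf_insensitive(lf_patch))
```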
As long as the use of CR is an internal storage matter and "Save As..."
doesn't add extra CR that wasn't in the original contents, I wouldn't say
that such an MUA is broken. In the use case that led to the blamed commit,
the user is choosing to read directly from the internal storage of the MUA,
bypassing its "Save As..." interface meant to be used to externalize the
messages, and the user is responsible for dealing with the fallout, hence
my "dos2unix" suggestion in the original thread.
Probably we should revert that commit, unless somebody comes up with a
better solution _or_ somebody convincingly argues that there shouldn't be
CRLF in your committed history.
^ permalink raw reply
* Re: [PATCH 03/23] Introduce "skip-worktree" bit in index, teach Git to get/set this bit
From: Greg Price @ 2009-12-14 23:06 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1260786666-8405-4-git-send-email-pclouds@gmail.com>
Hi Duy,
> +Skip-worktree bit
> +-----------------
> +
> +Skip-worktree bit can be defined in one (long) sentence: When reading
> +an entry, if it is marked as skip-worktree, then Git pretends its
> +working directory version is up to date and read the index version
> +instead.
> +
> +To elaborate, "reading" means checking for file existence, reading
> +file attributes or file content. The working directory version may be
> +present or absent. If present, its content may match against the index
> +version or not. Writing is not affected by this bit, content safety
> +is still first priority. Note that Git _can_ update working directory
> +file, that is marked skip-worktree, if it is safe to do so (i.e.
> +working directory version matches index version)
> +
> +Although this bit looks similar to assume-unchanged bit, its goal is
> +different from assume-unchanged bit's. Skip-worktree also takes
> +precedence over assume-unchanged bit when both are set.
I confess I can't tell how the skip-worktree bit does differ from
assume-unchanged. Is its 'goal' different only in that you have a
different motivation for introducing it, or does it actually have a
different effect -- and what is that different effect?
Looking forward to seeing sparse checkouts soon!
Cheers,
Greg
^ permalink raw reply
* Re: am fails to apply patches for files with CRLF lineendings
From: Brandon Casey @ 2009-12-14 22:56 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Björn Steinbrink, jk, git, Brandon Casey
In-Reply-To: <7vvdg9i9mn.fsf@alter.siamese.dyndns.org>
Junio C Hamano wrote:
> Björn Steinbrink <B.Steinbrink@gmx.de> writes:
>
>> Commit c2ca1d7 "Allow mailsplit ... to handle mails with CRLF line-endings"
>> seems to be responsible.
>
> Yes, that commit is not only responsible but was deliberate. For a better
> backstory, see:
>
> http://thread.gmane.org/gmane.comp.version-control.git/124718/focus=124721
>
> You'd notice that I was one of the people who didn't want to have this
> change, so you don't need to convince _me_ that this was not a change to
> keep everybody happy, but you'd need to try a better job than I did back
> then to convince people who thought that "am" should directly work on
> "Thunderbird saved mails" that what they want was a bad idea X-<.
My understanding of the problem is that rfc2822 dictates that CRLF is the
line ending in an email message for _every_ line, and that CR cannot
occur without LF and vice versa. So there is no reliable way to extract
patches from the body of an email and expect line endings to be conveyed
accurately. Some email clients save emails with the line-endings of the
platform, some save in what they call "raw" format with rfc2822's CRLF
line endings. So we have to _assume_ that the patch extracted from the
email has a particular line ending and make-it-so. For better or worse
(better for me), commit c2ca1d7 chose LF line-endings as the line-ending
of choice.
I agree that git-am should be able to apply everything that
git-format-patch produces. Perhaps the diff machinery should be modified
to treat files containing \r as binary when generating the output for
format-patch. Then we'd get a binary diff in the email.
-brandon
^ permalink raw reply
* Giving command line parameter to textconv command?
From: Nanako Shiraishi @ 2009-12-14 22:17 UTC (permalink / raw)
To: git
Some text documents in my repository are encoded in EUC-JP
and I have this line in my .gitattributes file.
*.txt diff=eucjp
and these two lines in my .git/config file.
[diff "eucjp"]
textconv = nkf-w
And I have ~/bin/nkf-w script that is executable.
#!/bin/sh
nkf -w "$@"
The command takes a (Japanese) text file and converts it
into UTF-8 (it guesses the input encoding).
I need this extra script because setting 'nkf -w' for
textconv like this
[diff "eucjp"]
textconv = nkf -w
gives an error.
% diff --git a/hello.txt b/hello.txt
index 696acd7..f07aa1a 100644
error: cannot run nkf -w: No such file or directory
error: error running textconv command 'nkf -w'
fatal: unable to read files to diff
Could you fix textconv so that it can be given parameters?
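The error looks like what happens when the whole string 'nkf -w' is used
as the name of a single program. A minimal sketch of the difference, using
"head -n 1" as a stand-in command (the function names are made up, and
passing the file as "$1" is just one way to avoid quoting the path; this is
not how git itself is implemented):

```python
import os
import subprocess
import tempfile


def run_direct(command, path):
    """Execute `command` as one program name; returns None if no such program exists."""
    try:
        result = subprocess.run([command, path], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return result.stdout


def run_via_shell(command, path):
    """Let sh split `command` into words; the file is passed as positional "$1"."""
    result = subprocess.run(
        ["sh", "-c", command + ' "$1"', "sh", path],
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("first\nsecond\n")
    try:
        print(run_direct("head -n 1", f.name))     # no program named "head -n 1"
        print(run_via_shell("head -n 1", f.name))  # the shell runs head with -n 1
    finally:
        os.unlink(f.name)
```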
--
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
^ permalink raw reply
* Re: [PATCH] help.autocorrect: do not run a command if the command given is junk
From: Johannes Sixt @ 2009-12-14 21:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Johannes Schindelin, Git Mailing List, Alex Riesen
In-Reply-To: <7v7hspjp3q.fsf@alter.siamese.dyndns.org>
On Montag, 14. Dezember 2009, Junio C Hamano wrote:
> In the meantime, I think squashing the following in would help us keep the
> two magic numbers in sync.
I do not think that keeping the numbers in sync is necessary. For example, the
similarity requirement for commands that run automatically could be stricter
than for the list of suggestions. Then it would be possible that a unique
best candidate is not good enough to be run automatically; there would only
be a list of suggestions.
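A rough sketch of what two independent limits could look like, assuming a
plain unweighted Levenshtein distance and two made-up thresholds (the
constants, classify() and the sample command list are all hypothetical and
not taken from help.c):

```python
AUTORUN_MAX_DISTANCE = 1   # hypothetical: strict limit for running automatically
SUGGEST_MAX_DISTANCE = 3   # hypothetical: looser limit for listing suggestions


def levenshtein(a, b):
    """Plain edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def classify(typed, commands):
    """Return (command to auto-run or None, list of suggestions)."""
    scored = sorted((levenshtein(typed, c), c) for c in commands)
    suggestions = [c for d, c in scored if d <= SUGGEST_MAX_DISTANCE]
    if not scored:
        return None, suggestions
    best_distance = scored[0][0]
    best = [c for d, c in scored if d == best_distance]
    # Auto-run only a unique best candidate that also passes the stricter limit.
    if len(best) == 1 and best_distance <= AUTORUN_MAX_DISTANCE:
        return best[0], suggestions
    return None, suggestions


if __name__ == "__main__":
    cmds = ["status", "stash", "commit"]
    print(classify("statu", cmds))
    print(classify("stxtxs", cmds))
```

With these numbers, "stxtxs" has a unique best candidate ("status") that
is close enough to be suggested but not close enough to be run.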
-- Hannes
^ permalink raw reply
* Re: git-reflog 70 minutes at 100% cpu and counting
From: Jeff King @ 2009-12-14 22:14 UTC (permalink / raw)
To: Eric Paris; +Cc: git
In-Reply-To: <1260827790.9379.59.camel@localhost>
On Mon, Dec 14, 2009 at 04:56:30PM -0500, Eric Paris wrote:
> So I zipped up just .git 1.2G. I did a make clean and zipped up the
> whole repo 1.3G.
>
> Just started pushing the 1.3G file.
>
> Maybe having a .git directory that large is the problem?
It could be, but I doubt it. If you have a lot of loose objects that
could make things slow due to the disk access, but it is not likely to
use that much CPU time (we do have to zlib uncompress more, but
still...70 minutes is a lot).
-Peff
^ permalink raw reply