* Re: [PATCH] Resurrect diff-tree-helper -R
From: Daniel Jacobowitz @ 2005-05-01 1:47 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.58.0504301805300.2296@ppc970.osdl.org>
On Sat, Apr 30, 2005 at 06:09:53PM -0700, Linus Torvalds wrote:
> So it would be much nicer (I think) if mode changes are handled
> separately, with a simple separate line before the diff saying
>
> "Mode change: %o->%o %s", oldmode, newmode, path
>
> and not mess up the diff header. That way, you only see it when it
> actually makes any difference, and it's more readable both for humans
> _and_ machines as a result.
>
> Normal "patch" will just ignore the extra lines before the diff anyway, so
> it won't matter there.
>
> Comments?
It sounds good - but could you efficiently collect them before any diff
output? If you have something like this, it'll be easy to read:
Mode change: 644->755 foo.sh
Mode change: 644->755 bar.sh
--- ChangeLog
+++ ChangeLog
@@ -1,0 +1,1 @@
+New line
--- copyright
+++ copyright
@@ -1,0 +1,1 @@
+New line
But if you generate this then you might as well not generate the mode
lines at all, for all a human looking at the diff is going to notice
them:
--- ChangeLog
+++ ChangeLog
@@ -1,0 +1,1 @@
+New line
Mode change: 644->755 foo.sh
--- copyright
+++ copyright
@@ -1,0 +1,1 @@
+New line
Mode change: 644->755 bar.sh
The latter is how diff does its "Only in" messages. I never see them
when I'm looking through a diff of any size; only via diffstat, where
they're clearly disambiguated.
--
Daniel Jacobowitz
CodeSourcery, LLC
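
The ordering suggested above (all mode-change lines first, then the diffs) can be sketched in C. This is only an illustration: `struct file_change` and `collect_mode_changes` are hypothetical names, not part of diff-tree-helper, and the line format is the one proposed in the quoted message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for one changed path; not a git structure. */
struct file_change {
	const char *path;
	unsigned old_mode, new_mode;
};

/*
 * First pass: write one "Mode change" line per path whose mode differs,
 * so that they all appear before any diff output. Returns the number of
 * bytes written, or -1 if the buffer is too small.
 */
int collect_mode_changes(const struct file_change *c, int n,
			 char *out, size_t len)
{
	size_t used = 0;
	int i;

	assert(out != NULL && len > 0);
	out[0] = '\0';
	for (i = 0; i < n; i++) {
		int w;

		if (c[i].old_mode == c[i].new_mode)
			continue;	/* content-only change, no mode line */
		w = snprintf(out + used, len - used,
			     "Mode change: %o->%o %s\n",
			     c[i].old_mode & 0777, c[i].new_mode & 0777,
			     c[i].path);
		if (w < 0 || (size_t)w >= len - used)
			return -1;	/* would not fit */
		used += (size_t)w;
	}
	return (int)used;
}
```

A caller would print the collected buffer once and only then start emitting the per-file diffs in a second pass.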
* Re: Trying to use AUTHOR_DATE
From: Edgar Toernig @ 2005-04-30 22:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0504301322130.2296@ppc970.osdl.org>
Linus Torvalds wrote:
>
> [...] I just rewrote it to give "almost correct
> results" for "pretty much any crap you throw at it".
And I had the impression the strict checks in the original
version were intentional ;-)
> I'll probably tweak it a bit more (make "no timezone means local
> timezone", for example, rather than UTC like it is now).
Here's my try on that. But whether it works everywhere ...
Btw, your %+03d%02d printf gave wrong results for e.g. -0130 (it printed -01-30).
--- k/date.c (mode:100644)
+++ l/date.c (mode:100644)
@@ -10,7 +10,9 @@
#include <ctype.h>
#include <time.h>
-static time_t my_mktime(struct tm *tm)
+#define NO_TZ 11111
+
+static time_t utc_mktime(struct tm *tm)
{
static const int mdays[] = {
0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
@@ -23,12 +25,19 @@ static time_t my_mktime(struct tm *tm)
return -1;
if (month < 0 || month > 11) /* array bounds */
return -1;
+ if (day < 1 || day > 31)
+ return -1;
if (month < 2 || (year + 2) % 4)
day--;
return (year * 365 + (year + 1) / 4 + mdays[month] + day) * 24*60*60UL +
tm->tm_hour * 60*60 + tm->tm_min * 60 + tm->tm_sec;
}
+static int local_offset(time_t *when)
+{
+ return (utc_mktime(localtime(when)) - *when) / 60;
+}
+
static const char *month_names[] = {
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"
@@ -138,7 +147,8 @@ static int match_alpha(const char *date,
for (i = 0; i < NR_TZ; i++) {
int match = match_string(date, timezone_names[i].name);
if (match >= 3) {
- *offset = 60*timezone_names[i].offset;
+ if (*offset == NO_TZ)
+ *offset = 60*timezone_names[i].offset;
return match;
}
}
@@ -245,7 +255,7 @@ void parse_date(char *date, char *result
tm.tm_year = -1;
tm.tm_mon = -1;
tm.tm_mday = -1;
- offset = 0;
+ offset = NO_TZ;
for (;;) {
int match = 0;
@@ -270,13 +280,20 @@ void parse_date(char *date, char *result
date += match;
}
- then = my_mktime(&tm); /* mktime uses local timezone */
- if (then == -1)
- return;
-
- then -= offset * 60;
+ if (offset == NO_TZ) {
+ tm.tm_isdst = -1;
+ then = mktime(&tm);
+ if (then == -1)
+ return;
+ offset = local_offset(&then);
+ } else {
+ then = utc_mktime(&tm);
+ if (then == -1)
+ return;
+ then -= offset * 60;
+ }
- snprintf(result, maxlen, "%lu %+03d%02d", then, offset/60, offset % 60);
+ snprintf(result, maxlen, "%lu %+05d", then, offset/60*100 + offset%60);
}
void datestamp(char *buf, int bufsize)
@@ -285,9 +302,7 @@ void datestamp(char *buf, int bufsize)
int offset;
time(&now);
-
- offset = my_mktime(localtime(&now)) - now;
- offset /= 60;
+ offset = local_offset(&now);
snprintf(buf, bufsize, "%lu %+05d", now, offset/60*100 + offset%60);
}
Ciao, ET.
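
The printf problem mentioned above can be reproduced in isolation. The sketch below is not taken from date.c: `format_tz` and `format_tz_two_fields` are hypothetical helpers that wrap the two format strings from the patch so they can be compared. It assumes C99 integer division (truncation toward zero).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a UTC offset given in minutes as +HHMM / -HHMM.
 * Folding hours and minutes into one number keeps a single sign:
 * -90 minutes becomes -130, which "%+05d" prints as "-0130".
 */
int format_tz(int offset_min, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%+05d",
			offset_min / 60 * 100 + offset_min % 60);
}

/*
 * The earlier two-field form, kept for comparison: the minutes field
 * carries its own sign, so -90 minutes comes out as "-01-30".
 */
int format_tz_two_fields(int offset_min, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%+03d%02d",
			offset_min / 60, offset_min % 60);
}
```

Only negative offsets with a non-zero minutes part are affected, which is why whole-hour zones such as -0700 looked fine with the old format.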
* [PATCH] Resurrect diff-tree-helper -R
From: Junio C Hamano @ 2005-05-01 0:34 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
The diff-tree-helper take-two patch inadvertently dropped
support for the -R option, which is necessary to produce a reverse
diff based on diff-cache and diff-files output (diff-tree does not
matter since you can feed two trees in reverse order). This
patch restores it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff-tree-helper.c | 17 +++++++++++------
1 files changed, 11 insertions(+), 6 deletions(-)
jit-diff 0 diff-tree-helper.c
# - Fix up d_type handling - we need to include <dirent.h> before
# + working-tree
--- k/diff-tree-helper.c (mode:100644)
+++ l/diff-tree-helper.c (mode:100644)
@@ -44,7 +44,8 @@ static int parse_oneside_change(const ch
return 0;
}
-static int parse_diff_tree_output(const char *buf, const char **spec, int cnt)
+static int parse_diff_tree_output(const char *buf,
+ const char **spec, int cnt, int reverse)
{
struct diff_spec old, new;
char path[PATH_MAX];
@@ -98,8 +99,12 @@ static int parse_diff_tree_output(const
default:
return -1;
}
- if (!cnt || matches_pathspec(path, spec, cnt))
- run_external_diff(path, &old, &new);
+ if (!cnt || matches_pathspec(path, spec, cnt)) {
+ if (reverse)
+ run_external_diff(path, &new, &old);
+ else
+ run_external_diff(path, &old, &new);
+ }
return 0;
}
@@ -108,14 +113,14 @@ static const char *diff_tree_helper_usag
int main(int ac, const char **av) {
struct strbuf sb;
- int reverse_diff = 0;
+ int reverse = 0;
int line_termination = '\n';
strbuf_init(&sb);
while (1 < ac && av[1][0] == '-') {
if (av[1][1] == 'R')
- reverse_diff = 1;
+ reverse = 1;
else if (av[1][1] == 'z')
line_termination = 0;
else
@@ -129,7 +134,7 @@ int main(int ac, const char **av) {
read_line(&sb, stdin, line_termination);
if (sb.eof)
break;
- status = parse_diff_tree_output(sb.buf, av+1, ac-1);
+ status = parse_diff_tree_output(sb.buf, av+1, ac-1, reverse);
if (status)
fprintf(stderr, "cannot parse %s\n", sb.buf);
}
* Re: Trying to use AUTHOR_DATE
From: Juliusz Chroboczek @ 2005-04-30 21:59 UTC (permalink / raw)
To: git
In-Reply-To: <Pine.LNX.4.58.0504301322130.2296@ppc970.osdl.org>
Hi,
Here's the code I'm using in darcs-git (copied from Polipo, another
project of mine). You're welcome to use it in any way you see fit.
sprintf_a is defined as strdup of sprintf.
Juliusz
#if defined __GLIBC__
#define HAVE_TM_GMTOFF
#define HAVE_SETENV
#ifndef __UCLIBC__
#define HAVE_TIMEGM
#endif
#endif
#if defined(__linux__) && (__GNU_LIBRARY__ == 1)
/* Linux libc 5 */
#define HAVE_TIMEGM
#define HAVE_SETENV
#endif
#ifdef BSD
#define HAVE_TM_GMTOFF
#define HAVE_SETENV
#endif
#ifdef __CYGWIN__
#define HAVE_SETENV
#endif
#if _POSIX_VERSION >= 200112L
#define HAVE_SETENV
#endif
#define HAVE_TZSET
/* Like mktime(3), but UTC rather than local time */
#if defined(HAVE_TIMEGM)
time_t
mktime_gmt(struct tm *tm)
{
return timegm(tm);
}
#elif defined(HAVE_TM_GMTOFF)
time_t
mktime_gmt(struct tm *tm)
{
time_t t;
struct tm *ltm;
t = mktime(tm);
if(t < 0)
return -1;
ltm = localtime(&t);
if(ltm == NULL)
return -1;
return t + ltm->tm_gmtoff;
}
#elif defined(HAVE_TZSET)
#ifdef HAVE_SETENV
/* Taken from the Linux timegm(3) man page. */
time_t
mktime_gmt(struct tm *tm)
{
time_t t;
char *tz;
tz = getenv("TZ");
setenv("TZ", "", 1);
tzset();
t = mktime(tm);
if(tz)
setenv("TZ", tz, 1);
else
unsetenv("TZ");
tzset();
return t;
}
#else
time_t
mktime_gmt(struct tm *tm)
{
time_t t;
char *tz;
static char *old_tz = NULL;
tz = getenv("TZ");
putenv("TZ=");
tzset();
t = mktime(tm);
if(old_tz)
free(old_tz);
if(tz)
old_tz = sprintf_a("TZ=%s", tz);
else
old_tz = strdup("TZ"); /* XXX - non-portable? */
if(old_tz)
putenv(old_tz);
tzset();
return t;
}
#endif
#else
#error no mktime_gmt implementation on this platform
#endif
* Re: Trying to use AUTHOR_DATE
From: Linus Torvalds @ 2005-04-30 20:32 UTC (permalink / raw)
To: Russ Allbery; +Cc: Edgar Toernig, David Woodhouse, git
In-Reply-To: <87r7gs87a9.fsf@windlord.stanford.edu>
On Sat, 30 Apr 2005, Russ Allbery wrote:
>
> You really cannot get portable behavior in this area without something
> akin to Autoconf probes, unfortunately.
Ok, since this only really matters for AUTHOR_DATE, which we pass in as a
random string anyway, and which comes from various mail programs which may
or may not follow all RFC's, I just rewrote it to give "almost correct
results" for "pretty much any crap you throw at it".
As a test-bed, a "test-date" program that parses a date and then prints
it out in git format _and_ in the local timezone format, here's a few
examples:
./test-date "$(date)" "April 4th, 1992 at 13:45" "13:04:09 +0100 2004 Yesterday, Friday 13th, December"
results in
Sat Apr 30 13:26:52 PDT 2005 -> 1114892812 -0700 -> Sat Apr 30 13:26:52 2005
April 4th, 1992 at 13:45 -> 702395100 +0000 -> Sat Apr 4 05:45:00 1992
13:04:09 +0100 2004 Yesterday, Friday 13th, December -> 1102939449 +0100 -> Mon Dec 13 04:04:09 2004
which is just because it really doesn't check a hell of a lot.
For example, if you say
"I caught 14 fishes in December 1998"
test-date will happily parse this as
Sun Dec 13 16:00:00 1998
(That's "0:00:00 Dec 14th, 1998 UTC" shown in the local timezone ;). Or:
./test-date "12:15 4/17/2009"
12:15 4/17/2009 -> 1239970500 +0000 -> Fri Apr 17 05:15:00 2009
ie it just greedily tries to make _some_ sense of the random strings you
throw at it.
It doesn't even try getting timezones right - it doesn't know about
summertime or anything. Besides, I probably used the wrong timezone info
anyway.
I'll probably tweak it a bit more (make "no timezone means local
timezone", for example, rather than UTC like it is now).
Linus
* Re: Trying to use AUTHOR_DATE
From: Russ Allbery @ 2005-04-30 18:10 UTC (permalink / raw)
To: Edgar Toernig; +Cc: David Woodhouse, Linus Torvalds, git
In-Reply-To: <20050430124048.79119cac.froese@gmx.de>
Edgar Toernig <froese@gmx.de> writes:
> Oh btw, while we are on the subject of sucky time functions: the %s and %z
> strftime sequences used further down are also non-standard (POSIX has
> no %s, old libc has neither %s nor %z).
> A possible workaround:
[...]
> tm = localtime(&now); /* get timezone and tm_isdst */
> tz = -timezone / 60;
> if (tm->tm_isdst > 0)
> tz += 60;
The global timezone variable isn't available on all systems. :)
You really cannot get portable behavior in this area without something
akin to Autoconf probes, unfortunately. Oh, and you can't assume daylight
savings time is an hour; it is sometimes two hours. You have to instead
use the altzone variable to get the offset when you're in daylight savings
time, but this again isn't available on all systems.
I posted a pointer to the INN source a while back; I'm really not sure
that anything less is sufficient to get full portability, although I
certainly trust Paul Eggert's implementation.
BTW, the yacc-based thing is exactly what I wrote the INN code to get rid
of, since I didn't want a yacc dependency.
--
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
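
One way to sidestep the `timezone`/`altzone` portability problem described above is to derive the offset from `localtime()` and `gmtime()` alone. The following is a minimal sketch under that assumption, not the INN code referred to in the message; `utc_offset_minutes` is a hypothetical name, and sub-minute offsets are ignored.

```c
#include <time.h>

/*
 * Local UTC offset in minutes at the instant "when", obtained by
 * comparing localtime() and gmtime() for the same time_t. Only ISO C
 * is used, so it does not depend on timezone, altzone or tm_gmtoff,
 * and it makes no assumption about the size of the DST shift.
 * Returns 0 if either conversion fails.
 */
long utc_offset_minutes(time_t when)
{
	struct tm local, utc;
	struct tm *p;
	long days;

	p = localtime(&when);
	if (!p)
		return 0;
	local = *p;		/* copy: gmtime() may reuse the buffer */
	p = gmtime(&when);
	if (!p)
		return 0;
	utc = *p;

	/* The two calendar dates differ by at most one day. */
	if (local.tm_year != utc.tm_year)
		days = local.tm_year > utc.tm_year ? 1 : -1;
	else
		days = local.tm_yday - utc.tm_yday;

	return days * 24 * 60 +
	       (local.tm_hour - utc.tm_hour) * 60 +
	       (local.tm_min - utc.tm_min);
}
```

Because the offset is computed for a specific instant, a zone with a 30-minute or two-hour DST shift needs no special casing.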
* [PATCH] cg-add only checks if the first file exists
From: Andy Lutomirski @ 2005-04-30 16:39 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
[apologies for possible dupe -- my mailer freaked out]
Doing:
cg-add foo bar
fails if foo doesn't exist but doesn't check for bar. It also gives
misleading reports to stdout.
I've fixed it at rsync://www.luto.us/cogito.git
Patch attached below as well.
--
Changed cg-add to check each added file for existence.
---
commit 9eb8efee632b6270a436d8088315856712bb5b32
tree cba76f974b1840640ccfa14b0118e1dc4a704876
parent 49612c471eebd26efe926a71752e254c1cdc382d
author Andy Lutomirski <luto@myrealbox.com> 1114878247 -0700
committer Andy Lutomirski <luto@myrealbox.com> 1114878247 -0700
Index: cg-add
===================================================================
--- c3aa1e6b53cc59d5fbe261f3f859584904ae3a63/cg-add (mode:100755 sha1:83f0b13f41599104d741ac91c7aa81497cd37d5f)
+++ cba76f974b1840640ccfa14b0118e1dc4a704876/cg-add (mode:100755 sha1:c84792450cec279f7b3eb1dec03b69ac07dbe9d9)
@@ -10,10 +10,12 @@
[ "$1" ] || die "usage: cg-add FILE..."
-if [ -f "$1" ]; then
- echo "Adding file $1"
-else
- die "$1 does not exist"
-fi
+for i in "$@"; do
+ if [ -f "$i" ]; then
+ echo "Adding file $i"
+ else
+ die "$i does not exist"
+ fi
+done
update-cache --add -- "$@"
* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Andrea Arcangeli @ 2005-04-30 16:37 UTC (permalink / raw)
To: Matt Mackall; +Cc: Linus Torvalds, linux-kernel, git
In-Reply-To: <20050430152014.GI21897@waste.org>
On Sat, Apr 30, 2005 at 08:20:15AM -0700, Matt Mackall wrote:
> Most of that psyco speed up is accelerating subsequent diffs in
> difflib, which you probably didn't hit yet.
Correct. Plus I've a 64bit python so I can't use psyco anyway.
> I can make it some sort of environment variable, sure. I think the
> speed is already in a domain where it's not a big deal though. There
No big deal of course, I mentioned it just because it was by far the
most CPU userland intensive operation during checkin. Perhaps doing less
vfs syscalls would improve checkin time too, but I'm unsure if that's
easily feasible (while disabling compression was certainly easy ;)
> Yep, I'm rather new to actually packaging my Python hacks.
I sent you by private email a modified package that gets that right.
Thanks!
* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Matt Mackall @ 2005-04-30 16:06 UTC (permalink / raw)
To: Adam J. Richter; +Cc: andrea, git
In-Reply-To: <200504301444.j3UEiHN05686@adam.yggdrasil.com>
On Sat, Apr 30, 2005 at 07:44:17AM -0700, Adam J. Richter wrote:
>
> I'd like to mention a couple of possible optimizations
> for both the with and without compression approaches.
>
> If you remove the gzip compression, then I imagine you could
> do much of the IO of checking out files via sendfile, without
> ever copying data to program space or even changing the program's
> memory map. There apparently exists a python sendfile module.
>
> If this mercurial were written in C, much of the rest of
> the IO could be optimized with mmap (to reduce copies) and writev
> in the absence of a compression pass. I don't know enough about
> python to know if these optimizations are available.
Python can do mmap, not sure about writev.
But I'm currently still in the "keep it as simple as possible" stage.
There's a bunch of room for optimization still, but if I do it all
now, it'll make things hard when I run into the next design change.
And there's still some important core work that needs doing - checkout
and commit need to be a subcase of the core merge code.
--
Mathematics is the supreme nostalgia of our time.
* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Adam J. Richter @ 2005-04-30 14:44 UTC (permalink / raw)
To: andrea; +Cc: git, mpm
On 2005-04-30, Andrea Arcangeli wrote:
>On a bit more technical side, one thing I'm wondering about is the
>compression. If I change mercurial like this:
>
>--- revlog.py.~1~ 2005-04-29 01:33:14.000000000 +0200
>+++ revlog.py 2005-04-30 03:54:12.000000000 +0200
>@@ -11,9 +11,11 @@
> import zlib, struct, mdiff, sha, binascii, os, tempfile
>
> def compress(text):
>+ return text
> return zlib.compress(text)
>
> def decompress(bin):
>+ return bin
> return zlib.decompress(bin)
>
> def hash(text):
>
>
>the .hg directory size changes from 167M to 302M _BUT_ the _compressed_
>size of the .hg directory (i.e. like in a full network transfer with
>rsync -z or a tar.gz backup) changes from 55M to 38M:
>
>andrea@opteron:~/devel/kernel> du -sm hg-orig hg-aa hg-orig.tar.bz2 hg-aa.tar.bz2
>167 hg-orig
>302 hg-aa
>55 hg-orig.tar.bz2
>38 hg-aa.tar.bz2
>^^^^^^^^^^^^^^^^^^^^^ 38M backup and network transfer is what I want
>
>So I don't really see a huge benefit in compression, other than to
>slow down the checkins measurably [i.e. what Linus doesn't want] (the
>time of compression is a lot higher than the time of python runtime during
>checkin, so it's hard to believe your 100% boost with psyco in the hg file;
>sometimes psyco doesn't make any difference in fact. I'd rather prefer people to
>work on the real thing of generating native bytecode at compile time, rather
>than at runtime, like some Haskell compilers can do).
>
>mercurial is already good at decreasing the entropy by using an efficient
>storage format, it doesn't need to cheat by putting compression on each blob
>that can only lead to bad ratios when doing backups and while transferring
>more than one blob through the network.
I'd like to mention a couple of possible optimizations
for both the with and without compression approaches.
If you remove the gzip compression, then I imagine you could
do much of the IO of checking out files via sendfile, without
ever copying data to program space or even changing the program's
memory map. There apparently exists a python sendfile module.
If this mercurial were written in C, much of the rest of
the IO could be optimized with mmap (to reduce copies) and writev
in the absence of a compression pass. I don't know enough about
python to know if these optimizations are available.
On the other hand, if you recognize that there is a
duplication of the work of matching common substrings in
attempting to store files as differences and in most compression
algorithms, including zlib or bzip2, then you might want to
consider storing the files in a format like zdelta or vcdiff, where
differential storage and compression are combined by describing
a file in terms of copy operations both from other files and
_earlier byte ranges of itself_.
zdelta is a modification of zlib for this purpose, but
I see no permission grants associated with the author's copyright,
and I thought that zlib only looked at the previous 32kB of data.
Also, if you go this route, you might want to skip the
last phases of these compressors where they convert individual
characters into a more compact representation, which I think
would defeat inter-file pattern matching if you try to make
a compressed tar of the repository, and would preclude the
sendfile/mmap optimization (although they might not be worth
it at this level of granularity). Then again, since you're
naming your files by sha1 hashes, it follows that related files
will not be farther apart as the repository grows, so the
compression opportunities for larger repositories might be
less anyhow.
__ ______________
Adam J. Richter \ /
adam@yggdrasil.com | g g d r a s i l
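
The idea of describing a file by copy operations "both from other files and earlier byte ranges of itself" can be illustrated with a toy decoder. This is a sketch only: the `delta_op` instruction set below is invented for the example and is not the vcdiff or zdelta encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum op_kind { OP_ADD, OP_COPY_SRC, OP_COPY_SELF };

/* Hypothetical instruction; real vcdiff/zdelta encodings differ. */
struct delta_op {
	enum op_kind kind;
	const char *lit;	/* OP_ADD: literal bytes to append */
	size_t off;		/* OP_COPY_*: start offset of the copy */
	size_t len;		/* number of bytes this op produces */
};

/*
 * Rebuild a target buffer from a reference file (src) and a list of
 * operations. Returns the target length, or -1 on a bad operation or
 * if the output buffer is too small.
 */
long apply_delta(const char *src, size_t src_len,
		 const struct delta_op *ops, size_t nops,
		 char *out, size_t cap)
{
	size_t n = 0, i, j;

	assert(out != NULL);
	for (i = 0; i < nops; i++) {
		const struct delta_op *op = &ops[i];

		if (op->len > cap - n)
			return -1;	/* output buffer too small */
		switch (op->kind) {
		case OP_ADD:
			memcpy(out + n, op->lit, op->len);
			break;
		case OP_COPY_SRC:	/* copy from the other file */
			if (op->off > src_len || op->len > src_len - op->off)
				return -1;
			memcpy(out + n, src + op->off, op->len);
			break;
		case OP_COPY_SELF:	/* copy from earlier output */
			if (op->off >= n)
				return -1;
			/* Byte-wise on purpose: the ranges may overlap. */
			for (j = 0; j < op->len; j++)
				out[n + j] = out[op->off + j];
			break;
		default:
			return -1;
		}
		n += op->len;
	}
	return (long)n;
}
```

The self-copy case is where differencing and compression meet: an overlapping copy of 6 bytes starting at a 2-byte pattern expands it into three repetitions, exactly as an LZ77-style compressor would.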
* Re: questions about cg-update, cg-pull, and cg-clone.
From: Zack Brown @ 2005-04-30 15:48 UTC (permalink / raw)
To: David A. Wheeler; +Cc: Git Mailing List, Petr Baudis, xpasky
In-Reply-To: <4272EF69.2090806@dwheeler.com>
On Fri, Apr 29, 2005 at 10:37:29PM -0400, David A. Wheeler wrote:
> Zack Brown wrote:
> >Now, if the update is clean, a cg-commit is invoked automatically,
>
> Correct; cg-merge calls "cg-commit -C" (ignore cache)
> if the merge is clean.
>
> >and if the
> >update is not clean, I then have to resolve any conflicts and give the
> >cg-commit
> >command by hand.
>
> Correct.
>
> >But: what is the significance of either of these cg-commit
> >commands? Why should I have to write a changelog entry recording this
> >merge? All
> >I'm doing is updating my tree to be current. Why should I have to 'commit'
> >that
> >update?
>
> I can't speak for Petr, but I would guess that he's doing that because
> he's trying to avoid data loss.
So, what would be an appropriate comment for that commit? I have no idea
what is changing on my tree in that case, all I know is that I'm merging from
someone else. All I really want is their changes and their commit messages,
not one of my own that is meaningless.
So far I just type ^d when this happens, and leave the commit message blank.
Be well,
Zack
>
>
> --- David A. Wheeler
--
Zack Brown
* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Matt Mackall @ 2005-04-30 15:20 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel, git
In-Reply-To: <20050430025211.GP17379@opteron.random>
On Sat, Apr 30, 2005 at 04:52:11AM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 29, 2005 at 01:39:59PM -0700, Matt Mackall wrote:
> > Mercurial is amenable to rsync provided you devote a read-only
> > repository to it on the client side. In other words, you rsync from
> > kernel.org/mercurial/linus to local/linus and then you merge from
> > local/linus to your own branch. Mercurial's hashing hierarchy is
> > similar to git's (and Monotone's), so you can sign a single hash of
> > the tree as well.
>
> Ok fine. It's also interesting how you already enabled partial transfers
> through http.
>
> Please apply this patch so it doesn't fail on my setup ;)
>
> --- mercurial-0.4b/hg.~1~ 2005-04-29 02:52:52.000000000 +0200
> +++ mercurial-0.4b/hg 2005-04-30 00:53:02.000000000 +0200
> @@ -1,4 +1,4 @@
> -#!/usr/bin/python
> +#!/usr/bin/env python
Done.
> On a bit more technical side, one thing I'm wondering about is the
> compression. If I change mercurial like this:
>
> --- revlog.py.~1~ 2005-04-29 01:33:14.000000000 +0200
> +++ revlog.py 2005-04-30 03:54:12.000000000 +0200
> @@ -11,9 +11,11 @@
> import zlib, struct, mdiff, sha, binascii, os, tempfile
>
> def compress(text):
> + return text
> return zlib.compress(text)
>
> def decompress(bin):
> + return bin
> return zlib.decompress(bin)
>
> def hash(text):
>
>
> the .hg directory size changes from 167M to 302M _BUT_ the _compressed_
> size of the .hg directory (i.e. like in a full network transfer with
> rsync -z or a tar.gz backup) changes from 55M to 38M:
>
> andrea@opteron:~/devel/kernel> du -sm hg-orig hg-aa hg-orig.tar.bz2 hg-aa.tar.bz2
> 167 hg-orig
> 302 hg-aa
> 55 hg-orig.tar.bz2
> 38 hg-aa.tar.bz2
> ^^^^^^^^^^^^^^^^^^^^^ 38M backup and network transfer is what I want
>
> So I don't really see a huge benefit in compression, other than to
> slow down the checkins measurably [i.e. what Linus doesn't want] (the
> time of compression is a lot higher than the time of python runtime during
> checkin, so it's hard to believe your 100% boost with psyco in the hg file;
> sometimes psyco doesn't make any difference in fact. I'd rather prefer people to
> work on the real thing of generating native bytecode at compile time, rather
> than at runtime, like some Haskell compilers can do).
Most of that psyco speed up is accelerating subsequent diffs in
difflib, which you probably didn't hit yet.
> mercurial is already good at decreasing the entropy by using an efficient
> storage format, it doesn't need to cheat by putting compression on each blob
> that can only lead to bad ratios when doing backups and while transferring
> more than one blob through the network.
>
> So I suggest to try disabling compression optionally, perhaps it'll be even
> faster than git in the initial checkin that way! No need of compressing or
> decompressing anything with mercurial (unlike with git that would explode
> without control w/o compression).
I can make it some sort of environment variable, sure. I think the
speed is already in a domain where it's not a big deal though. There
are other things to do first, like unifying the merge/commit/update
code.
> Http is not intended for maximal efficiency, it's there just to make
> life easy. A special protocol with zlib is required for maximum
> efficiency.
Yeah, I've got a plan here.
> You also should move the .py into a hg directory, so that they won't
> pollute the site-packages.
Yep, I'm rather new to actually packaging my Python hacks.
--
Mathematics is the supreme nostalgia of our time.
* Re: The big git command renaming..
From: Nicolas Pitre @ 2005-04-30 14:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504291416190.18901@ppc970.osdl.org>
On Fri, 29 Apr 2005, Linus Torvalds wrote:
> So I just pushed out a change that renames the commands to always have a
> "git-" prefix. In addition, I renamed "show-diff" to "diff-files", which
> together with the prefix means that it becomes "git-diff-files" when used.
While at it, could you also rename show-files to ls-files?
Nicolas
* Re: The criss-cross merge case
From: Adam J. Richter @ 2005-04-30 12:32 UTC (permalink / raw)
To: wsc9tt; +Cc: barkalow, bram, droundry, git, ry102, tupshin
On Fri, 29 Apr 2005 07:19:18 -0500, Wayne Scott wrote:
>On 4/28/05, Adam J. Richter <adam@yggdrasil.com> wrote:
>> On 2005-04-28, Benedikt Schmidt wrote:
>> >AFAIK the paper mentioned in the GNU diff sources [1] is an improvement
>> >to an earlier paper by the same author titled
>> >"A File Comparison Program" - Miller, Myers - 1985.
>> [...]
>> >[1] http://citeseer.ist.psu.edu/myers86ond.html
>>
>> Monotone apparently uses a further acceleration of that algorithm
>> from the 1989 paper, also co-authored by Myers, "An O(NP) Sequence
>> Comparison Algorithm" by Sun Wu, Udi Manber, and Gene Myers.
>> http://www.eecs.berkeley.edu/~gene/Papers/np_diff.pdf . The Monotone
>> implementation was apparently a port of an implementation originally
>> written in Scheme by Aubrey Jaffer.
>>
>> I don't fully understand the 1989 paper, but I get the
>> general impression that it is a small change to the previous algorithm
>> (the one in GNU diff) that might be a 30 line patch if someone
>> got around to submitting it, and seems to make the code run more
>> than twice as fast in practice. One of these days, I will probably get
>> around to coding up a patch to GNU diff if nobody beats me to it.
>>
>> Making diff run faster may have at least one potentially useful
>> benefit for merging. A faster diff makes it more practical to run diff
>> on smaller units of comparison. I posted a note here before about
>> converting the input files to diff3 to have just one character per
>> line, and then undoing that transformation of the result to produce
>> a character based merge that seemed to work pretty well in the
>> couple of tests that I tried.
>I just read that paper and unless I am mistaken, it already describes
>the basis for how GNU diff works. I don't think anything in that
>paper would make it faster.
>
>I also don't find anything to suggest the Monotone guys have rewritten
>diff. Just some notes from graydon that mention python's difflib uses a
>non-optimal diff that is faster in some cases.
In terminology that can only be understood by reading
the 1985 paper, the 1989 paper describes a possible reduction
in the number of diagonals in the edit graph that iterations of the
1989 algorithm have to consider. I say "possible reduction" because
the reduction can be zero in the worst case, although I get the
impression that it should be a reduction of 50% or better
typically, and it makes the case where the change is just
a bunch of inserts run in linear time.
I believe that the longest common subsequence finder
at the core of GNU diff does not currently perform this optimization,
but the one in monotone-0.18/lcs.{cc,hh} does.
__ ______________
Adam J. Richter \ /
adam@yggdrasil.com | g g d r a s i l
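
The diagonal-pruning idea can be made concrete with a small C sketch of the O(NP) scheme as I understand it from the paper cited above. It is not the code from GNU diff or from monotone's lcs.cc; `onp_distance` and its helpers are names made up for this example, and it only computes the insert/delete edit distance rather than an edit script.

```c
#include <stdlib.h>
#include <string.h>

/* Follow diagonal k (k = y - x) from row y while the bytes match. */
static int snake(const char *a, int m, const char *b, int n, int k, int y)
{
	int x = y - k;

	while (x < m && y < n && a[x] == b[y]) {
		x++;
		y++;
	}
	return y;
}

static int max_int(int p, int q)
{
	return p > q ? p : q;
}

/*
 * Insert/delete edit distance between two strings. With a the shorter
 * string (length m) and b the longer (length n), only diagonals
 * -p .. delta+p are visited in round p, where delta = n - m and p is
 * the number of deletions. Returns -1 on allocation failure.
 */
int onp_distance(const char *a, const char *b)
{
	int m = (int)strlen(a), n = (int)strlen(b);
	int delta, p, k, result;
	int *buf, *fp;

	if (m > n) {		/* make a the shorter sequence */
		const char *ts = a;
		int ti = m;

		a = b; b = ts;
		m = n; n = ti;
	}
	delta = n - m;

	/* fp[k] = furthest row reached on diagonal k, -(m+1) <= k <= n+1 */
	buf = malloc((size_t)(m + n + 3) * sizeof(*buf));
	if (!buf)
		return -1;
	for (k = 0; k < m + n + 3; k++)
		buf[k] = -1;
	fp = buf + m + 1;

	for (p = 0; ; p++) {
		for (k = -p; k < delta; k++)
			fp[k] = snake(a, m, b, n, k,
				      max_int(fp[k - 1] + 1, fp[k + 1]));
		for (k = delta + p; k > delta; k--)
			fp[k] = snake(a, m, b, n, k,
				      max_int(fp[k - 1] + 1, fp[k + 1]));
		fp[delta] = snake(a, m, b, n, delta,
				  max_int(fp[delta - 1] + 1, fp[delta + 1]));
		if (fp[delta] == n)
			break;
	}
	result = delta + 2 * p;	/* delta inserts plus p delete/insert pairs */
	free(buf);
	return result;
}
```

When the longer string is the shorter one plus pure insertions, the loop ends in round p = 0 after touching only diagonals 0 .. delta, which is the linear-time case mentioned above.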
* Re: Trying to use AUTHOR_DATE
From: Edgar Toernig @ 2005-04-30 13:22 UTC (permalink / raw)
To: David Woodhouse; +Cc: Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <1114865964.24014.77.camel@localhost.localdomain>
David Woodhouse wrote:
>
> On Sat, 2005-04-30 at 14:49 +0200, Edgar Toernig wrote:
> > + if (tm.tm_sec > 59)
> > + return;
>
> During a leap second, won't tm_sec be 60? And in fact you don't seem to
> handle leap seconds at all, so isn't my_mktime going to be out by one
> second for every leap second which has occurred since 1970?
There are no leap-seconds on POSIX systems. They allow tm_sec
to be 60 but that's all - 00:00:60 is the same as 00:01:00.
Whether the check should be against 59 or 60? I don't care.
It's Linus' decision.
> There's a reason I'd rather just let glibc handle it :)
Good joke.
Ciao, ET.
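
The claim that 00:00:60 is the same as 00:01:00 follows directly from the seconds-since-epoch arithmetic, which has no notion of leap seconds. Below is a standalone sketch that mirrors the my_mktime arithmetic from the patch in this thread; `utc_seconds` is a hypothetical name and the `long long` return type is an assumption made here to avoid overflow on 32-bit `long`.

```c
#include <time.h>

/*
 * Seconds since the epoch for a broken-down UTC time, valid for
 * 1970-2099. Every day counts as exactly 86400 seconds, so tm_sec = 60
 * simply lands on the first second of the next minute.
 * Returns -1 for out-of-range input.
 */
long long utc_seconds(const struct tm *tm)
{
	static const int mdays[] = {
		0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
	};
	int year = tm->tm_year - 70;
	int month = tm->tm_mon;
	int day = tm->tm_mday;

	if (year < 0 || year > 129)	/* formula only works for 1970-2099 */
		return -1;
	if (month < 0 || month > 11)	/* array bounds */
		return -1;
	/* tm_mday is 1-based; keep the extra day only after Feb of a leap year */
	if (month < 2 || (year + 2) % 4)
		day--;
	return (year * 365LL + (year + 1) / 4 + mdays[month] + day) * 24 * 60 * 60 +
	       tm->tm_hour * 60 * 60 + tm->tm_min * 60 + tm->tm_sec;
}
```

For 2005-04-30 20:26:52 UTC this yields 1114892812, the same value as in the test-date output quoted earlier in the thread for 13:26:52 PDT.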
* Re: Trying to use AUTHOR_DATE
From: David Woodhouse @ 2005-04-30 12:59 UTC (permalink / raw)
To: Edgar Toernig; +Cc: Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <20050430144936.6b05cc90.froese@gmx.de>
On Sat, 2005-04-30 at 14:49 +0200, Edgar Toernig wrote:
> + if (tm.tm_sec > 59)
> + return;
During a leap second, won't tm_sec be 60? And in fact you don't seem to
handle leap seconds at all, so isn't my_mktime going to be out by one
second for every leap second which has occurred since 1970?
There's a reason I'd rather just let glibc handle it :)
It's not as if tm_gmtoff is particularly esoteric -- we inherited it
from BSD. Let's just use it and let both remaining HPUX users worry
about it themselves if they ever want to use git on their systems.
--
dwmw2
* Re: Trying to use AUTHOR_DATE
From: Edgar Toernig @ 2005-04-30 12:49 UTC (permalink / raw)
To: David Woodhouse; +Cc: Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <1114859594.24014.60.camel@localhost.localdomain>
David Woodhouse wrote:
>
> > + if (tm->tm_isdst > 0)
> > + offset += 60;
>
> Some locales have DST offsets which aren't 60 minutes, don't they?
Oh shit :-/
I grepped through the tz-database and it seems there's one
"country" left that has non-60-minute DST: Lord Howe Island.
All others dropped that before 1970.
Ok, here's a new version of the patch.
--- k/Makefile (mode:100644)
+++ l/Makefile (mode:100644)
@@ -28,7 +28,8 @@ all: $(PROG)
install: $(PROG) $(SCRIPTS)
install $(PROG) $(SCRIPTS) $(HOME)/bin/
-LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o tag.o
+LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
+ tag.o date.o
LIB_FILE=libgit.a
LIB_H=cache.h object.h blob.h tree.h commit.h tag.h
@@ -91,7 +92,6 @@ git-diff-tree-helper: diff-tree-helper.c
git-tar-tree: tar-tree.c
git-http-pull: LIBS += -lcurl
-git-commit-tree: LIBS += -lcurl
# Library objects..
blob.o: $(LIB_H)
--- k/cache.h (mode:100644)
+++ l/cache.h (mode:100644)
@@ -148,6 +148,9 @@ extern void *read_object_with_reference(
unsigned long *size,
unsigned char *sha1_ret);
+void parse_date(char *date, char *buf, int bufsize);
+void datestamp(char *buf, int bufsize);
+
static inline void *xmalloc(int size)
{
void *ret = malloc(size);
--- k/commit-tree.c (mode:100644)
+++ l/commit-tree.c (mode:100644)
@@ -10,7 +10,6 @@
#include <string.h>
#include <ctype.h>
#include <time.h>
-#include <curl/curl.h>
#define BLOCKING (1ul << 14)
@@ -81,24 +80,6 @@ static void remove_special(char *p)
}
}
-/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
- (i.e. English) day/month names, and it doesn't work correctly with %z. */
-static void parse_date(char *date, time_t *now, char *result, int maxlen)
-{
- char *p;
- time_t then;
-
- if ((then = curl_getdate(date, now)) == 0)
- return;
-
- /* find the timezone at the end */
- p = date + strlen(date);
- while (p > date && isdigit(*--p))
- ;
- if ((*p == '+' || *p == '-') && strlen(p) == 5)
- snprintf(result, maxlen, "%lu %5.5s", then, p);
-}
-
static void check_valid(unsigned char *sha1, const char *expect)
{
void *buf;
@@ -132,8 +113,6 @@ int main(int argc, char **argv)
char *audate;
char comment[1000];
struct passwd *pw;
- time_t now;
- struct tm *tm;
char *buffer;
unsigned int size;
@@ -163,10 +142,8 @@ int main(int argc, char **argv)
strcat(realemail, ".");
getdomainname(realemail+strlen(realemail), sizeof(realemail)-strlen(realemail)-1);
}
- time(&now);
- tm = localtime(&now);
- strftime(realdate, sizeof(realdate), "%s %z", tm);
+ datestamp(realdate, sizeof(realdate));
strcpy(date, realdate);
commitgecos = getenv("COMMIT_AUTHOR_NAME") ? : realgecos;
@@ -175,7 +152,7 @@ int main(int argc, char **argv)
email = getenv("AUTHOR_EMAIL") ? : realemail;
audate = getenv("AUTHOR_DATE");
if (audate)
- parse_date(audate, &now, date, sizeof(date));
+ parse_date(audate, date, sizeof(date));
remove_special(gecos); remove_special(realgecos); remove_special(commitgecos);
remove_special(email); remove_special(realemail); remove_special(commitemail);
--- /dev/null
+++ l/date.c (mode:100644)
@@ -0,0 +1,184 @@
+/*
+ * GIT - The information manager from hell
+ *
+ * Copyright (C) Linus Torvalds, 2005
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <ctype.h>
+#include <time.h>
+
+static time_t my_mktime(struct tm *tm)
+{
+ static const int mdays[] = {
+ 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
+ };
+ int year = tm->tm_year - 70;
+ int month = tm->tm_mon;
+ int day = tm->tm_mday;
+
+ if (year < 0 || year > 129) /* algo only works for 1970-2099 */
+ return -1;
+ if (month < 0 || month > 11) /* array bounds */
+ return -1;
+ if (month < 2 || (year + 2) % 4)
+ day--;
+ return (year * 365 + (year + 1) / 4 + mdays[month] + day) * 24*60*60UL +
+ tm->tm_hour * 60*60 + tm->tm_min * 60 + tm->tm_sec;
+}
+
+static const char *month_names[] = {
+ "Jan", "Feb", "Mar", "Apr", "May", "Jun",
+ "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+ "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+
+static char *skipfws(char *str)
+{
+ while (isspace(*str))
+ str++;
+ return str;
+}
+
+
+/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
+ (i.e. English) day/month names, and it doesn't work correctly with %z. */
+void parse_date(char *date, char *result, int maxlen)
+{
+ struct tm tm;
+ char *p, *tz;
+ int i, offset;
+ time_t then;
+
+ memset(&tm, 0, sizeof(tm));
+
+ /* Skip day-name */
+ p = skipfws(date);
+ if (!isdigit(*p)) {
+ for (i=0; i<7; i++) {
+ if (!strncmp(p,weekday_names[i],3) && p[3] == ',') {
+ p = skipfws(p+4);
+ goto day;
+ }
+ }
+ return;
+ }
+
+ /* day */
+ day:
+ tm.tm_mday = strtoul(p, &p, 10);
+
+ if (tm.tm_mday < 1 || tm.tm_mday > 31)
+ return;
+
+ if (!isspace(*p))
+ return;
+
+ p = skipfws(p);
+
+ /* month */
+
+ for (i=0; i<12; i++) {
+ if (!strncmp(p, month_names[i], 3) && isspace(p[3])) {
+ tm.tm_mon = i;
+ p = skipfws(p+strlen(month_names[i]));
+ goto year;
+ }
+ }
+ return; /* Error -- bad month */
+
+ /* year */
+ year:
+ tm.tm_year = strtoul(p, &p, 10);
+
+ if (!tm.tm_year && !isspace(*p))
+ return;
+
+ if (tm.tm_year > 1900)
+ tm.tm_year -= 1900;
+
+ p=skipfws(p);
+
+ /* hour */
+ if (!isdigit(*p))
+ return;
+ tm.tm_hour = strtoul(p, &p, 10);
+
+ if (tm.tm_hour > 23)
+ return;
+
+ if (*p != ':')
+ return; /* Error -- bad time */
+ p++;
+
+ /* minute */
+ if (!isdigit(*p))
+ return;
+ tm.tm_min = strtoul(p, &p, 10);
+
+ if (tm.tm_min > 59)
+ return;
+
+ if (*p != ':')
+ goto zone;
+ p++;
+
+ /* second */
+ if (!isdigit(*p))
+ return;
+ tm.tm_sec = strtoul(p, &p, 10);
+
+ if (tm.tm_sec > 59)
+ return;
+
+ zone:
+ if (!isspace(*p))
+ return;
+
+ p = skipfws(p);
+
+ if (*p == '-')
+ offset = -60;
+ else if (*p == '+')
+ offset = 60;
+ else
+ return;
+
+ if (!isdigit(p[1]) || !isdigit(p[2]) || !isdigit(p[3]) || !isdigit(p[4]))
+ return;
+
+ tz = p;
+ i = strtoul(p+1, NULL, 10);
+ offset *= ((i % 100) + ((i / 100) * 60));
+
+ p = skipfws(p + 5);
+ if (*p && *p != '(') /* trailing comment like (EDT) is ok */
+ return;
+
+ then = my_mktime(&tm); /* mktime uses local timezone */
+ if (then == -1)
+ return;
+
+ then -= offset;
+
+ snprintf(result, maxlen, "%lu %5.5s", then, tz);
+}
+
+void datestamp(char *buf, int bufsize)
+{
+ time_t now;
+ int offset;
+
+ time(&now);
+
+ offset = my_mktime(localtime(&now)) - now;
+ offset /= 60;
+
+ snprintf(buf, bufsize, "%lu %+05d", now, offset/60*100 + offset%60);
+}
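
For anyone following the arithmetic, here is a minimal standalone sketch of
the idea behind the new datestamp(): turn the local broken-down time into
seconds as if it were UTC, then subtract the real timestamp to get the zone
offset, DST included. The names utc_seconds() and utc_offset_minutes() are
made up for this illustration; the day-count formula is the one from
my_mktime() in the patch, and long long is an assumption used here only to
sidestep 32-bit overflow.

```c
#include <time.h>

/*
 * Hypothetical helper: seconds since the epoch for a broken-down time
 * read as UTC. Like my_mktime() in the patch it only covers 1970-2099.
 */
static long long utc_seconds(const struct tm *tm)
{
	static const int mdays[] = {
		0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
	};
	int year = tm->tm_year - 70;
	int month = tm->tm_mon;
	int day = tm->tm_mday;

	if (year < 0 || year > 129 || month < 0 || month > 11)
		return -1;
	/*
	 * tm_mday is 1-based and mdays[] assumes a non-leap year, so
	 * subtract one day unless we are past February of a leap year.
	 */
	if (month < 2 || (year + 2) % 4)
		day--;
	return (year * 365LL + (year + 1) / 4 + mdays[month] + day) * 86400LL
		+ tm->tm_hour * 3600LL + tm->tm_min * 60LL + tm->tm_sec;
}

/* Hypothetical helper: minutes east of UTC at the instant `now`. */
static int utc_offset_minutes(time_t now)
{
	struct tm *tm = localtime(&now);

	if (!tm)
		return 0;
	return (int)((utc_seconds(tm) - (long long)now) / 60);
}
```

Because the offset comes out of localtime() itself, a 30-minute DST shift
such as Lord Howe Island's needs no special case.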
Ciao, ET.
^ permalink raw reply
* Re: description
From: Kay Sievers @ 2005-04-30 12:43 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Git Mailing List
In-Reply-To: <4272CDF1.9060207@zytor.com>
On Fri, Apr 29, 2005 at 05:14:41PM -0700, H. Peter Anvin wrote:
> I guess this is technically speaking complete nonstandard addition to
> git :) but I have added the following to the script that generates
> http://www.kernel.org/git/:
>
> If there is a plain text file called "description" in the .git
> directory, it will appear on that webpage.
That's a nice idea. Could we add the owner of the repo to this file too?
Or does your script know that from the permissions? Does the project
overview page update itself?
Here is a new gitweb.cgi version that integrates with that page. Its
own project browser has been removed, and the top link now points to your
link list.
!! It already uses the new binaries with the git- prefix. !!
See it working here:
http://ehlo.org/~kay/git/
Get the cgi from here:
ftp://ehlo.org/git/gitweb.cgi
!! Just remove lines 26-30, which are the settings to run on my box. !!
Thanks,
Kay
^ permalink raw reply
* Re: Trying to use AUTHOR_DATE
From: David Woodhouse @ 2005-04-30 12:13 UTC (permalink / raw)
To: Kay Sievers
Cc: Edgar Toernig, Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <1114862920.17673.1.camel@localhost.localdomain>
On Sat, 2005-04-30 at 14:08 +0200, Kay Sievers wrote:
> Yes, some have half-hour offsets:
> http://www.timeanddate.com/library/abbreviations/timezones/au/nft.html
That doesn't count -- that timezone is honoured all year round. We're
talking about the difference between wintertime and summertime in any
given locale.
TBH I think I'd rather just put a gmt_mktime() which uses my trick of
looking at tm.tm_gmtoff after the mktime call into a separate file
wrapped in #ifdef GLIBC and let anyone else who really cares about their
own non-BSD-compatible system worry about whether it works there and fix
it up accordingly.
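
A minimal sketch of what such a gmt_mktime() could look like, assuming a
libc whose struct tm has the tm_gmtoff field (glibc and the BSDs do; it is
not in POSIX). Setting tm_isdst to -1 is an assumption of this sketch so
that mktime() works out for itself whether DST applies:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* assumption: asks glibc to expose tm_gmtoff */
#endif
#include <time.h>

/*
 * Interpret *tm as UTC. mktime() reads the fields as local time and, on
 * glibc/BSD, fills in tm_gmtoff with the local offset (seconds east of
 * UTC) for that moment. Adding the offset back gives the UTC reading.
 */
static time_t gmt_mktime(struct tm *tm)
{
	time_t local;

	tm->tm_isdst = -1;	/* let mktime() decide about DST */
	local = mktime(tm);
	if (local == (time_t)-1)
		return (time_t)-1;
	return local + tm->tm_gmtoff;
}
```

Wall-clock times that fall inside a DST transition are ambiguous for
mktime(), so results around those hours should not be relied on.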
--
dwmw2
^ permalink raw reply
* Re: Trying to use AUTHOR_DATE
From: Kay Sievers @ 2005-04-30 12:08 UTC (permalink / raw)
To: David Woodhouse
Cc: Edgar Toernig, Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <1114859594.24014.60.camel@localhost.localdomain>
On Sat, 2005-04-30 at 12:13 +0100, David Woodhouse wrote:
> On Sat, 2005-04-30 at 12:53 +0200, Edgar Toernig wrote:
> > + tm = localtime(&now); /* get timezone and tm_isdst */
> > + offset = -timezone / 60;
> > + if (tm->tm_isdst > 0)
> > + offset += 60;
>
> Some locales have DST offsets which aren't 60 minutes, don't they?
Yes, some have half-hour offsets:
http://www.timeanddate.com/library/abbreviations/timezones/au/nft.html
Kay
^ permalink raw reply
* git compatibility patches
From: Edgar Toernig @ 2005-04-30 11:40 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
With these four patches and the previous date patch git compiles
and works on my good old gcc2.7/libc5/2.0-kernel system.
The first one: support for pre-1.2 zlib:
--- k/cache.h (mode:100644)
+++ l/cache.h (mode:100644)
@@ -17,6 +17,10 @@
#include SHA1_HEADER
#include <zlib.h>
+#if ZLIB_VERNUM < 0x1200
+#define deflateBound(c,s) ((s) + (((s) + 7) >> 3) + (((s) + 63) >> 6) + 11)
+#endif
+
/*
* Basic data structures for the directory cache
*
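
To get a feel for how much headroom the fallback macro reserves, the same
arithmetic can be written as a function. fallback_deflate_bound() is a
hypothetical name used only here; it mirrors the macro in the patch and
makes no claim about what any particular zlib release computes.

```c
/*
 * Worst-case output size used to allocate the compression buffer:
 * the input size, plus one eighth and one sixty-fourth of it (both
 * rounded up), plus 11 bytes of fixed overhead.
 */
static unsigned long fallback_deflate_bound(unsigned long s)
{
	return s + ((s + 7) >> 3) + ((s + 63) >> 6) + 11;
}
```

For a 1000-byte object this reserves 1152 bytes, and an empty input still
gets the 11 bytes of fixed overhead.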
The second one: missing dirent.d_type field
--- k/cache.h (mode:100644)
+++ l/cache.h (mode:100644)
@@ -21,6 +21,15 @@
#define deflateBound(c,s) ((s) + (((s) + 7) >> 3) + (((s) + 63) >> 6) + 11)
#endif
+#ifdef DT_UNKNOWN
+#define DTYPE(de) ((de)->d_type)
+#else
+#define DT_UNKNOWN 0
+#define DT_DIR 1
+#define DT_REG 2
+#define DTYPE(de) DT_UNKNOWN
+#endif
+
/*
* Basic data structures for the directory cache
*
--- k/show-files.c (mode:100644)
+++ l/show-files.c (mode:100644)
@@ -129,7 +129,7 @@ static void read_directory(const char *p
len = strlen(de->d_name);
memcpy(fullname + baselen, de->d_name, len+1);
- switch (de->d_type) {
+ switch (DTYPE(de)) {
struct stat st;
default:
continue;
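
With DTYPE() always yielding DT_UNKNOWN on such systems, the caller has to
find the type some other way. A minimal sketch, assuming the fallback is an
lstat() call (the `struct stat st;` in the hunk context hints at that, but
the rest of show-files.c is not shown here). resolve_type() is a
hypothetical helper, not part of the patch:

```c
#include <dirent.h>
#include <sys/stat.h>

/* Same compatibility block as in the patch. */
#ifdef DT_UNKNOWN
#define DTYPE(de) ((de)->d_type)
#else
#define DT_UNKNOWN 0
#define DT_DIR 1
#define DT_REG 2
#define DTYPE(de) DT_UNKNOWN
#endif

/*
 * Return DT_DIR or DT_REG for `path`. A known dtype is passed through
 * untouched; only DT_UNKNOWN costs an lstat() call. Anything that is
 * neither a directory nor a regular file stays DT_UNKNOWN.
 */
static int resolve_type(const char *path, int dtype)
{
	struct stat st;

	if (dtype != DT_UNKNOWN)
		return dtype;
	if (lstat(path, &st))
		return DT_UNKNOWN;
	if (S_ISDIR(st.st_mode))
		return DT_DIR;
	if (S_ISREG(st.st_mode))
		return DT_REG;
	return DT_UNKNOWN;
}
```

A directory walker would call it as resolve_type(fullname, DTYPE(de)), so
systems that do have d_type keep the cheap path.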
The third one: replace AF_LOCAL with AF_UNIX (there's no AF_LOCAL in POSIX).
--- k/rsh.c (mode:100644)
+++ l/rsh.c (mode:100644)
@@ -48,7 +48,7 @@ int setup_connection(int *fd_in, int *fd
}
}
strcpy(posn, " -");
- if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv)) {
+ if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv)) {
return error("Couldn't create socket");
}
if (!fork()) {
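
For reference, a minimal sketch of the call in isolation: create an AF_UNIX
stream pair, write on one end and read the bytes back on the other.
roundtrip_socketpair() is a hypothetical helper for illustration, not code
from rsh.c.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send `len` bytes through a connected AF_UNIX socket pair and read them
 * back into `out`. Returns the number of bytes read, or -1 on error.
 */
static int roundtrip_socketpair(const char *msg, size_t len, char *out)
{
	int sv[2];
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;
	if (write(sv[0], msg, len) == (ssize_t)len)
		n = read(sv[1], out, len);
	close(sv[0]);
	close(sv[1]);
	return (int)n;
}
```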
And the last one: move variable declarations to the start of the function.
--- k/tag.c (mode:100644)
+++ l/tag.c (mode:100644)
@@ -26,6 +26,10 @@ int parse_tag(struct tag *item)
char type[20];
void *data, *bufptr;
unsigned long size;
+ int typelen, taglen;
+ unsigned char object[20];
+ const char *type_line, *tag_line, *sig_line;
+
if (item->object.parsed)
return 0;
item->object.parsed = 1;
@@ -36,10 +40,6 @@ int parse_tag(struct tag *item)
if (strcmp(type, tag_type))
return error("Object %s not a tag",
sha1_to_hex(item->object.sha1));
-
- int typelen, taglen;
- unsigned char object[20];
- const char *type_line, *tag_line, *sig_line;
if (size < 64)
return -1;
Ciao, ET.
^ permalink raw reply
* Re: Trying to use AUTHOR_DATE
From: David Woodhouse @ 2005-04-30 11:13 UTC (permalink / raw)
To: Edgar Toernig; +Cc: Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <20050430125333.2bd81b18.froese@gmx.de>
On Sat, 2005-04-30 at 12:53 +0200, Edgar Toernig wrote:
> + tm = localtime(&now); /* get timezone and tm_isdst */
> + offset = -timezone / 60;
> + if (tm->tm_isdst > 0)
> + offset += 60;
Some locales have DST offsets which aren't 60 minutes, don't they?
--
dwmw2
^ permalink raw reply
* Re: How to get bash to shut up about SIGPIPE?
From: Rene Scharfe @ 2005-04-30 11:04 UTC (permalink / raw)
To: Paul Jackson, Linus Torvalds; +Cc: git, pasky
In-Reply-To: <20050429232922.03057aba.pj@sgi.com>
On Fri, Apr 29, 2005 at 11:29:22PM -0700, Paul Jackson wrote:
> Linus replied to pj:
> > > Code Sample 2:
> > > ...
> > Didn't change anything for me. Same thing.
>
> I don't believe you did what I did.
>
> The source code for bash, both 2.x and 3.x versions, clearly displays a
> simpler error message (no line number or redisplay of your script
> commands) in the case that you set a trap. And I tested both shells on
> a multiprocessor, to verify that they behaved as I expected, running
> these silly little scripts.
I don't have a multiprocessor and I see the same. Are you sure it's SMP
dependent?
Your solution (trapping _inside_ the job, too) works for me, btw. Here's
a patch for cg-log that reduces the clutter to two "Broken pipe" lines
(pun not intended).
Rene
--- cg-log~ 2005-04-29 23:43:09.000000000 +0200
+++ cg-log 2005-04-30 12:15:40.000000000 +0200
@@ -16,6 +16,7 @@
# or id1:id2 representing an (id1;id2] range of commits to show.
. cg-Xlib
+trap exit SIGPIPE
if [ "$1" = "-c" ]; then
shift
@@ -47,6 +48,7 @@
fi
$revls | $revsort | while read time commit parents; do
+ trap exit SIGPIPE
[ "$revfmt" = "rev-list" ] && commit="$time"
echo $colheader""commit ${commit%:*} $coldefault;
cat-file commit $commit | \
^ permalink raw reply
* Re: Trying to use AUTHOR_DATE
From: Edgar Toernig @ 2005-04-30 10:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: H. Peter Anvin, Luck, Tony, git
In-Reply-To: <Pine.LNX.4.58.0504292114580.2296@ppc970.osdl.org>
Linus Torvalds wrote:
>
> Edgar, willing to create a separate "parse-date.c" with your "my_mktime()"
> thing and move the old date parsing there? That way we'll just use that
> instead of libcurl..
Here it is. I moved the strftime stuff too (workaround for non-standard
%s %z sequence).
--- k/Makefile (mode:100644)
+++ l/Makefile (mode:100644)
@@ -28,7 +28,8 @@ all: $(PROG)
install: $(PROG) $(SCRIPTS)
install $(PROG) $(SCRIPTS) $(HOME)/bin/
-LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o tag.o
+LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
+ tag.o date.o
LIB_FILE=libgit.a
LIB_H=cache.h object.h blob.h tree.h commit.h tag.h
--- k/cache.h (mode:100644)
+++ l/cache.h (mode:100644)
@@ -147,6 +160,9 @@ extern void *read_object_with_reference(
const unsigned char *required_type,
unsigned long *size,
unsigned char *sha1_ret);
+
+void parse_date(char *date, char *buf, int bufsize);
+void datestamp(char *buf, int bufsize);
static inline void *xmalloc(int size)
{
--- k/commit-tree.c (mode:100644)
+++ l/commit-tree.c (mode:100644)
@@ -10,7 +10,6 @@
#include <string.h>
#include <ctype.h>
#include <time.h>
-#include <curl/curl.h>
#define BLOCKING (1ul << 14)
@@ -81,24 +80,6 @@ static void remove_special(char *p)
}
}
-/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
- (i.e. English) day/month names, and it doesn't work correctly with %z. */
-static void parse_date(char *date, time_t *now, char *result, int maxlen)
-{
- char *p;
- time_t then;
-
- if ((then = curl_getdate(date, now)) == 0)
- return;
-
- /* find the timezone at the end */
- p = date + strlen(date);
- while (p > date && isdigit(*--p))
- ;
- if ((*p == '+' || *p == '-') && strlen(p) == 5)
- snprintf(result, maxlen, "%lu %5.5s", then, p);
-}
-
static void check_valid(unsigned char *sha1, const char *expect)
{
void *buf;
@@ -132,8 +113,6 @@ int main(int argc, char **argv)
char *audate;
char comment[1000];
struct passwd *pw;
- time_t now;
- struct tm *tm;
char *buffer;
unsigned int size;
@@ -163,10 +142,8 @@ int main(int argc, char **argv)
strcat(realemail, ".");
getdomainname(realemail+strlen(realemail), sizeof(realemail)-strlen(realemail)-1);
}
- time(&now);
- tm = localtime(&now);
- strftime(realdate, sizeof(realdate), "%s %z", tm);
+ datestamp(realdate, sizeof(realdate));
strcpy(date, realdate);
commitgecos = getenv("COMMIT_AUTHOR_NAME") ? : realgecos;
@@ -175,7 +152,7 @@ int main(int argc, char **argv)
email = getenv("AUTHOR_EMAIL") ? : realemail;
audate = getenv("AUTHOR_DATE");
if (audate)
- parse_date(audate, &now, date, sizeof(date));
+ parse_date(audate, date, sizeof(date));
remove_special(gecos); remove_special(realgecos); remove_special(commitgecos);
remove_special(email); remove_special(realemail); remove_special(commitemail);
--- /dev/null
+++ l/date.c (mode:100644)
@@ -0,0 +1,187 @@
+/*
+ * GIT - The information manager from hell
+ *
+ * Copyright (C) Linus Torvalds, 2005
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <ctype.h>
+#include <time.h>
+
+static time_t my_mktime(struct tm *tm)
+{
+ static const int mdays[] = {
+ 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
+ };
+ int year = tm->tm_year - 70;
+ int month = tm->tm_mon;
+ int day = tm->tm_mday;
+
+ if (year < 0 || year > 129) /* algo only works for 1970-2099 */
+ return -1;
+ if (month < 0 || month > 11) /* array bounds */
+ return -1;
+ if (month < 2 || (year + 2) % 4)
+ day--;
+ return (year * 365 + (year + 1) / 4 + mdays[month] + day) * 24*60*60UL +
+ tm->tm_hour * 60*60 + tm->tm_min * 60 + tm->tm_sec;
+}
+
+static const char *month_names[] = {
+ "Jan", "Feb", "Mar", "Apr", "May", "Jun",
+ "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+ "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+
+static char *skipfws(char *str)
+{
+ while (isspace(*str))
+ str++;
+ return str;
+}
+
+
+/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
+ (i.e. English) day/month names, and it doesn't work correctly with %z. */
+void parse_date(char *date, char *result, int maxlen)
+{
+ struct tm tm;
+ char *p, *tz;
+ int i, offset;
+ time_t then;
+
+ memset(&tm, 0, sizeof(tm));
+
+ /* Skip day-name */
+ p = skipfws(date);
+ if (!isdigit(*p)) {
+ for (i=0; i<7; i++) {
+ if (!strncmp(p,weekday_names[i],3) && p[3] == ',') {
+ p = skipfws(p+4);
+ goto day;
+ }
+ }
+ return;
+ }
+
+ /* day */
+ day:
+ tm.tm_mday = strtoul(p, &p, 10);
+
+ if (tm.tm_mday < 1 || tm.tm_mday > 31)
+ return;
+
+ if (!isspace(*p))
+ return;
+
+ p = skipfws(p);
+
+ /* month */
+
+ for (i=0; i<12; i++) {
+ if (!strncmp(p, month_names[i], 3) && isspace(p[3])) {
+ tm.tm_mon = i;
+ p = skipfws(p+strlen(month_names[i]));
+ goto year;
+ }
+ }
+ return; /* Error -- bad month */
+
+ /* year */
+ year:
+ tm.tm_year = strtoul(p, &p, 10);
+
+ if (!tm.tm_year && !isspace(*p))
+ return;
+
+ if (tm.tm_year > 1900)
+ tm.tm_year -= 1900;
+
+ p=skipfws(p);
+
+ /* hour */
+ if (!isdigit(*p))
+ return;
+ tm.tm_hour = strtoul(p, &p, 10);
+
+ if (tm.tm_hour > 23)
+ return;
+
+ if (*p != ':')
+ return; /* Error -- bad time */
+ p++;
+
+ /* minute */
+ if (!isdigit(*p))
+ return;
+ tm.tm_min = strtoul(p, &p, 10);
+
+ if (tm.tm_min > 59)
+ return;
+
+ if (*p != ':')
+ goto zone;
+ p++;
+
+ /* second */
+ if (!isdigit(*p))
+ return;
+ tm.tm_sec = strtoul(p, &p, 10);
+
+ if (tm.tm_sec > 59)
+ return;
+
+ zone:
+ if (!isspace(*p))
+ return;
+
+ p = skipfws(p);
+
+ if (*p == '-')
+ offset = -60;
+ else if (*p == '+')
+ offset = 60;
+ else
+ return;
+
+ if (!isdigit(p[1]) || !isdigit(p[2]) || !isdigit(p[3]) || !isdigit(p[4]))
+ return;
+
+ tz = p;
+ i = strtoul(p+1, NULL, 10);
+ offset *= ((i % 100) + ((i / 100) * 60));
+
+ p = skipfws(p + 5);
+ if (*p && *p != '(') /* trailing comment like (EDT) is ok */
+ return;
+
+ then = my_mktime(&tm); /* mktime uses local timezone */
+ if (then == -1)
+ return;
+
+ then -= offset;
+
+ snprintf(result, maxlen, "%lu %5.5s", then, tz);
+}
+
+void datestamp(char *buf, int bufsize)
+{
+ time_t now;
+ struct tm *tm;
+ int offset;
+
+ time(&now);
+
+ tm = localtime(&now); /* get timezone and tm_isdst */
+ offset = -timezone / 60;
+ if (tm->tm_isdst > 0)
+ offset += 60;
+
+ snprintf(buf, bufsize, "%lu %+05d", now, offset/60*100 + offset%60);
+}
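
The zone handling in parse_date() can also be read in isolation. A minimal
sketch, assuming a hypothetical helper parse_zone() (not part of the patch)
that takes the four digits one by one instead of going through strtoul():

```c
#include <ctype.h>

/*
 * Parse a "+HHMM" or "-HHMM" zone at `p`. On success store the offset
 * in seconds east of UTC in *seconds and return 0; return -1 otherwise.
 */
static int parse_zone(const char *p, int *seconds)
{
	int sign, hours, minutes;

	if (*p == '-')
		sign = -1;
	else if (*p == '+')
		sign = 1;
	else
		return -1;

	/* && stops at the first non-digit, so we never read past the NUL */
	if (!isdigit((unsigned char)p[1]) || !isdigit((unsigned char)p[2]) ||
	    !isdigit((unsigned char)p[3]) || !isdigit((unsigned char)p[4]))
		return -1;

	hours = (p[1] - '0') * 10 + (p[2] - '0');
	minutes = (p[3] - '0') * 10 + (p[4] - '0');
	*seconds = sign * (hours * 60 + minutes) * 60;
	return 0;
}
```

parse_date() then subtracts this offset from the my_mktime() result, since
a wall-clock time east of UTC corresponds to an earlier UTC instant.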
Ciao, ET.
^ permalink raw reply
* Re: Trying to use AUTHOR_DATE
From: Edgar Toernig @ 2005-04-30 10:40 UTC (permalink / raw)
To: David Woodhouse; +Cc: Russ Allbery, Linus Torvalds, git
In-Reply-To: <1114848175.24014.35.camel@localhost.localdomain>
David Woodhouse wrote:
>
> Eww. The time functions we have to play with _really_ suck, don't they?
> How about this...
>
> + then += tm.tm_gmtoff;
tm_gmtoff is not available everywhere - POSIX doesn't even mention it (BSD?).
Oh btw, while we're on the subject of time functions that suck: the %s and
%z strftime sequences used further down are also non-standard (POSIX has
no %s, old libc has neither %s nor %z).
A possible workaround:
void make_datestamp(char *buf)
{
	time_t now;
	struct tm *tm;
	int tz;

	time(&now);
	tm = localtime(&now);	/* get timezone and tm_isdst */
	tz = -timezone / 60;
	if (tm->tm_isdst > 0)
		tz += 60;
	sprintf(buf, "%lu %+05d", now, tz/60*100+tz%60);
}
That *should* work on any POSIX system but who knows ...
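
The tz/60*100+tz%60 expression is the only non-obvious part: it turns an
offset in minutes into a decimal number whose digits read HHMM. A minimal
sketch that takes the offset as a parameter so it can be checked on its
own; format_stamp() is a hypothetical name, and snprintf() is used here as
an assumption about the available libc:

```c
#include <stdio.h>

/*
 * Write "<seconds> <+HHMM>" into buf. tz_minutes is the offset east of
 * UTC in minutes, e.g. 120 becomes 200 and prints as "+0200". With C99
 * truncating division -330 becomes -530 and prints as "-0530".
 */
static void format_stamp(char *buf, size_t size, unsigned long when,
			 int tz_minutes)
{
	snprintf(buf, size, "%lu %+05d", when,
		 tz_minutes / 60 * 100 + tz_minutes % 60);
}
```

The %+05d conversion forces the sign and zero-pads to five characters, so
UTC itself comes out as "+0000".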
Ciao, ET.
^ permalink raw reply