From: Lea Wiemann <lewiemann@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org, John Hawley <warthog19@eaglescrag.net>,
Petr Baudis <pasky@suse.cz>
Subject: Re: [PATCH 2/3] add new Git::Repo API
Date: Fri, 18 Jul 2008 15:40:01 +0200 [thread overview]
Message-ID: <48809D31.5030008@gmail.com> (raw)
In-Reply-To: <200807180149.18565.jnareb@gmail.com>
Jakub Narebski wrote:
> [$commit->author:] We would probably want _in the future_ to return some
> object wrapper, which stringifies to value of author and committer headers
Yup, good idea. They'll even stay strings, they'll just be blessed.
>>>> +=item $commit->message
>
> I'd rather then have _git_ convert it to UTF=8 for us (using
> --encoding=<encoding> option to git-log/git-rev-list)
Yeah, I guess the API should actually decode it. You wouldn't want to
have the message in UTF-8 but in Unicode (I suggest you read man
perlunitut if you haven't done so). We cannot have git do the decoding,
since (apart from the fact that it doesn't smell right) it isn't
guaranteed to emit valid UTF-8 (thanks Junio for the pointer):
Lea Wiemann: Does anyone know off the top of their heads how git handles
character decoding errors in commands like git log? [...]
Junio 'gitster' Hamano: silently punt and show the original unmolested.
Junio 'gitster' Hamano: cf. pretty.c:pretty_print_commit()
So we're not guaranteed to be able to, in turn, decode git's output into
actual characters since it might just be byte soup.
Hence, how about this fallback strategy:
1. Decode according to the encoding header.
2. Decode as UTF-8 (passing through byte soup is often equivalent to
decoding UTF-8 since many terminals use UTF-8, and trying UTF-8 is
reasonably safe).
3. Decode as Latin1.
(Not that the fallbacks will matter a lot in practice, I think.)
> It is (much) better than forking git-cat-file for each commit shown
> on the list; nevertheless I think that it would be better to use git-log
> to generate list (or Git::Revlist) of Git::Commit objects. It is one
> fork less, but what more important you don't have to access repository
> twice for the very same objects.
You're confused; it's not one fork less, it's a write to a pipe less.
(Pleeeease look at the code before you write something. It's there, in
this very thread.) And I don't believe the "access the repository
twice" thing is anywhere near an actual issue. To summarize, you're
asking me to (a) write code and (presumably) (b) add something to the
interface of a public API, based on some (most probably faulty)
assumptions about performance? You should really read
<http://c2.com/cgi/wiki?PrematureOptimization>.
> if I understand correctly for log-like views you
> propose to first run simple git-rev-list [...], then feed list of
> revisions (perhaps via cache, i.e. excluding objects which are in
> cache) to 'git cat-file --batch' open two-directional pipeline.
Yup, it's an option, though currently it's a single cached call to git
log (or git rev-list).
> What I propose instead is to provide alternate method to fully
> instantiate Git::Commit object (in addition to ->_load), which would
> fill fields by parsing git-log / git-rev-list --headers output
Yes, but this would need a method in the API, it's not an optimization
that falls out for free. Cluttering an API for some obscure (= very
doubtful) optimization? Bad Idea.(tm)
> "git cat-file --batch" should have commits to be
> accessed in filesystem cache, which means in memory; but it is possible
> that they wouldn't be in cache because of I/O pressure
No. Page cache turnover time is at least around 10 seconds (and that's
under fairly artificial conditions), definitely not in the millisecond
range.
> I think that _not using_ Git::Cmd (or somesuch) API results in botched,
> horrible API
> our $git_version = $repo_root->repo(directory => 'dummy')->version;
> (Unless it is not needed any longer, or not used any longer; if it is
> so, then perhaps implementing Git::Cmd as generic wrapper around git
> commands, hiding for example ActivePerl hack, could be left for later).
It isn't used any longer -- I really suggest you read the whole thread
before replying. ;-)
>> As I wrote in my reply to Petr [...]
>
> Just a question: was this reply only to him, or to all?
To all, otherwise I wouldn't have Cc'ed the list.
>> I wouldn't -- see my blurb about error handling at the top of my reply
>> to Petr (<487BD0F3.2060508@gmail.com>). You're not supposed to pass
>> anything that you didn't get from get_sha1 into Git::Commit or
>> Git::Tag constructors, or your error handling is invariably broken.
>
> I can understand this simpler, although less than optimal, and geared
> mainly towards gitweb needs.
FTR, yes it is simpler, but no, it is not really geared toward gitweb
needs, and it's definitely not "less than optimal" in the sense of being
worse than the exception-based error handling Git.pm does. Trust on me
on this one. ;-)
> Errr... "yes" to first question means that 'reuse' option makes sense
> _only_ for get_bidi_pipe? If so, why it is present in other commands?
Yes, and no, it isn't present in other commands. (Hey, could you please
check the code before posting? Really.)
>>> Why name_rev, and no describe?
>> Feel free to add it. ;-) (It might take some work to come up with a
>> decent interface for that method.)
>
> Why do you _need_ name_rev, if you are not to include git-describe
> equivalent.
I needed it for gitweb. As I said, I'm not trying to create a complete
API. A describe_rev (or so) method can be added later, if and when it's
needed. (As I said, I don't think writing APIs without at least one use
case is a good idea anyway.)
>>> Should (for completeness) Git::Tag provide $tag->validate() method?
>
> I meant here equivalent of "git tag -v <tag>"
I guess it could be added. As with describe_rev, I won't add it myself,
in particular not as part of this patch series.
-- Lea
next prev parent reply other threads:[~2008-07-18 13:41 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-11 1:06 [PATCH 0/3] Git::Repo API and gitweb caching Lea Wiemann
2008-07-11 1:10 ` [PATCH 1/3 v9] gitweb: add test suite with Test::WWW::Mechanize::CGI Lea Wiemann
2008-07-11 1:11 ` [PATCH 2/3] add new Git::Repo API Lea Wiemann
2008-07-13 21:38 ` Junio C Hamano
2008-07-14 1:04 ` Lea Wiemann
2008-07-13 23:28 ` Jakub Narebski
2008-07-14 2:29 ` Lea Wiemann
2008-07-14 1:40 ` Petr Baudis
2008-07-14 22:19 ` Lea Wiemann
2008-07-18 16:48 ` Petr Baudis
2008-07-18 17:05 ` Jakub Narebski
2008-07-18 17:17 ` Petr Baudis
2008-07-18 18:09 ` Lea Wiemann
2008-07-18 18:19 ` Petr Baudis
2008-07-18 18:23 ` Johannes Schindelin
2008-07-19 20:54 ` Statictics on Git.pm usage in git commands (was: [PATCH 2/3] add new Git::Repo API) Jakub Narebski
2008-07-19 21:14 ` Petr Baudis
2008-07-20 0:16 ` Jakub Narebski
2008-07-20 21:38 ` Petr Baudis
2008-07-20 10:38 ` Johannes Schindelin
2008-07-20 10:49 ` Petr Baudis
2008-07-20 12:33 ` Johannes Schindelin
2008-07-20 12:58 ` Petr Baudis
2008-07-20 13:21 ` Johannes Schindelin
2008-07-14 23:41 ` [PATCH 2/3] add new Git::Repo API Jakub Narebski
2008-07-15 0:11 ` Lea Wiemann
2008-07-18 16:54 ` Petr Baudis
2008-07-19 0:03 ` Jakub Narebski
2008-07-19 19:07 ` Jakub Narebski
2008-07-20 21:36 ` Petr Baudis
2008-07-20 21:50 ` Jakub Narebski
2008-07-16 18:21 ` Jakub Narebski
2008-07-16 20:32 ` Lea Wiemann
2008-07-17 23:49 ` Jakub Narebski
2008-07-18 13:40 ` Lea Wiemann [this message]
2008-07-18 15:35 ` Jakub Narebski
2008-07-18 16:51 ` Lea Wiemann
2008-07-11 1:11 ` [PATCH 3/3] gitweb: use new Git::Repo API, and add optional caching Lea Wiemann
2008-07-14 21:23 ` Jakub Narebski
2008-07-14 23:03 ` Lea Wiemann
2008-07-14 23:14 ` Jakub Narebski
2008-07-14 23:56 ` Lea Wiemann
2008-07-15 0:52 ` Jakub Narebski
2008-07-15 1:16 ` Lea Wiemann
2008-07-15 1:28 ` Johannes Schindelin
2008-07-15 1:44 ` J.H.
2008-07-15 1:50 ` Lea Wiemann
2008-07-15 2:03 ` J.H.
2008-07-11 1:21 ` [PATCH 0/3] Git::Repo API and gitweb caching Johannes Schindelin
2008-07-11 9:33 ` Jakub Narebski
2008-07-11 14:07 ` Lea Wiemann
2008-07-11 16:27 ` Abhijit Menon-Sen
2008-07-12 15:08 ` Jakub Narebski
2008-07-19 5:35 ` Lea Wiemann
2008-08-18 19:34 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 1/3 v10] gitweb: add test suite with Test::WWW::Mechanize::CGI Lea Wiemann
2008-08-19 1:17 ` Junio C Hamano
2008-08-19 14:37 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 2/3 v2] add new Perl API: Git::Repo, Git::Commit, Git::Tag, and Git::RepoRoot Lea Wiemann
2008-08-19 1:32 ` Junio C Hamano
2008-08-19 15:06 ` Lea Wiemann
2008-08-19 13:51 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 3/3 v2] gitweb: use new Git::Repo API, and add optional caching Lea Wiemann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48809D31.5030008@gmail.com \
--to=lewiemann@gmail.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
--cc=pasky@suse.cz \
--cc=warthog19@eaglescrag.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).