First web interface and service API draft

* First web interface and service API draft
@ 2005-04-22 10:41 Christian Meder
  2005-04-22 11:34 ` Jon Seymour
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Christian Meder @ 2005-04-22 10:41 UTC (permalink / raw)
  To: git

Hi,

me again after a couple of hours of sleep ;-)

This probably gets a bit longer so if you are not interested in a web
service api or the web interface now is your chance to get off the
train.

I'm probably making a complete git of myself but that's not uncalled
for in this context ;-)

For those that are still with me let me start by reiterating that
I _do_ care for URIs as the primary API for web service
applications _and_ humans. I probably don't have to tell Linux people
anything about the importance of getting the API right ;-)

As it's fairly early in the web service interface cycle I'd like to
change things around a little bit and start by getting the API straight.

The following considerations should be pretty implementation agnostic
and not specific to wit. The interface should be flexible enough to be
used as a kind of web command line.

-------
/<project>

Ok. The URI should start by stating the project name
e.g. /linux-2.6. This does bloat the URI slightly but I don't think
that we want to have one root namespace per git archive in the long
run. Additionally you can always put rewriting or redirecting rules at
the root level for additional convenience when there's an obvious
default project.

Should provide some meta data, stats, etc. if available.

-------
/<project>/blob/<blob-sha1>
/<project>/commit/<commit-sha1>

These are the easy ones: the web interface should be able to spit out
the plain text data of a blob and a commit at these URIs. Users would
probably be scripts and other download tools.
Open questions:
* Blob data should probably be binary ?
* Should it be commit or changeset ? Linus seems to have changed
nomenclature in the README
* If we serve the pristine commit objects we will put the email
addresses in plain sight. If we remove or change the email addresses
it's not the original commit object anymore. Thoughts ?

-------
/<project>/tree/<tree-sha1>

Tree objects are served in binary form. Primary audience are scripts,
etc. Human beings will probably get a heart attack when they
accidentally visit this URI.

-------
/<project>/blob/<blob-sha1>.html
/<project>/commit/<commit-sha1>.html
/<project>/tree/<tree-sha1>.html

An HTML version of blob, commit and tree, fully linked, aimed at human
beings.

-------
/<project>/tree/<tree-sha1>.tar.bz2
/<project>/tree/<tree-sha1>.tar.gz
/<project>/commit/<commit-sha1>.tar.bz2
/<project>/commit/<commit-sha1>.tar.gz

Tarballs of the specified commits or trees. Note that these can be
individual subtrees too.
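
The object URIs proposed so far could be recognized with a single pattern. The following is only a sketch under assumptions: the name `parse_object_uri`, the 40-character lowercase hex form of the sha1, and the rule that blobs have no tarball form are illustrative choices, not something wit is known to implement.

```python
import re

# Hypothetical route pattern for /<project>/<type>/<sha1>[.<ext>];
# the extension list mirrors the draft (html, tar.gz, tar.bz2).
_OBJECT_URI = re.compile(
    r"/(?P<project>[^/]+)"
    r"/(?P<type>blob|commit|tree)"
    r"/(?P<sha1>[0-9a-f]{40})"
    r"(?:\.(?P<ext>html|tar\.gz|tar\.bz2))?"
)

def parse_object_uri(path):
    """Split an object URI into its parts, or return None if it does not match."""
    match = _OBJECT_URI.fullmatch(path)
    if match is None:
        return None
    parts = match.groupdict()
    # Assumption: tarballs are only proposed for trees and commits, not blobs.
    if parts["type"] == "blob" and parts["ext"] in ("tar.gz", "tar.bz2"):
        return None
    return parts
```

A path without an extension yields `ext` set to `None`, which would correspond to the plain (non-HTML) representation.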

-------
/<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>

Unified plain text recursive diff of the given trees. I guess the
user could specify any two tree ids but the relevance of the results
would vary greatly ;-)
* Possibly a DoS (denial-of-service) issue
* does something like /<project>/tree/<tree-sha1>/diff/ make sense
producing a full diff from scratch ?

-------
/<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html

Non recursive HTML view of the objects which are contained in the diff
fully linked with the individual HTML views.

-------
/<project>/blob/<blob-sha1>/diff/<ancestor-sha1>

Unified plain text diff of the given blobs.
* again /<project>/blob/<blob-sha1>/diff/ sensible ?
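
For illustration, a unified diff between two blob contents can be produced with Python's standard `difflib` module. This is a sketch that swaps in `difflib` for whatever diff tool a server would actually call; the function name `unified_blob_diff` and the use of blob ids as file labels are assumptions.

```python
import difflib

def unified_blob_diff(old_text, new_text, old_id, new_id):
    """Return a unified diff between two blob contents as a single string."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_id,  # label for the ancestor blob
        tofile=new_id,    # label for the newer blob
    )
    # unified_diff yields lines lazily; identical inputs yield nothing.
    return "".join(diff)
```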

-------
/<project>/blob/<blob-sha1>/diff/<ancestor-sha1>/html

HTML view (probably colorized) of a single blob diff.

-------
/<project>/changelog/<time-spec>

HTML changelog for the given <time-spec>. I think valid values for
timespec should be number of days <nnn>d, number of entries <nnn> and
the keyword 'all'.

* perhaps additionally number of hours <nnn>h, number of months
  <nnn>m, number of years <nnn>y. Combinations shouldn't be allowed
* time ranges are probably overkill
* is a plain text version needed /<project>/changelog/<time-spec>/plain ?
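
A minimal sketch of how the three basic `<time-spec>` forms (`<nnn>d`, `<nnn>` and `all`) might be parsed. The name `parse_time_spec` and the `(kind, value)` return shape are assumptions made for this example only.

```python
import re

# Hypothetical helper: a number, optionally followed by 'd' for days.
_TIME_SPEC = re.compile(r"([0-9]+)(d?)")

def parse_time_spec(spec):
    """Return ('all', None), ('days', n) or ('entries', n); raise ValueError otherwise."""
    if spec == "all":
        return ("all", None)
    match = _TIME_SPEC.fullmatch(spec)
    if match is None:
        # Combinations such as '3d12' are rejected, as the draft suggests.
        raise ValueError("invalid time-spec: %r" % spec)
    count, unit = match.groups()
    return ("days" if unit == "d" else "entries", int(count))
```

Extra units such as hours or months would only need a wider unit group in the pattern.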

-------
/<project>/changelog/<time-spec>/search/<regexp>

HTML changelog for the given <time-spec> filtered by the <regexp>.

* again plain version needed ?

------
/<project>/changelog/<time-spec>/search/author/<regexp>
/<project>/changelog/<time-spec>/search/committer/<regexp>
/<project>/changelog/<time-spec>/search/signedoffby/<regexp>

convenience wrappers for generic search restricted to these fields.

------

open questions:
* how to generate and publish additional merge information ?
* how to generate and publish tree and blob history information ? This
is probably expensive with git.
* how to represent branches ? should we code up the branches in the
project id like linux-2.6-mm or whatever ?

Comments ? Ideas ? Other feedback ?

				Christian

-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: First web interface and service API draft
  2005-04-22 10:41 First web interface and service API draft Christian Meder
@ 2005-04-22 11:34 ` Jon Seymour
  2005-04-22 12:10   ` Petr Baudis
  2005-04-22 12:10 ` Petr Baudis
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Jon Seymour @ 2005-04-22 11:34 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

On 4/22/05, Christian Meder <chris@absolutegiganten.org> wrote:
>
> Comments ? Ideas ? Other feedback ?
> 

I'd suggest serving XML rather than HTML and using client side XSLT to
transform it into HTML. Client-side XSLT works well in IE 6 and all
versions of Firefox, so there is no question that it is a mature
technology. Provide a fall back via server transformed HTML if need
be, but that is trivial to do once you have the client-side XSLT
stylesheets.

Serving XML is as easy as serving HTML and gives you a much more
flexible outcome.
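
As a rough sketch of what that could look like on the server side, the following builds an XML document that points the browser at an XSLT stylesheet through an `xml-stylesheet` processing instruction. The element names (`commit`, `author`, `message`), the stylesheet name `commit.xsl` and the function `commit_to_xml` are all hypothetical; no XML content model has been agreed on in this thread.

```python
from xml.sax.saxutils import escape, quoteattr

def commit_to_xml(sha1, author, message, stylesheet="commit.xsl"):
    """Render a commit as XML with a client-side XSLT stylesheet reference."""
    # quoteattr() adds the surrounding quotes; escape() handles <, > and &.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<?xml-stylesheet type="text/xsl" href=%s?>\n'
        '<commit sha1=%s>\n'
        '  <author>%s</author>\n'
        '  <message>%s</message>\n'
        '</commit>\n'
    ) % (quoteattr(stylesheet), quoteattr(sha1), escape(author), escape(message))
```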

jon.

* Re: First web interface and service API draft
  2005-04-22 10:41 First web interface and service API draft Christian Meder
  2005-04-22 11:34 ` Jon Seymour
@ 2005-04-22 12:10 ` Petr Baudis
       [not found]   ` <1114176579.3233.42.camel@localhost>
  2005-04-22 12:37 ` El Draper
  2005-04-22 14:23 ` Jan Harkes
  3 siblings, 1 reply; 16+ messages in thread
From: Petr Baudis @ 2005-04-22 12:10 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

Dear diary, on Fri, Apr 22, 2005 at 12:41:56PM CEST, I got a letter
where Christian Meder <chris@absolutegiganten.org> told me that...
> Hi,

Hi,

> /<project>
> 
> Ok. The URI should start by stating the project name
> e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> that we want to have one root namespace per git archive in the long
> run. Additionally you can always put rewriting or redirecting rules at
> the root level for additional convenience when there's an obvious
> default project.
> 
> Should provide some meta data, stats, etc. if available.

I don't think this makes much sense. I think you should just apply -p1
to all the directories, and define that there should be some / page
which should contain some metadata regarding the repository you are
accessing (probably branches, tags, and such).

> -------
> /<project>/blob/<blob-sha1>
> /<project>/commit/<commit-sha1>
> 
> These are the easy ones: the web interface should be able to spit out
> the plain text data of a blob and a commit at these URIs. Users would
> probably be scripts and other download tools.
> Open questions:
> * Blob data should probably be binary ?

What do you mean by binary?

> * Should it be commit or changeset ? Linus seems to have changed
> nomenclature in the README

We call it commit everywhere but in the README. :-)

The "changeset" name is bad anyway. It is a commit of a complete tree
state, diff against one of its parent commits is the set of changes.

> -------
> /<project>/tree/<tree-sha1>
> 
> Tree objects are served in binary form. Primary audience are scripts,
> etc. Human beings will probably get a heart attack when they
> accidentally visit this URI.

Binary form is unusable for scripts.

Anything wrong with putting ls-tree output there?

We should also have /gitobj/<sha1> for fetching the raw git objects.

> -------
> /<project>/blob/<blob-sha1>.html
> /<project>/commit/<commit-sha1>.html
> /<project>/tree/<tree-sha1>.html
> 
> An HTML version of blob, commit and tree, fully linked, aimed at human
> beings.

How can I imagine an "HTML version of blob"?

> -------
> /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> 
> Non recursive HTML view of the objects which are contained in the diff
> fully linked with the individual HTML views.

Why not .html?

> -------
> /<project>/changelog/<time-spec>

I'd personally prefer /log/, but whatever.

For consistency, I'd stay with the plaintext output by default, .html if
requested.

And I think abusing directories for this is bad. Query string seems much
more appropriate, since this is something that changes dynamically a
lot, not a permanent resource identifier.

OTOH, I'd use

	/log/<commit>

to specify what commit to start at. It just does not make sense
otherwise, you would not know where to start.

I think the <commit> should follow the same or similar rules as Cogito
id decoding. E.g. to get latest Linus' changelog, you'd do

	/log/linus

> -------
> /<project>/changelog/<time-spec>/search/<regexp>
> 
> HTML changelog for the given <time-spec> filtered by the <regexp>.
> 
> * again plain version needed ?
> 
> ------
> /<project>/changelog/<time-spec>/search/author/<regexp>
> /<project>/changelog/<time-spec>/search/committer/<regexp>
> /<project>/changelog/<time-spec>/search/signedoffby/<regexp>
> 
> convenience wrappers for generic search restricted to these fields.

Same here. Just ?author=...&committer=...&signedoffby=... etc. You can
even combine several criteria.
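
A sketch of how such a query string could be read on the server with Python's `urllib.parse`. The helper name `search_criteria` is an assumption, as is the choice to keep only the first value when a field is repeated.

```python
from urllib.parse import urlsplit, parse_qs

# Field names follow the criteria mentioned above; anything else is ignored.
SEARCH_FIELDS = ("author", "committer", "signedoffby")

def search_criteria(url):
    """Collect the search criteria given in a URL's query string."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; keep the first value of each known field.
    return {field: query[field][0] for field in SEARCH_FIELDS if field in query}
```

Several criteria combine naturally, since each one is just another key in the returned dictionary.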

> ------
> 
> open questions:
> * how to generate and publish additional merge information ?

I don't understand....

> * how to generate and publish tree and blob history information ? This
> is probably expensive with git.

...this either.

> * how to represent branches ? should we code up the branches in the
> project id like linux-2.6-mm or whatever ?

See above.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: First web interface and service API draft
  2005-04-22 11:34 ` Jon Seymour
@ 2005-04-22 12:10   ` Petr Baudis
  2005-04-22 12:27     ` Jon Seymour
  2005-04-22 13:30     ` Christian Meder
  0 siblings, 2 replies; 16+ messages in thread
From: Petr Baudis @ 2005-04-22 12:10 UTC (permalink / raw)
  To: Jon Seymour; +Cc: Christian Meder, git

Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
where Jon Seymour <jon.seymour@gmail.com> told me that...
> On 4/22/05, Christian Meder <chris@absolutegiganten.org> wrote:
> >
> > Comments ? Ideas ? Other feedback ?
> > 
> 
> I'd suggest serving XML rather than HTML and using client side XSLT to
> transform it into HTML. Client-side XSLT works well in IE 6 and all
> versions of Firefox, so there is no question that it is a mature
> technology. Provide a fall back via server transformed HTML if need
> be, but that is trivial to do once you have the client-side XSLT
> stylesheets.
> 
> Serving XML is as easy as serving HTML and gives you a much more
> flexible outcome.

Why "rather than"? Why not "in addition to"?

You just append either .html or .xml, based on what you want.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: First web interface and service API draft
  2005-04-22 12:10   ` Petr Baudis
@ 2005-04-22 12:27     ` Jon Seymour
  2005-04-22 13:32       ` Christian Meder
  2005-04-22 13:30     ` Christian Meder
  1 sibling, 1 reply; 16+ messages in thread
From: Jon Seymour @ 2005-04-22 12:27 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Christian Meder, git

On 4/22/05, Petr Baudis <pasky@ucw.cz> wrote:
> Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
> where Jon Seymour <jon.seymour@gmail.com> told me that...
> > On 4/22/05, Christian Meder <chris@absolutegiganten.org> wrote:
> > >
> > > Comments ? Ideas ? Other feedback ?
> > >
> >
> > I'd suggest serving XML rather than HTML and using client side XSLT to
> > transform it into HTML. ...
> 
> Why "rather than"? Why not "in addition to"?
> 
> You just append either .html or .xml, based on what you want.
> 

You are right - there is no good reason that an implementation should
not support both.

From the point of view of a specification, though, I think it would be
useful to focus on an XML content model rather than the details of one
particular HTML model - get the XML model right and you can do
whatever you like with the HTML model at any time after that.

jon.

-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

* Re: First web interface and service API draft
  2005-04-22 10:41 First web interface and service API draft Christian Meder
  2005-04-22 11:34 ` Jon Seymour
  2005-04-22 12:10 ` Petr Baudis
@ 2005-04-22 12:37 ` El Draper
  2005-04-22 13:44   ` Christian Meder
  2005-04-22 14:23 ` Jan Harkes
  3 siblings, 1 reply; 16+ messages in thread
From: El Draper @ 2005-04-22 12:37 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

Christian Meder wrote:

>Comments ? Ideas ? Other feedback ?
>
>  
>

Hi guys,

New around these parts, so be gentle :-)

I would like to suggest the idea of a SOAP interface. If we are talking 
about a true service orientated API, then a way of calling a uri and 
having it return a nice SOAP packet with the return data in it would be 
great. If we ensured compliance with web service standards, then it 
would then mean anyone could write themselves a client desktop based 
program, a web interface, or any utility command line tools (in Java, 
.net, whatever they want, and for whatever platform), that could 
communicate with the web service and retrieve relevant data. You'd then 
have a true service interface into a Git repository. Seeing as how the 
idea of returning XML has already come up, I don't think it would be a 
stretch to extend the web interface to returning web service compliant 
SOAP packets in order to return data.

Regards,
-= El =-

* Re: First web interface and service API draft
  2005-04-22 12:10   ` Petr Baudis
  2005-04-22 12:27     ` Jon Seymour
@ 2005-04-22 13:30     ` Christian Meder
  1 sibling, 0 replies; 16+ messages in thread
From: Christian Meder @ 2005-04-22 13:30 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jon Seymour, git

On Fri, 2005-04-22 at 14:10 +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
> where Jon Seymour <jon.seymour@gmail.com> told me that...
> > On 4/22/05, Christian Meder <chris@absolutegiganten.org> wrote:
> > >
> > > Comments ? Ideas ? Other feedback ?
> > > 
> > 
> > I'd suggest serving XML rather than HTML and using client side XSLT to
> > transform it into HTML. Client-side XSLT works well in IE 6 and all
> > versions of Firefox, so there is no question that it is a mature
> > technology. Provide a fall back via server transformed HTML if need
> > be, but that is trivial to do once you have the client-side XSLT
> > stylesheets.
> > 
> > Serving XML is as easy as serving HTML and gives you a much more
> > flexible outcome.
> 
> Why "rather than"? Why not "in addition to"?
> 
> You just append either .html or .xml, based on what you want.

I agree with Petr. I think we should do both.



				Christian
-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


* Re: First web interface and service API draft
  2005-04-22 12:27     ` Jon Seymour
@ 2005-04-22 13:32       ` Christian Meder
  0 siblings, 0 replies; 16+ messages in thread
From: Christian Meder @ 2005-04-22 13:32 UTC (permalink / raw)
  To: jon; +Cc: Petr Baudis, git

On Fri, 2005-04-22 at 22:27 +1000, Jon Seymour wrote:
> On 4/22/05, Petr Baudis <pasky@ucw.cz> wrote:
> > Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
> > where Jon Seymour <jon.seymour@gmail.com> told me that...
> > > On 4/22/05, Christian Meder <chris@absolutegiganten.org> wrote:
> > > >
> > > > Comments ? Ideas ? Other feedback ?
> > > >
> > >
> > > I'd suggest serving XML rather than HTML and using client side XSLT to
> > > transform it into HTML. ...
> > 
> > Why "rather than"? Why not "in addition to"?
> > 
> > You just append either .html or .xml, based on what you want.
> > 
> 
> You are right - there is no good reason that an implementation should
> not support both.
> 
> From the point of view of a specification, though, I think it would be
> useful to focus on an XML content model rather than the details of one
> particular HTML model - get the XML model right and you can do
> whatever you like with the HTML model at any time after that.

Actually I think the order is get the C content model right (done), get
the Python object model right (in flux), produce an appropriate XML
model.


				Christian
-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


* Re: First web interface and service API draft
  2005-04-22 12:37 ` El Draper
@ 2005-04-22 13:44   ` Christian Meder
  2005-04-22 13:47     ` Jon Seymour
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Meder @ 2005-04-22 13:44 UTC (permalink / raw)
  To: El Draper; +Cc: git

On Fri, 2005-04-22 at 13:37 +0100, El Draper wrote:
> Christian Meder wrote:
> 
> >Comments ? Ideas ? Other feedback ?
> >
> >  
> >
> 
> Hi guys,
> 
> New around these parts, so be gentle :-)
> 
> I would like to suggest the idea of a SOAP interface. If we are talking 
> about a true service orientated API, then a way of calling a uri and 
> having it return a nice SOAP packet with the return data in it would be 
> great. If we ensured compliance with web service standards, then it 
> would then mean anyone could write themselves a client desktop based 
> program, a web interface, or any utility command line tools (in Java, 
> .net, whatever they want, and for whatever platform), that could 
> communicate with the web service and retrieve relevant data. You'd then 
> have a true service interface into a Git repository. Seeing as how the 
> idea of returning XML has already come up, I don't think it would be a 
> stretch to extend the web interface to returning web service compliant 
> SOAP packets in order to return data.

Ok, I should've known we'd get into this, being a Web Java guy by
profession ;-)

Right now I'd like to concentrate more on a RESTful approach
http://www.xfront.com/REST-Web-Services.html

I'm concentrating on getting a clean and simple API for mere mortals and
developers alike. SOAP is likely further down on my list. But I
certainly will take patches ;-)



			Christian


-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


* Re: First web interface and service API draft
  2005-04-22 13:44   ` Christian Meder
@ 2005-04-22 13:47     ` Jon Seymour
  0 siblings, 0 replies; 16+ messages in thread
From: Jon Seymour @ 2005-04-22 13:47 UTC (permalink / raw)
  To: El Draper, git

> >
> > From the point of view of a specification, though, I think it would be
> > useful to focus on an XML content model rather than the details of one
> > particular HTML model - get the XML model right and you can do
> > whatever you like with the HTML model at any time after that.
>
> Actually I think the order is get the C content model right (done), get
> the Python object model right (in flux), produce an appropriate XML
> model.

Mmm.. I am not sure that a Python model is logically a pre-requisite
to the XML model nor that the ideal C API model is complete - we still
don't have a libgit, for example.  For an XML model we can get by
pretty well with the data model as it is - and an XML model really
shouldn't be dependent on any particular API or programming language.

Certainly, though, an XML model isn't a pre-requisite to a Python
model. Though it might be a pre-req to a SOAP model :-).

jon.

* Re: First web interface and service API draft
  2005-04-22 10:41 First web interface and service API draft Christian Meder
                   ` (2 preceding siblings ...)
  2005-04-22 12:37 ` El Draper
@ 2005-04-22 14:23 ` Jan Harkes
  2005-04-22 20:57   ` Christian Meder
  2005-04-22 22:45   ` Petr Baudis
  3 siblings, 2 replies; 16+ messages in thread
From: Jan Harkes @ 2005-04-22 14:23 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

On Fri, Apr 22, 2005 at 12:41:56PM +0200, Christian Meder wrote:
> -------
> /<project>/blob/<blob-sha1>
> /<project>/commit/<commit-sha1>

It is trivial to find an object when given a sha, but to know the object
type you'd have to decompress it and check inside. Also the way git
stores these things you can't have both a blob and a commit with the
same sha anyways.
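
To make the "decompress it and check inside" step concrete, here is a sketch that reads the type out of a loose object. It assumes the layout used by present-day git, where the whole file is zlib-deflated and starts with a `<type> <size>` header terminated by a NUL byte; the on-disk format at the time of this thread may have differed, and the name `object_type` is made up for the example.

```python
import zlib

def object_type(compressed):
    """Return (type, size) read from the header of a zlib-compressed loose object."""
    raw = zlib.decompress(compressed)
    # The header ends at the first NUL byte; the object content follows it.
    header, _, _body = raw.partition(b"\0")
    kind, _, size = header.partition(b" ")
    return kind.decode("ascii"), int(size)
```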

So why not use,
    /<project>/<hexadecimal sha1 representation>
	will give you the raw object.

    /<project>/<hexadecimal sha1 representation>.html (.xml/.txt)
	will give you a parsed version for user presentation

And since hexadecimal numbers only have [0-9a-f] as valid characters,
you can still have additional directories that can be guaranteed unique
as long as the first two characters are not a valid hexadecimal value.
So things like /branch/linus, or /changelog/, /log/, /diff/. Yeah, you
can't use /delta/ without looking at more than the first two characters,
but that's where dictionaries can come in handy.
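
A sketch of the dictionary idea, assuming a sha1 is always written as 40 lowercase hex characters. Matching the full length rather than the first two characters lets a name like `delta` coexist with object ids. The set of reserved names and the function `classify_segment` are illustrative only.

```python
import re

_SHA1 = re.compile(r"[0-9a-f]{40}")

# Hypothetical set of reserved directory names under /<project>/.
RESERVED = {"branch", "changelog", "log", "diff", "delta"}

def classify_segment(segment):
    """Classify a path segment as 'object', 'directory' or 'unknown'."""
    # Drop a presentation suffix such as .html, .xml or .txt before matching.
    name = segment.split(".", 1)[0]
    if _SHA1.fullmatch(name):
        return "object"
    if name in RESERVED:
        return "directory"
    return "unknown"
```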

Jan

* Re: First web interface and service API draft
  2005-04-22 14:23 ` Jan Harkes
@ 2005-04-22 20:57   ` Christian Meder
  2005-04-23  6:39     ` Jon Seymour
  2005-04-22 22:45   ` Petr Baudis
  1 sibling, 1 reply; 16+ messages in thread
From: Christian Meder @ 2005-04-22 20:57 UTC (permalink / raw)
  To: Jan Harkes; +Cc: git

On Fri, 2005-04-22 at 10:23 -0400, Jan Harkes wrote:
> On Fri, Apr 22, 2005 at 12:41:56PM +0200, Christian Meder wrote:
> > -------
> > /<project>/blob/<blob-sha1>
> > /<project>/commit/<commit-sha1>
> 
> It is trivial to find an object when given a sha, but to know the object
> type you'd have to decompress it and check inside. Also the way git
> stores these things you can't have both a blob and a commit with the
> same sha anyways.
> 
> So why not use,
>     /<project>/<hexadecimal sha1 representation>
> 	will give you the raw object.
> 
>     /<project>/<hexadecimal sha1 representation>.html (.xml/.txt)
> 	will give you a parsed version for user presentation
> 
> And since hexadecimal numbers only have [0-9a-f] as valid characters,
> you can still have additional directories that can be guaranteed unique
> as long as the first two characters are not a valid hexadecimal value.
> So things like /branch/linus, or /changelog/, /log/, /diff/. Yeah, you
> can't use /delta/ without looking at more than the first two characters,
> but that's where dictionaries can come in handy.

Hmm. I'm not sure about throwing away the <objecttype> information in
the url. I think I'd prefer to retain the blob, tree and commit
namespaces because I think they help API users to explicitly state what
kind of object they expect. I can't think of a scenario where I'd want a
<sha1> of unknown type. Do you have a specific use case in mind ?



				Christian     
-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


* Re: First web interface and service API draft
  2005-04-22 14:23 ` Jan Harkes
  2005-04-22 20:57   ` Christian Meder
@ 2005-04-22 22:45   ` Petr Baudis
  1 sibling, 0 replies; 16+ messages in thread
From: Petr Baudis @ 2005-04-22 22:45 UTC (permalink / raw)
  To: Christian Meder, git

Dear diary, on Fri, Apr 22, 2005 at 04:23:42PM CEST, I got a letter
where Jan Harkes <jaharkes@cs.cmu.edu> told me that...
> On Fri, Apr 22, 2005 at 12:41:56PM +0200, Christian Meder wrote:
> > -------
> > /<project>/blob/<blob-sha1>
> > /<project>/commit/<commit-sha1>
> 
> It is trivial to find an object when given a sha, but to know the object
> type you'd have to decompress it and check inside. Also the way git
> stores these things you can't have both a blob and a commit with the
> same sha anyways.
> 
> So why not use,
>     /<project>/<hexadecimal sha1 representation>
> 	will give you the raw object.
> 
>     /<project>/<hexadecimal sha1 representation>.html (.xml/.txt)
> 	will give you a parsed version for user presentation

Because this gives you more type control, and type control is good where
it makes sense (i.e. the types are completely orthogonal). It makes
sense here - either you _know_ your sha1 is of a given type, or you don't
know at all what you are doing.

Also, when looking at a URI, you can immediately say what type of
object it points at. I actually consider that a pretty important
property.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: First web interface and service API draft
       [not found]   ` <1114176579.3233.42.camel@localhost>
@ 2005-04-22 22:57     ` Petr Baudis
  2005-04-24 22:29       ` Christian Meder
  0 siblings, 1 reply; 16+ messages in thread
From: Petr Baudis @ 2005-04-22 22:57 UTC (permalink / raw)
  To: Christian Meder; +Cc: git

Dear diary, on Fri, Apr 22, 2005 at 03:29:39PM CEST, I got a letter
where Christian Meder <chris@absolutegiganten.org> told me that...
> > > /<project>
> > > 
> > > Ok. The URI should start by stating the project name
> > > e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> > > that we want to have one root namespace per git archive in the long
> > > run. Additionally you can always put rewriting or redirecting rules at
> > > the root level for additional convenience when there's an obvious
> > > default project.
> > > 
> > > Should provide some meta data, stats, etc. if available.
> > 
> > I don't think this makes much sense. I think you should just apply -p1
> > to all the directories, and define that there should be some / page
> > which should contain some metadata regarding the repository you are
> > accessing (probably branches, tags, and such).
> 
> Hi,

Hi,

> remember that I want to stay stateless as long as possible so everything
> important has to be encoded in the url. So somewhere in the url the git
> archive to show has to be encoded. If I remove the <project> portion how
> do I know on the server side which repo to show ?

You know because you are configured appropriately.

You need to be anyway. Someone needs to tell you or your web server
"this lives at http://pasky.or.cz/wit/". So you bind "this" to the
given repository.

No problem with an additional configuration possibility to say "at that
place, clone your life place for the given repositories", but if I want
to have just a single repository at a given URL, it should be possible.

I'm just trying to argue that having it _forced_ to have <project> as
the part of the URL is useless; this is matter of configuration.

> > > * Blob data should probably be binary ?
> > 
> > What do you mean by binary?
> 
> content-type: application/octet-stream

Ah. So just as-is, you mean?

> > Anything wrong with putting ls-tree output there?
> 
> ls-tree output should be in .html (see below)

What if I actually want to process it by a script?

> > > -------
> > > /<project>/tree/<tree-sha1>
> > > 
> > > Tree objects are served in binary form. Primary audience are scripts,
> > > etc. Human beings will probably get a heart attack when they
> > > accidentally visit this URI.
> > 
> > Binary form is unusable for scripts.
> 
> Why should it be unusable for a downloading script. It's just the raw
> git object.
> 
> > We should also have /gitobj/<sha1> for fetching the raw git objects.
> 
> Everything above is supposed to be raw git objects. No special encoding
> whatever.

You have a consistency problem here.

Raw git objects as in database contain the leading object type at the
start, then possibly some more stuff, then '\0' and then compressed
binary stuff. You mean you are exporting _this_ stuff through this?

That's not very useful except for http-pull, if you ask me. It also does
not blend well with the fact that you say commits are in text or so.

> > > -------
> > > /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> > > 
> > > Non recursive HTML view of the objects which are contained in the diff
> > > fully linked with the individual HTML views.
> > 
> > Why not .html?
> 
> I think .html isn't very clear because it would
> be ..../<ancestor-tree-sha1>.html which somehow looks like it has
> anything to do with the ancestor-tree. But it's the html version of the
> _diff_ and not the ancestor-tree.

Perhaps /tree/<sha1>.html/diff/<ancestor> ?

I'd lean towards ?diff=<ancestor> more and more. The path part of a URI
is there to express _hierarchy_; I think you are abusing that when there
is no hierarchy.

> > For consistency, I'd stay with the plaintext output by default, .html if
> > requested.
> 
> Remember that I'm just sitting on top of git and not git-pasky right
> now. So there's no canonical changelog plaintext output for me. But I'm
> not religious about that.

But there is canonical HTML output for you? ;-)

> > OTOH, I'd use
> > 
> > 	/log/<commit>
> > 
> > to specify what commit to start at. It just does not make sense
> > otherwise, you would not know where to start
> 
> Start for the changelog is always head, but I guess that's pretty
> standard. With git log you always start at the head too.

If you are sitting on top of git and not git-pasky, you have no assured
HEAD information at all.

> If you want to start at a specific commit. Why not start
> at /linux-2.6/commit/<sha1>.html ?

And how does that give me the changelog?

> > I think the <commit> should follow the same or similar rules as Cogito
> > id decoding. E.g. to get latest Linus' changelog, you'd do
> > 
> > 	/log/linus
> 
> Like I said above I think the shown head should be encoded in the
> project id.

I thought the project was mapped to repository? But I might just have
blindly assumed that. ;-) (That does not make me like your approach
more, though.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: First web interface and service API draft
  2005-04-22 20:57   ` Christian Meder
@ 2005-04-23  6:39     ` Jon Seymour
  0 siblings, 0 replies; 16+ messages in thread
From: Jon Seymour @ 2005-04-23  6:39 UTC (permalink / raw)
  To: Christian Meder; +Cc: Jan Harkes, git

On 4/23/05, Christian Meder <chris@absolutegiganten.org> wrote:
> On Fri, 2005-04-22 at 10:23 -0400, Jan Harkes wrote:
> > On Fri, Apr 22, 2005 at 12:41:56PM +0200, Christian Meder wrote:
> > > -------
> > > /<project>/blob/<blob-sha1>
> > > /<project>/commit/<commit-sha1>
> >
> > It is trivial to find an object when given a sha, but to know the object
> > type you'd have to decompress it and check inside. Also the way git
> > stores these things you can't have both a blob and a commit with the
> > same sha anyways.
> >
> > So why not use,
> >     /<project>/<hexadecimal sha1 representation>
> >       will give you the raw object.
> 
> Hmm. I'm not sure about throwing away the <objecttype> information in
> the url. I think I'd prefer to retain the blob, tree and commit
> namespaces because I think they help API users to explicitly state what
> kind of object they expect. I can't think of a scenario where I'd want a
> <sha1> of unknown type. Do you have a specific use case in mind ?
> 

I was initially inclined to agree with Jan, but on brief reflection I
think Christian is correct to want to preserve the type info in the
URI. There are numerous reasons why this is a good idea:

- both carbon and silicon users of the URI who don't have direct
access to the repository can infer what the URI refers to without
actually fetching it

- programmatically the web server can make request routing decisions
based on the URI alone and is not forced to perform a relatively
expensive and unnecessary db hit to derive the type.

That said, I can see some value in providing a web-based
type-resolution service.

So, given a URI of the form

     /<project>/object/<hexadecimal sha1 representation>

the server should resolve the type of the named object and issue an
HTTP re-direct to the typed URI, e.g.

     /<project>/blob/<hexadecimal sha1 representation>

Because browsers tend not to remember redirection sources, external
entities end up recording the typed URIs, but all the benefits of
Jan's suggestion still accrue.
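
A minimal Python sketch of that resolution step, under stated
assumptions: the function name, the lookup_type callback (which stands
in for whatever the server uses to read the object type from the
repository) and the choice of status 302 are all hypothetical and not
part of any existing implementation.

```python
VALID_TYPES = {"blob", "tree", "commit"}


def resolve_object_uri(path, lookup_type):
    """Map /<project>/object/<sha1> to a redirect to /<project>/<type>/<sha1>.

    lookup_type(sha1) is assumed to return "blob", "tree", "commit",
    or None when the object is unknown.
    Returns (status_code, headers).
    """
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "object":
        return 404, {}
    project, _, sha1 = parts
    obj_type = lookup_type(sha1)
    if obj_type not in VALID_TYPES:
        return 404, {}
    # Only this untyped entry point pays for the type lookup; the typed
    # URI it redirects to can be routed on the path alone.
    return 302, {"Location": f"/{project}/{obj_type}/{sha1}"}
```

For example, with a dictionary standing in for the repository,
resolve_object_uri("/linux-2.6/object/abc", {"abc": "blob"}.get) yields a
redirect to /linux-2.6/blob/abc.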

jon.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

* Re: First web interface and service API draft
  2005-04-22 22:57     ` Petr Baudis
@ 2005-04-24 22:29       ` Christian Meder
  0 siblings, 0 replies; 16+ messages in thread
From: Christian Meder @ 2005-04-24 22:29 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Sat, 2005-04-23 at 00:57 +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 22, 2005 at 03:29:39PM CEST, I got a letter
> where Christian Meder <chris@absolutegiganten.org> told me that...
> > > > /<project>
> > > > 
> > > > Ok. The URI should start by stating the project name
> > > > e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> > > > that we want to have one root namespace per git archive in the long
> > > > run. Additionally you can always put rewriting or redirecting rules at
> > > > the root level for additional convenience when there's an obvious
> > > > default project.
> > > > 
> > > > Should provide some meta data, stats, etc. if available.
> > > 
> > > I don't think this makes much sense. I think you should just apply -p1
> > > to all the directories, and define that there should be some / page
> > > which should contain some metadata regarding the repository you are
> > > accessing (probably branches, tags, and such).
> > 
> > Hi,
> 
> Hi,
> 
> > remember that I want to stay stateless as long as possible so everything
> > important has to be encoded in the url. So somewhere in the url the git
> > archive to show has to be encoded. If I remove the <project> portion how
> > do I know on the server side which repo to show ?
> 
> since you are configured appropriately.
> 
> You need to be anyway. Someone needs to tell you or your web server
> "this lives at http://pasky.or.cz/wit/". So you bind "this" to the
> given repository.
> 
> No problem with an additional configuration possibility to say "at that
> place, clone your life place for the given repositories", but if I want
> to have just a single repository at a given URL, it should be possible.
> 
> I'm just trying to argue that _forcing_ <project> to be part of the URL
> is useless; this is a matter of configuration.

Ok. Got it: <project> is needed for a multi-repo setup, and in the simple
case of just one repo it can be dropped from the URL. Reasonable.
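
Roughly, as a Python sketch. This assumes a configuration that is either
a table of project names or one fixed repository; split_project, repos
and single_repo are hypothetical names, not wit's actual code.

```python
def split_project(path, repos, single_repo=None):
    """Return (repository_dir, remaining_path_segments) for a request path.

    repos maps project names to repository directories (multi-repo setup).
    single_repo, if set, is the one repository bound to this URL prefix.
    """
    segments = [s for s in path.split("/") if s]
    if single_repo is not None:
        # Single-repository setup: no <project> segment in the URL.
        return single_repo, segments
    if not segments or segments[0] not in repos:
        raise KeyError("unknown project")
    # Multi-repository setup: the first segment selects the repository.
    return repos[segments[0]], segments[1:]
```

Either way the server stays stateless: the repository is determined by
the URL plus static configuration.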

> > > > * Blob data should be probably binary ?
> > > 
> > > What do you mean by binary?
> > 
> > content-type: binary/octet-stream
> 
> Ah. So just as-is, you mean?

Yes.

> 
> > > Anything wrong with putting ls-tree output there?
> > 
> > ls-tree output should be in .html (see below)
> 
> What if I actually want to process it by a script?

Use the .html variant and parse it. Or we add a .txt and/or .xml for
easier parsing.

> 
> > > > -------
> > > > /<project>/tree/<tree-sha1>
> > > > 
> > > > Tree objects are served in binary form. Primary audience are scripts,
> > > > etc. Human beings will probably get a heart attack when they
> > > > accidentally visit this URI.
> > > 
> > > Binary form is unusable for scripts.
> > 
> > Why should it be unusable for a downloading script. It's just the raw
> > git object.
> > 
> > > We should also have /gitobj/<sha1> for fetching the raw git objects.
> > 
> > Everything above is supposed to be raw git objects. No special encoding
> > whatever.
> 
> You have a consistency problem here.
> 
> Raw git objects as in database contain the leading object type at the
> start, then possibly some more stuff, then '\0' and then compressed
> binary stuff. You mean you are exporting _this_ stuff through this?
> 
> That's not very useful except for http-pull, if you ask me. It also does
> not blend well with the fact that you say commits are in text or so.

Ok. We were talking about two different things. By raw objects I meant
the uncompressed raw content, while you meant the raw compressed git
objects. Ok, I'm dumb, but now that I've understood what you said I agree
with you: we need one generic URL for fetching compressed objects.

> 
> > > > -------
> > > > /<project>/tree/<tree-sha1>/diff/<ancestor-tree-sha1>/html
> > > > 
> > > > Non recursive HTML view of the objects which are contained in the diff
> > > > fully linked with the individual HTML views.
> > > 
> > > Why not .html?
> > 
> > I think .html isn't very clear because it would
> > be ..../<ancestor-tree-sha1>.html which somehow looks like it has
> > anything to do with the ancestor-tree. But it's the html version of the
> > _diff_ and not the ancestor-tree.
> 
> Perhaps /tree/<sha1>.html/diff/<ancestor> ?
> 
> I'd lean towards ?diff=<ancestor> more and more. The path part of a URI
> is there to express _hierarchy_; I think you are abusing that when there
> is no hierarchy.

But I'd argue that you are abusing queries ;-)
After all any given URI of the above kind is linking a specific diff
resource. It's a completely static resource from a user POV. The fact
that the server is probably dynamically generating it is just an
implementation detail.
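
For what it's worth, both spellings carry the same two pieces of
information, so a server could accept either. A small Python sketch; the
function name is made up, and the query form
/<project>/tree/<sha1>.html?diff=<ancestor> is only my reading of your
suggestion, not something either of us has specified.

```python
from urllib.parse import parse_qs, urlsplit


def diff_request(uri):
    """Return (tree_sha1, ancestor_sha1) for either URI style, else None."""
    parts = urlsplit(uri)
    seg = [s for s in parts.path.split("/") if s]
    # Path style: /<project>/tree/<tree>/diff/<ancestor>/html
    if len(seg) == 6 and seg[1] == "tree" and seg[3] == "diff" and seg[5] == "html":
        return seg[2], seg[4]
    # Query style (assumed): /<project>/tree/<tree>.html?diff=<ancestor>
    query = parse_qs(parts.query)
    if (len(seg) == 3 and seg[1] == "tree"
            and seg[2].endswith(".html") and "diff" in query):
        return seg[2][:-len(".html")], query["diff"][0]
    return None
```

So the disagreement is about which form we want people to link to, not
about what the server is able to answer.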

> 
> > > For consistency, I'd stay with the plaintext output by default, .html if
> > > requested.
> > 
> > Remember that I'm just sitting on top of git and not git-pasky right
> > now. So there's no canonical changelog plaintext output for me. But I'm
> > not religious about that.
> 
> But there is canonical HTML output for you? ;-)

No. Changelog isn't defined by git so there's no canonical output of any
flavour.

> > > OTOH, I'd use
> > > 
> > > 	/log/<commit>
> > > 
> > > to specify what commit to start at. It just does not make sense
> > > otherwise, you would not know where to start
> > 
> > Start for the changelog is always head, but I guess that's pretty
> > standard. With git log you always start at the head too.
> 
> If you are sitting on top of git and not git-pasky, you have no assured
> HEAD information at all.

I've got HEAD. I'm still watching the discussion of tags.

> > If you want to start at a specific commit. Why not start
> > at /linux-2.6/commit/<sha1>.html ?
> 
> And how does that give me the changelog?

You could click through the commit chain interactively, or we could add a
"changelog from here" function.
 
> > > I think the <commit> should follow the same or similar rules as Cogito
> > > id decoding. E.g. to get latest Linus' changelog, you'd do
> > > 
> > > 	/log/linus
> > 
> > Like I said above I think the shown head should be encoded in the
> > project id.
> 
> I thought the project was mapped to repository? But I might just have
> blindly assumed that. ;-) (That does not make me like your approach
> more, though.)

Ok. I think I misunderstood you here. You want to publish the different
heads you are tracking with the same repo, right ?

The proposal didn't account for this scenario yet. I'll think about it.



				Christian

-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


end of thread, other threads:[~2005-04-24 22:24 UTC | newest]

Thread overview: 16+ messages
2005-04-22 10:41 First web interface and service API draft Christian Meder
2005-04-22 11:34 ` Jon Seymour
2005-04-22 12:10   ` Petr Baudis
2005-04-22 12:27     ` Jon Seymour
2005-04-22 13:32       ` Christian Meder
2005-04-22 13:30     ` Christian Meder
2005-04-22 12:10 ` Petr Baudis
     [not found]   ` <1114176579.3233.42.camel@localhost>
2005-04-22 22:57     ` Petr Baudis
2005-04-24 22:29       ` Christian Meder
2005-04-22 12:37 ` El Draper
2005-04-22 13:44   ` Christian Meder
2005-04-22 13:47     ` Jon Seymour
2005-04-22 14:23 ` Jan Harkes
2005-04-22 20:57   ` Christian Meder
2005-04-23  6:39     ` Jon Seymour
2005-04-22 22:45   ` Petr Baudis
