* [doc] User Manual Suggestion
From: David Abrahams @ 2009-04-22 19:38 UTC
To: git

http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#how-to-check-out
covers "git reset" way too early, IMO, before one has the conceptual foundation necessary to understand what it means to "modify the current branch to point at v2.6.17". If this operation must be covered this early in the manual, it should probably not be until
http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#manipulating-branches

HTH,

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
* Re: [doc] User Manual Suggestion
From: J. Bruce Fields @ 2009-04-23 17:57 UTC
To: David Abrahams; +Cc: git

On Wed, Apr 22, 2009 at 03:38:52PM -0400, David Abrahams wrote:
>
> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#how-to-check-out
> covers "git reset" way too early, IMO, before one has the conceptual
> foundation necessary to understand what it means to "modify the current
> branch to point at v2.6.17". If this operation must be covered this
> early in the manual, it should probably not be until
> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#manipulating-branches

I agree; we should suggest just a git-checkout (to a detached HEAD) instead, though that needs a little explanation so people aren't scared by the warning message it gives.

I also have a longstanding todo to experiment with rewriting the beginning to use detached heads more and defer branch management till later.

--b.
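The difference between the two operations discussed above can be sketched with a toy model. The Python below is an illustration only and not git's implementation: `ToyRepo`, its fields, and the commit ids `c1` and `c3` are made up for the example. It shows that a reset moves the branch the user is currently on, whereas checking out a tag moves only HEAD and leaves every branch where it was.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Hypothetical stand-in for a repository's refs (not git's real layout)."""
    branches: dict = field(default_factory=lambda: {"master": "c3"})
    tags: dict = field(default_factory=lambda: {"v2.6.17": "c1"})
    head: str = "ref:master"  # "ref:<branch>" when on a branch, else a bare commit id

    def resolve(self, name: str) -> str:
        """Map a tag or branch name to a commit id; pass commit ids through."""
        if name in self.tags:
            return self.tags[name]
        if name in self.branches:
            return self.branches[name]
        return name

    def head_commit(self) -> str:
        """Return the commit that HEAD currently resolves to."""
        if self.head.startswith("ref:"):
            return self.branches[self.head[len("ref:"):]]
        return self.head

    def reset_hard(self, name: str) -> None:
        """Rough analogue of `git reset --hard <name>`: move the current branch."""
        target = self.resolve(name)
        if self.head.startswith("ref:"):
            self.branches[self.head[len("ref:"):]] = target
        else:
            self.head = target  # detached: only HEAD itself moves

    def checkout_detached(self, name: str) -> None:
        """Rough analogue of `git checkout <tag>`: HEAD points at a commit directly."""
        self.head = self.resolve(name)


if __name__ == "__main__":
    a = ToyRepo()
    a.reset_hard("v2.6.17")
    print(a.head, a.branches)  # still on master, but master itself has moved

    b = ToyRepo()
    b.checkout_detached("v2.6.17")
    print(b.head, b.branches)  # HEAD is detached; master is untouched
```

In this sketch the reset rewrites what "master" means, which is why a newcomer needs the branch concept first; the detached checkout changes nothing but HEAD.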
* Re: [doc] User Manual Suggestion
From: Michael Witten @ 2009-04-23 18:37 UTC
To: J. Bruce Fields; +Cc: David Abrahams, git

On Thu, Apr 23, 2009 at 12:57, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Wed, Apr 22, 2009 at 03:38:52PM -0400, David Abrahams wrote:
>>
>> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#how-to-check-out
>> covers "git reset" way too early, IMO, before one has the conceptual
>> foundation necessary to understand what it means to "modify the current
>> branch to point at v2.6.17". If this operation must be covered this
>> early in the manual, it should probably not be until
>> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#manipulating-branches
>
> I agree; we should suggest just a git-checkout (to a detached HEAD)
> instead, though that needs a little explanation so people aren't scared
> by the warning message it gives.

Everyone talks about "before one has the conceptual foundation necessary to understand". Well, here's an idea: The git documentation should start with the concepts!

Why don't the docs start out defining blobs and trees and the object database and references into that database? The reason everything is so confusing is that the understanding is brushed under the tutorial rug. People need to learn how to think before they can effectively learn to start doing.
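As a rough illustration of the concepts named above (objects, an object database, and references into it), here is a small Python sketch. It is a toy under stated assumptions, not git's code: `ToyStore`, `demo`, and the file name `greeting.txt` are hypothetical. The `object_id` header of type, size, and a NUL byte follows the way git names blobs in SHA-1 repositories; the tree and commit payloads are deliberately simplified text and do not match git's real formats.

```python
import hashlib


def object_id(kind: str, payload: bytes) -> str:
    """Name an object by hashing '<kind> <size>', a NUL byte, then its content."""
    header = f"{kind} {len(payload)}\0".encode()
    return hashlib.sha1(header + payload).hexdigest()


class ToyStore:
    """Hypothetical in-memory object database plus a table of references."""

    def __init__(self) -> None:
        self.objects = {}  # object id -> (kind, payload)
        self.refs = {}     # reference name -> object id

    def put(self, kind: str, payload: bytes) -> str:
        """Store an object under the id derived from its content."""
        oid = object_id(kind, payload)
        self.objects[oid] = (kind, payload)
        return oid


def demo() -> ToyStore:
    """Build one blob, one tree, one commit, and a branch pointing at the commit."""
    store = ToyStore()
    blob = store.put("blob", b"hello\n")
    # Simplified text payloads: real tree and commit objects use other layouts.
    tree = store.put("tree", f"blob {blob} greeting.txt\n".encode())
    commit = store.put("commit", f"tree {tree}\n\nfirst commit\n".encode())
    store.refs["refs/heads/master"] = commit
    return store


if __name__ == "__main__":
    s = demo()
    for oid, (kind, _) in s.objects.items():
        print(kind, oid)
    print(s.refs)
```

The point of the sketch is the shape of the model: identical content always gets the same name, objects refer to each other by name, and a reference is just a human-readable label for one of those names.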
* Re: [doc] User Manual Suggestion 2009-04-23 18:37 ` Michael Witten @ 2009-04-23 20:16 ` Jeff King 2009-04-23 20:45 ` Michael Witten 2009-04-23 21:26 ` David Abrahams 2009-04-24 2:29 ` J. Bruce Fields 1 sibling, 2 replies; 90+ messages in thread From: Jeff King @ 2009-04-23 20:16 UTC (permalink / raw) To: Michael Witten; +Cc: J. Bruce Fields, David Abrahams, git On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: > Everyone talks about "before one has the conceptual foundation > necessary to understand". Well, here's an idea: The git documentation > should start with the concepts! > > Why don't the docs start out defining blobs and trees and the object > database and references into that database? The reason everything is > so confusing is that the understanding is brushed under the tutorial > rug. People need to learn how to think before they can effectively > learn to start doing. I agree with you, but not everyone does (and you can find prior debates in the list archives). The user-manual is pretty "top down". There are some "bottom-up" resources available, but I haven't seen one pointed to as "definitive". I think it might actually be nice for there to be a parallel to the user manual that follows the bottom-up approach, and people could read the one that appeals most to them (or if they have a lot of time on their hands, read both and hopefully it makes sense in the middle ;) ). But we would need somebody to volunteer to write it. I would be happy to help out, but I'm too short on time at the moment to be the driving force. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion
From: Michael Witten @ 2009-04-23 20:45 UTC
To: Jeff King; +Cc: J. Bruce Fields, David Abrahams, git

On Thu, Apr 23, 2009 at 15:16, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote:
>
>> Everyone talks about "before one has the conceptual foundation
>> necessary to understand". Well, here's an idea: The git documentation
>> should start with the concepts!
>>
>> Why don't the docs start out defining blobs and trees and the object
>> database and references into that database? The reason everything is
>> so confusing is that the understanding is brushed under the tutorial
>> rug. People need to learn how to think before they can effectively
>> learn to start doing.
>
> I agree with you, but not everyone does (and you can find prior debates
> in the list archives). The user-manual is pretty "top down". There are
> some "bottom-up" resources available, but I haven't seen one pointed to
> as "definitive".
>
> I think it might actually be nice for there to be a
> parallel to the user manual that follows the bottom-up approach, and
> people could read the one that appeals most to them (or if they have a
> lot of time on their hands, read both and hopefully it makes sense in
> the middle ;) ).

I think the main problem, then, is that the tools have a UI that is somewhere in the middle.

However, a discussion of blobs, trees, commits, objects, and references isn't necessarily low-level. It seems to me that it is a high-level understanding of the git world. Without those *definitions*, people are left to their own wrong, inconsistent thoughts.
The low-level stuff is HOW those concepts have been used in the implementation of git: where certain files are stored, how certain bytes are organized in memory, what are the underlying porcelain tools, etc. That's what's low-level.

> But we would need somebody to volunteer to write it. I would be happy to
> help out, but I'm too short on time at the moment to be the driving
> force.

Maybe I'll try to write something, but it won't take place quickly, either. I'd want to read ALL of the existing documentation first.
* Re: [doc] User Manual Suggestion 2009-04-23 20:45 ` Michael Witten @ 2009-04-23 21:31 ` David Abrahams 2009-04-24 0:31 ` Michael Witten 2009-04-24 14:18 ` Jeff King 2009-04-24 14:11 ` Jeff King 1 sibling, 2 replies; 90+ messages in thread From: David Abrahams @ 2009-04-23 21:31 UTC (permalink / raw) To: Michael Witten; +Cc: Jeff King, J. Bruce Fields, git On Apr 23, 2009, at 4:45 PM, Michael Witten wrote: > On Thu, Apr 23, 2009 at 15:16, Jeff King <peff@peff.net> wrote: >> On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: >> >>> Everyone talks about "before one has the conceptual foundation >>> necessary to understand". Well, here's an idea: The git >>> documentation >>> should start with the concepts! >>> >>> Why don't the docs start out defining blobs and trees and the object >>> database and references into that database? The reason everything is >>> so confusing is that the understanding is brushed under the tutorial >>> rug. People need to learn how to think before they can effectively >>> learn to start doing. >> >> I agree with you, but not everyone does (and you can find prior >> debates >> in the list archives). The user-manual is pretty "top down". There >> are >> some "bottom-up" resources available, but I haven't seen one >> pointed to >> as "definitive".I think it might actually be nice for there to be a >> parallel to the user manual that follows the bottom-up approach, and >> people could read the one that appeals most to them (or if they >> have a >> lot of time on their hands, read both and hopefully it makes sense in >> the middle ;) ). > > I think the main problem, then, is that the tools have a UI that is > somewhere in the middle. Well, "the UI" (how many do we really have for Git?) is spread across the spectrum. The git command-line alone lets you do incredibly low- level things that "nobody should ever do" and some really high-level things that are everyone's bread-and-butter. There's no obvious distinction. 
> However, a discussion of blobs, trees, commits, objects, and > references isn't necessarily low-level. It seems to me that it is a > high-level understanding of the git world. Without those > *definitions*, people are left to their own wrong, inconsistent > thoughts. 1000% agreed. > The low-level stuff is HOW those concepts have been used in the > implementation of git: Where certain files are stored, how certain > bytes are organized in memory, what are the underlying porcelain > tools, etc. That what's low-level. Yep >> But we would need somebody to volunteer to write it. I would be >> happy to >> help out, but I'm too short on time at the moment to be the driving >> force. > > Maybe I'll try to write something, but it won't take place quickly, > either. I'd want to read ALL of the existing documentation first. See you in a couple years ;-) -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion
From: Michael Witten @ 2009-04-24 0:31 UTC
To: David Abrahams; +Cc: Jeff King, J. Bruce Fields, git

On Thu, Apr 23, 2009 at 16:31, David Abrahams <dave@boostpro.com> wrote:
>
>> However, a discussion of blobs, trees, commits, objects, and
>> references isn't necessarily low-level. It seems to me that it is a
>> high-level understanding of the git world. Without those
>> *definitions*, people are left to their own wrong, inconsistent
>> thoughts.
>
> 1000% agreed.

I think this is a case in point:

http://marc.info/?l=git&m=124052299832318&w=2
* Re: [doc] User Manual Suggestion 2009-04-23 21:31 ` David Abrahams 2009-04-24 0:31 ` Michael Witten @ 2009-04-24 14:18 ` Jeff King 2009-04-24 14:20 ` J. Bruce Fields 2009-04-24 17:28 ` David Abrahams 1 sibling, 2 replies; 90+ messages in thread From: Jeff King @ 2009-04-24 14:18 UTC (permalink / raw) To: David Abrahams; +Cc: Michael Witten, J. Bruce Fields, git On Thu, Apr 23, 2009 at 05:31:13PM -0400, David Abrahams wrote: >> I think the main problem, then, is that the tools have a UI that is >> somewhere in the middle. > > Well, "the UI" (how many do we really have for Git?) is spread across the > spectrum. The git command-line alone lets you do incredibly low-level > things that "nobody should ever do" and some really high-level things that > are everyone's bread-and-butter. There's no obvious distinction. I think this is a bit better than it used to be. Plumbing commands are mostly hidden outside of the user's PATH. Unfortunately there are still some warts, like the fact that users may be referred to "git help rev-parse" to learn about how revisions are specified. But they have to wade through the information on the "rev-parse" command, which is something that most users will never need to know or care about. A lot of that is historical baggage. The original git was not a VCS but rather a _toolkit_ for building a VCS. So the natural place for talking about parsing revisions was rev-parse, because that was the only way to access the revision parsing code. :) I think a lot of documentation like the "specifying revisions" section of rev-parse might benefit from being split into its own "concept" section, like gitrevisions(7). And commands which allow specifying revisions (at least the major ones, like log, diff, etc) should reference it (but not include it directly, as we do with some documentation snippets -- the point is to make the user aware that they are learning a separate concept that can be applied in multiple places, and to give that concept a name). 
-Peff
* Re: [doc] User Manual Suggestion 2009-04-24 14:18 ` Jeff King @ 2009-04-24 14:20 ` J. Bruce Fields 2009-04-24 17:28 ` David Abrahams 1 sibling, 0 replies; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 14:20 UTC (permalink / raw) To: Jeff King; +Cc: David Abrahams, Michael Witten, git On Fri, Apr 24, 2009 at 10:18:47AM -0400, Jeff King wrote: > On Thu, Apr 23, 2009 at 05:31:13PM -0400, David Abrahams wrote: > > >> I think the main problem, then, is that the tools have a UI that is > >> somewhere in the middle. > > > > Well, "the UI" (how many do we really have for Git?) is spread across the > > spectrum. The git command-line alone lets you do incredibly low-level > > things that "nobody should ever do" and some really high-level things that > > are everyone's bread-and-butter. There's no obvious distinction. > > I think this is a bit better than it used to be. Plumbing commands are > mostly hidden outside of the user's PATH. Unfortunately there are still > some warts, like the fact that users may be referred to "git help > rev-parse" to learn about how revisions are specified. But they have to > wade through the information on the "rev-parse" command, which is > something that most users will never need to know or care about. > > A lot of that is historical baggage. The original git was not a VCS but > rather a _toolkit_ for building a VCS. So the natural place for talking > about parsing revisions was rev-parse, because that was the only way to > access the revision parsing code. :) > > I think a lot of documentation like the "specifying revisions" section > of rev-parse might benefit from being split into its own "concept" > section, like gitrevisions(7). 
> And commands which allow specifying
> revisions (at least the major ones, like log, diff, etc) should
> reference it (but not include it directly, as we do with some
> documentation snippets -- the point is to make the user aware that they
> are learning a separate concept that can be applied in multiple places,
> and to give that concept a name).

I'd be in favor of that.

--b.
* Re: [doc] User Manual Suggestion
From: David Abrahams @ 2009-04-24 17:28 UTC
To: Jeff King; +Cc: Michael Witten, J. Bruce Fields, git

On Apr 24, 2009, at 10:18 AM, Jeff King wrote:
> On Thu, Apr 23, 2009 at 05:31:13PM -0400, David Abrahams wrote:
>
>>> I think the main problem, then, is that the tools have a UI that is
>>> somewhere in the middle.
>>
>> Well, "the UI" (how many do we really have for Git?) is spread across the
>> spectrum. The git command-line alone lets you do incredibly low-level
>> things that "nobody should ever do" and some really high-level things that
>> are everyone's bread-and-butter. There's no obvious distinction.
>
> I think this is a bit better than it used to be. Plumbing commands are
> mostly hidden outside of the user's PATH.

Huh?

   git hash-object
   git cat-file -t ...
   git ls-tree
   git rev-parse
   git write-tree
   git commit-tree
   ...

These are just some of the ones I learned about by reading John Wiegley's "Git From the Bottom Up."

Maybe I'm wrong about rev-parse, but for the most part, having all these low-level commands available through the same executable that's used for "git add," "git merge," "git commit," et al. makes the whole shebang hard to approach. It would be better for users if the low-level stuff was accessed some other way.

> A lot of that is historical baggage. The original git was not a VCS but
> rather a _toolkit_ for building a VCS. So the natural place for talking
> about parsing revisions was rev-parse, because that was the only way to
> access the revision parsing code. :)

I understand that, but it doesn't change the present reality.

> I think a lot of documentation like the "specifying revisions" section
> of rev-parse might benefit from being split into its own "concept"
> section, like gitrevisions(7).
Yes, please. [excuse me, but what the #@&*! is "porcelainish" supposed to mean? (http://www.kernel.org/pub/software/scm/git/docs/git-rev-parse.html )] > And commands which allow specifying > revisions (at least the major ones, like log, diff, etc) should > reference it (but not include it directly, as we do with some > documentation snippets -- the point is to make the user aware that > they > are learning a separate concept that can be applied in multiple > places, > and to give that concept a name). Very nice. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 17:28 ` David Abrahams @ 2009-04-24 18:15 ` Jeff King 2009-04-24 19:00 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 18:15 UTC (permalink / raw) To: David Abrahams; +Cc: Michael Witten, J. Bruce Fields, git On Fri, Apr 24, 2009 at 01:28:35PM -0400, David Abrahams wrote: >> I think this is a bit better than it used to be. Plumbing commands are >> mostly hidden outside of the user's PATH. > > Huh? > > git hash-object > git cat-file -t ... > git ls-tree > git rev-parse > git write-tree > git commit-tree How did you find out about them? They are not in your PATH, so shell completion doesn't find them. They are not in the programmable bash completion. They are not in the short command list git gives you when you type "git help" or "git" without arguments. So you must have read about them somewhere... > These are just some of the ones I learned about by reading John Wiegley's > "Git From the Bottom Up." ...like here. So if that document gave you the impression that those are part of an everyday git workflow, then I think the document is at fault, not git itself. I admit I haven't read "Git From the Bottom Up" carefully, but I think what Michael is proposing would probably start a little higher from the bottom than that document. You can give the concepts of the object types, show them in pretty-printed form with "git show", and not worry about telling the user "this is how 'git commit' could be implemented in terms of primitive operations". And then you can avoid most of the low-level commands entirely. > Maybe I'm wrong about rev-parse, but for the most part, having all these > low-level commands available through the same executable that's used for > "git add," "git merge," "git commit," et. al. makes the whole shebang hard > to approach. It would be better for users if the low-level stuff was > accessed some other way. Perhaps. 
The general approach is to make those commands accessible as "git foo", but not to _advertise_ them in the same way as the porcelain commands. The idea was to give a uniform calling convention without unnecessarily confusing users by presenting a large number of infrequently-used commands. At any rate, it is too late to change the calling convention for plumbing. The whole point of them is to be a stable interface for scripting. Changing them to "git low-level rev-parse" (if it was even something that we wanted to do, which I don't think it is) would break everyone's scripts. >> A lot of that is historical baggage. The original git was not a VCS but >> rather a _toolkit_ for building a VCS. So the natural place for talking >> about parsing revisions was rev-parse, because that was the only way to >> access the revision parsing code. :) > > I understand that, but it doesn't change the present reality. Right. I'm just trying to say how we got here, which I think is relevant because it gives a hint of what directions we can go in. In other words, nobody _designed_ what we have now. It evolved into this state, which obviously has some drawbacks. So I think you won't find much resistance in trying to evolve the documentation to present git more as a coherent tool, and less as a set of unrelated commands. > [excuse me, but what the #@&*! is "porcelainish" supposed to mean? > (http://www.kernel.org/pub/software/scm/git/docs/git-rev-parse.html)] Heh. That one is particularly egregious, because it rests on several layers of git jargon. The low-level tools are plumbing, like pipes and valves. The high-level commands intended for end users are porcelain, like sinks and toilets. The -ish suffix is often used in git to refer to a type, or something we can convert into a type (like a "tree-ish" could be a tree object, or a commit object which points to a tree, or a tag object which points to a commit which points to a tree). 
So I think by saying "porcelain-ish" here, the author meant "not just porcelain, but other things which take revisions and behave sort of like porcelain". Which is a truly horrible thing to throw at a new user who just wants to see how to specify a revision. So yeah, if you are saying that could be worded better, I absolutely agree. There are a lot of spots like that. They are getting fixed slowly over time. I'm not sure if that is enough, or if somebody knowledgeable really needs to take a sledge hammer to the existing documentation and just reorganize and rewrite a lot of it. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
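The "-ish" idea described above (a tree-ish is a tree, or anything that can be followed to one) can be sketched as a short loop. This is a toy in Python, not git's implementation: the object ids `t1`, `c1`, `g1`, `b1` and the dictionary layout are invented for the example, and only the tag-to-commit-to-tree chain from the message is modelled.

```python
# Made-up object ids; each entry records only its type and what it points to.
OBJECTS = {
    "t1": {"type": "tree"},
    "c1": {"type": "commit", "tree": "t1"},
    "g1": {"type": "tag", "object": "c1"},
    "b1": {"type": "blob"},
}


def peel_to_tree(oid: str, objects: dict) -> str:
    """Follow tag -> commit -> tree links until a tree is reached."""
    while True:
        obj = objects[oid]
        if obj["type"] == "tree":
            return oid
        if obj["type"] == "tag":
            oid = obj["object"]
        elif obj["type"] == "commit":
            oid = obj["tree"]
        else:
            # A blob points at nothing, so it can never stand in for a tree.
            raise ValueError(f"{oid} is a {obj['type']}, not a tree-ish")


if __name__ == "__main__":
    for name in ("t1", "c1", "g1"):
        print(name, "->", peel_to_tree(name, OBJECTS))
```

In this model the tree, the commit, and the tag are all acceptable wherever a tree is wanted, because each can be converted to the same tree; the blob cannot.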
* Re: [doc] User Manual Suggestion
From: David Abrahams @ 2009-04-24 19:00 UTC
To: Jeff King; +Cc: Michael Witten, J. Bruce Fields, git

On Apr 24, 2009, at 2:15 PM, Jeff King wrote:
> On Fri, Apr 24, 2009 at 01:28:35PM -0400, David Abrahams wrote:
>
>>> I think this is a bit better than it used to be. Plumbing commands are
>>> mostly hidden outside of the user's PATH.
>>
>> Huh?
>>
>> git hash-object
>> git cat-file -t ...
>> git ls-tree
>> git rev-parse
>> git write-tree
>> git commit-tree
>
> How did you find out about them?

The first time?

  $ man git

> They are not in your PATH, so shell
> completion doesn't find them.

Huh? `which git` works. ls-tree is an argument to git as far as I know. Yes, I know there are aliases like git-ls-tree somewhere, but that only adds to the sense that all commands are equal.

> They are not in the programmable bash
> completion. They are not in the short command list git gives you when
> you type "git help" or "git" without arguments.
>
> So you must have read about them somewhere..

  $ man git

which makes no distinction.

  $ xxx [--]help

is usually OK if I already know xxx pretty well and just want a refresher. If I know I'll need a little more than that, I use man straight away.

>> These are just some of the ones I learned about by reading John Wiegley's
>> "Git From the Bottom Up."
>
> ...like here.

That's where I learned *what they do*.

> So if that document gave you the impression that those are
> part of an everyday git workflow, then I think the document is at fault,
> not git itself.

It didn't.

> I admit I haven't read "Git From the Bottom Up" carefully, but I think
> what Michael is proposing would probably start a little higher from the
> bottom than that document.

Yes, please. "Git for Computer Scientists" is a great foundation.
From there add more information about naming things so I know what things like remotes/origin/master mean when I see them in gitk, and I'm off to the races. > You can give the concepts of the object > types, show them in pretty-printed form with "git show", and not worry > about telling the user "this is how 'git commit' could be > implemented in > terms of primitive operations". And then you can avoid most of the > low-level commands entirely. Yes, that's fine. Although I think there may be some things in GFTBU that are good fundamental concepts. There's a nice list of terms with definitions early in the document. >> Maybe I'm wrong about rev-parse, but for the most part, having all >> these >> low-level commands available through the same executable that's >> used for >> "git add," "git merge," "git commit," et. al. makes the whole >> shebang hard >> to approach. It would be better for users if the low-level stuff was >> accessed some other way. > > Perhaps. The general approach is to make those commands accessible as > "git foo", but not to _advertise_ them in the same way as the > porcelain > commands. What is "porcelain," please? This is one among many examples of jargon used only (or encountered by me for the first time) in the Git community. > The idea was to give a uniform calling convention without > unnecessarily confusing users by presenting a large number of > infrequently-used commands. It's not working, I'm sorry to say. > At any rate, it is too late to change the calling convention for > plumbing. I disagree. You can leave the old functionality there in a "deprecated" state and change the way you advertise it. It would even help a lot if the plumbing were all spelled "git-xxx" and the high level stuff were "git xxx." > The whole point of them is to be a stable interface for > scripting. Changing them to "git low-level rev-parse" (if it was even > something that we wanted to do, which I don't think it is) would break > everyone's scripts. See above. 
>> [excuse me, but what the #@&*! is "porcelainish" supposed to mean? >> (http://www.kernel.org/pub/software/scm/git/docs/git-rev-parse.html)] > > Heh. That one is particularly egregious, because it rests on several > layers of git jargon. The low-level tools are plumbing, like pipes and > valves. ? I use the valves on my kitchen sink all the time. > The high-level commands intended for end users are porcelain, > like sinks and toilets. The -ish suffix is often used in git to > refer to > a type, or something we can convert into a type (like a "tree-ish" > could > be a tree object, or a commit object which points to a tree, or a tag > object which points to a commit which points to a tree). So I think by > saying "porcelain-ish" here, the author meant "not just porcelain, but > other things which take revisions and behave sort of like porcelain". bah. humbug. > Which is a truly horrible thing to throw at a new user who just > wants to > see how to specify a revision. yeeeeah. > So yeah, if you are saying that could be worded better, I absolutely > agree. There are a lot of spots like that. They are getting fixed > slowly > over time. I'm not sure if that is enough, or if somebody > knowledgeable > really needs to take a sledge hammer to the existing documentation and > just reorganize and rewrite a lot of it. I'm thinking the latter. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 19:00 ` David Abrahams @ 2009-04-24 20:24 ` Jeff King 2009-04-24 21:06 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 20:24 UTC (permalink / raw) To: David Abrahams; +Cc: Michael Witten, J. Bruce Fields, git On Fri, Apr 24, 2009 at 03:00:19PM -0400, David Abrahams wrote: >> How did you find out about them? > > The first time? > > $ man git > > [...] > > which makes no distinction [between porcelain and plumbing]. Really? The command list in my version is divided into "HIGH-LEVEL COMMANDS (PORCELAIN)" and "LOW-LEVEL COMMANDS (PLUMBING)", with the commands you mentioned falling into the latter. And skimming "git log Documentation/git.txt", it looks like it has been that way for some time. There is a little discussion under the plumbing section of what plumbing is. It could perhaps be more emphatic in warning regular users away. >> They are not in your PATH, so shell >> completion doesn't find them. > > Huh? `which git` works. ls-tree is an argument to git as far as I know. Yes, but shell completion will never present you with the text "ls-tree". You have to have found out about it somewhere else (and completion used to show, because git-ls-tree was in the PATH). > $ xxx [--]help > > is usually OK if I already know xxx pretty well and just want a > refresher. If know I'll need a little more than that, I use man straight > away. git --help shows a list of common commands, but otherwise "git help foo" and "git foo --help" _do_ show the manpage. It may be that "man git" could use some cleanup; specific suggestions are welcome. > What is "porcelain," please? This is one among many examples of jargon > used only (or encountered by me for the first time) in the Git community. I think I ended up explaining it later in my email, but let me know if you are still confused. 
>> The idea was to give a uniform calling convention without
>> unnecessarily confusing users by presenting a large number of
>> infrequently-used commands.
>
> It's not working, I'm sorry to say.

Right, that's why I'm trying to figure out why you are hung up on the low-level commands. The idea was that you wouldn't need to be exposed to them at all, but obviously you were (or if you were exposed, it would be in a list that was clearly marked as "this is low-level stuff that you don't really need to worry about"). So I'm trying to figure out where it went wrong.

>> At any rate, it is too late to change the calling convention for
>> plumbing.
>
> I disagree. You can leave the old functionality there in a "deprecated"
> state and change the way you advertise it.

But does that really help? It means that "git hash-object" is still there, which I thought was the problem you had. You can argue that it wouldn't be advertised to users, and so wouldn't be a problem, but that is _already_ the strategy we are using. So either that strategy is fine, in which case we are on the right track but may still have some work to do in properly implementing it. Or it's not, in which case your proposal is no better.

> It would even help a lot if the plumbing were all spelled "git-xxx"
> and the high level stuff were "git xxx."

Differentiating calling conventions like that was proposed when dashed forms were deprecated and removed from the PATH. But if we had dashed forms for plumbing (i.e., not forwarding them via the "git" wrapper), then you have to do one of:

  - put them in the user's PATH. Now tab completion or looking in your PATH means you see _just_ the plumbing commands, and none of the high level ones. Which is one of the reasons they were removed from the PATH in the first place (due to numerous user complaints).

  - put them elsewhere, and force plumbing users to add $GIT_EXEC_PATH to their PATH. That becomes very annoying for casual plumbing users.
If you come to the mailing list with a problem, I would have to jump through extra hoops to ask you to show me the output of "git ls-files". Not to mention that the git wrapper does other useful things besides simply exec'ing. For example, it supports --git-dir, --bare, etc. So the problem is that the low-level commands _are_ still useful, and many people still want to call them, just like regular git commands. It's just that they are numerous and low-level, which makes them daunting for new users. And it has become obvious over several years of the git mailing list that users, once they see mention of a command, must start investigating it to find out if and how it is useful. And I am not saying that is a failing of users; on the contrary, I think it is quite a healthy behavior on a unix-ish system. But it means that if we want not to advertise low-level commands, we have to be very careful about the ways in which we mention them. Perhaps it would make sense for each plumbing command's man page to start with something like "this is a low-level command used for scripting git or investigating its internals. For high-level use, you may be more interested in $X", where $X may be "git commit" for write-tree, commit-tree, etc. And that would at least help intercept users before they get too confused. >>> [excuse me, but what the #@&*! is "porcelainish" supposed to mean? >>> (http://www.kernel.org/pub/software/scm/git/docs/git-rev-parse.html)] >> >> Heh. That one is particularly egregious, because it rests on several >> layers of git jargon. The low-level tools are plumbing, like pipes and >> valves. > > ? I use the valves on my kitchen sink all the time. Sorry, I meant the ones under the sink, that you would use if you were replacing the faucet. I would call the ones above "taps". But hopefully you get a sense of the distinction between plumbing and porcelain. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
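For readers following the thread, here is a minimal sketch of the point made above: a low-level command such as ls-files is called through the same "git" wrapper as a high-level one, and the wrapper's own options (--git-dir is used here) apply to both. The throwaway repository and the file name greeting.txt are assumptions made up for the illustration; they are not part of the original discussion.

```shell
set -e

# Throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > greeting.txt

# High-level command: stage the file.
git add greeting.txt

# Low-level command, invoked through the same "git" wrapper.
git ls-files

# The wrapper's --git-dir option also works for low-level commands,
# so the index can be listed from outside the repository.
cd /
listed=$(git --git-dir="$repo/.git" ls-files)
echo "$listed"

# Directory holding the dashed programs (git-ls-files and so on)
# that are no longer placed in the user's PATH.
exec_path=$(git --exec-path)
echo "$exec_path"
```

The last command shows where a dashed-only calling convention would force casual users of the low-level commands to look.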
* Re: [doc] User Manual Suggestion 2009-04-24 20:24 ` Jeff King @ 2009-04-24 21:06 ` David Abrahams 2009-04-24 22:45 ` Björn Steinbrink 0 siblings, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-24 21:06 UTC (permalink / raw) To: Jeff King; +Cc: Michael Witten, J. Bruce Fields, git On Apr 24, 2009, at 4:24 PM, Jeff King wrote: > On Fri, Apr 24, 2009 at 03:00:19PM -0400, David Abrahams wrote: > >>> How did you find out about them? >> >> The first time? >> >> $ man git >> >> [...] >> >> which makes no distinction [between porcelain and plumbing]. > > Really? The command list in my version is divided into "HIGH-LEVEL > COMMANDS (PORCELAIN)" and "LOW-LEVEL COMMANDS (PLUMBING)", with the > commands you mentioned falling into the latter. And skimming "git log > Documentation/git.txt", it looks like it has been that way for some > time. Sorry, you are totally right. The list is just so crazy-long; I may have skimmed it. >> Huh? `which git` works. ls-tree is an argument to git as far as I >> know. > > Yes, but shell completion will never present you with the text > "ls-tree". You have to have found out about it somewhere else (and > completion used to show, because git-ls-tree was in the PATH). > >> $ xxx [--]help >> >> is usually OK if I already know xxx pretty well and just want a >> refresher. If know I'll need a little more than that, I use man >> straight >> away. > > git --help shows a list of common commands, but otherwise "git help > foo" and "git foo --help" _do_ show the manpage. It may be that "man > git" could use some cleanup; specific suggestions are welcome. > >> What is "porcelain," please? This is one among many examples of >> jargon >> used only (or encountered by me for the first time) in the Git >> community. > > I think I ended up explaining it later in my email, but let me know if > you are still confused. Nope; I'm fine now. 
It's not a great analogy, because everyone who uses a sink ends up dealing with spigots and valves, but I get it. >>> The idea was to give a uniform calling convention without >>> unnecessarily confusing users by presenting a large number of >>> infrequently-used commands. >> >> It's not working, I'm sorry to say. > > Right, that's why I'm trying to figure out why you are hung up on the > low-level commands. The idea was that you wouldn't need to be > exposed to > them at all, but obviously you were (or if you were exposed, it > would be > in a list that was clearly marked as "this is low-level stuff that you > don't really need to worry about". So I'm trying to figure out where > it > went wrong. I'm sorry that I can't be much help in that department. If I really knew how I ended up with that wrong impression, I probably would have corrected it already. It's weird; git is composed of ideas that are all very familiar to me (reference-counted management of immutable data, hashing, etc.) yet for me, getting to know it has been really tough. By contrast, for example, subversion was instantly understandable when I pawed through the SVN book. >>> At any rate, it is too late to change the calling convention for >>> plumbing. >> >> I disagree. You can leave the old functionality there in a >> "deprecated" >> state and change the way you advertise it. > > But does that really help? It means that "git hash-object" is still > there, which I thought was the problem you had. You can argue that it > wouldn't be advertised to users, and so wouldn't be a problem, but > that > is _already_ the strategy we are using. So either that strategy is > fine, > in which case we are on the right track but may still have some work > to > do in properly implementing it. Or it's not, in which case your > proposal > is no better. You've got me stumped there, I have to admit. >> It would even help a lot if the plumbing were all spelled "git-xxx" >> and the high level stuff were "git xxx." 
> > Differentating calling conventions like that was proposed when dashed > forms were deprecated and removed from the PATH. But if we had dashed > forms for plumbing (i.e., not forwarding them via the "git" wrapper), > then you have to do one of: > > - put them in the user's PATH. Now tab completion or looking in your > PATH means you see _just_ the plumbing commands, and none of the > high level ones. Which is one of the reasons they were removed from > the PATH in the first place (due to numerous user complaints). > > - put them elsewhere, and force plumbing users to add $GIT_EXEC_PATH > to their PATH. That becomes very annoying for casual plumbing > users. > If you come to the mailing list with a problem, I would have to > jump > through extra hoops to ask you to show me the output of "git > ls-files". I see your point. llgit xxx ? > Not to mention that the git wrapper does other useful things besides > simply exec'ing. For example, it supports --git-dir, --bare, etc. > So the problem is that the low-level commands _are_ still useful, and > many people still want to call them, just like regular git commands. > It's just that they are numerous and low-level, which makes them > daunting for new users. > > And it has become obvious over several years of the git mailing list > that users, once they see mention of a command, must start > investigating > it to find out if and how it is useful. And I am not saying that is a > failing of users; on the contrary, I think it is quite a healthy > behavior on a unix-ish system. But it means that if we want not to > advertise low-level commands, we have to be very careful about the > ways > in which we mention them. > > Perhaps it would make sense for each plumbing command's man page to > start with something like "this is a low-level command used for > scripting git or investigating its internals. For high-level use, you > may be more interested in $X", where $X may be "git commit" for > write-tree, commit-tree, etc. 
And that would at least help intercept > users before they get too confused. Sounds like a great idea to me. >>>> [excuse me, but what the #@&*! is "porcelainish" supposed to mean? >>>> (http://www.kernel.org/pub/software/scm/git/docs/git-rev- >>>> parse.html)] >>> >>> Heh. That one is particularly egregious, because it rests on several >>> layers of git jargon. The low-level tools are plumbing, like pipes >>> and >>> valves. >> >> ? I use the valves on my kitchen sink all the time. > > Sorry, I meant the ones under the sink, that you would use if you were > replacing the faucet. I would call the ones above "taps". But > hopefully > you get a sense of the distinction between plumbing and porcelain. I know, but the point is, they're not porcelain. They're "plumbing fixtures." I think UI/API works way better than porcelain/plumbing. We are, after all, programmers. It would also be good to link to a definition any time you use a term of art in the docs. I would even do that in the case of UI/API since the distinction could appear to be subtle. I should also say, most of the docs and interfaces I see in Git (and its wrappers, web interfaces, etc.) give the SHA1 hashes way too much exposure. The times when it's actually more convenient to use a hash instead of one of the other notations are rare, and if hashes weren't so exposed I bet most interfaces would make those other names more available. One reason I think hashes retain their prominent exposure is that you have no other reasonably stable way of referring to commits, since branch~NN counts backward from HEAD. Adding such a thing would help. Oh, one other specific issue: the rev-parse manpage uses $GIT_DIR without saying what it is. I *think* that means the root of the working copy and has nothing to do with environment variables, but it's hard to be sure, and if I'm right about that, it's misleading notation. 
Someone needs to get gitiseasy.org/gitiseasy.net and then provide content that lives up to the name :^) -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 21:06 ` David Abrahams @ 2009-04-24 22:45 ` Björn Steinbrink 2009-04-25 0:39 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-04-24 22:45 UTC (permalink / raw) To: David Abrahams; +Cc: Jeff King, Michael Witten, J. Bruce Fields, git On 2009.04.24 17:06:27 -0400, David Abrahams wrote: > On Apr 24, 2009, at 4:24 PM, Jeff King wrote: >> On Fri, Apr 24, 2009 at 03:00:19PM -0400, David Abrahams wrote: >>> It would even help a lot if the plumbing were all spelled "git-xxx" >>> and the high level stuff were "git xxx." >> >> Differentating calling conventions like that was proposed when dashed >> forms were deprecated and removed from the PATH. But if we had dashed >> forms for plumbing (i.e., not forwarding them via the "git" wrapper), >> then you have to do one of: >> >> - put them in the user's PATH. Now tab completion or looking in your >> PATH means you see _just_ the plumbing commands, and none of the >> high level ones. Which is one of the reasons they were removed >> from the PATH in the first place (due to numerous user >> complaints). >> >> - put them elsewhere, and force plumbing users to add $GIT_EXEC_PATH >> to their PATH. That becomes very annoying for casual plumbing >> users. If you come to the mailing list with a problem, I would >> have to jump through extra hoops to ask you to show me the output >> of "git ls-files". > > I see your point. > > llgit xxx > > ? If that was the exclusive way of calling the low-level commands, that would still break existing scripts. And if you keep e.g. "git write-tree" and just add "llgit write-tree" as an alias, that will IMHO just cause more confusion once old and new git users meet. And I agree with Peff, it's not important whether it's "git foo", "llgit foo", "git lowlevel foo" or something else. It's just about how much your users really _need_ to know and how you tell them to use the stuff. 
> I think UI/API works way better than porcelain/plumbing. We are, after > all, programmers. We are programmers, but not all git users are programmers. > It would also be good to link to a definition any time you use a term > of art in the docs. I would even do that in the case of UI/API since > the distinction could appear to be subtle. > > I should also say, most of the docs and interfaces I see in Git (and > its wrappers, web interfaces, etc.) give the SHA1 hashes way too much > exposure. The times when it's actually more convenient to use a hash > instead of one of the other notations are rare, How often do you need a name for a commit shown by a command and can accept that it is not stable? I usually need a name because I want to reference that commit later on, either because I need to talk to other users, or because I'm working on something and might need to look at that commit now and then, regardless on my current state of things. One big exception in my workflow is when I use "git blame", then I usually just need the name once to look at the full commit. But then I prefer a 7-8 characters long sha-1 prefix to something like improve_foo_speed~132^12~1^3. And "pseudo-stable" numbers have been discussed to death. > and if hashes weren't so exposed I bet most interfaces would make > those other names more available. One reason I think hashes retain > their prominent exposure is that you have no other reasonably stable > way of referring to commits, since branch~NN counts backward from > HEAD. Adding such a thing would help. It counts backwards from "branch". > Oh, one other specific issue: the rev-parse manpage uses $GIT_DIR > without saying what it is. I *think* that means the root of the > working copy and has nothing to do with environment variables, but > it's hard to be sure, and if I'm right about that, it's misleading > notation. $GIT_DIR means the .git directory of a non-bare repo. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
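As an illustration of the naming questions discussed above, the following sketch creates a scratch repository with three empty commits and then names the oldest one in three ways: relative to the branch tip (branch~2), by full object name, and by abbreviated object name. It also prints the directory that rev-parse reports for --git-dir. The repository, the commit messages and the "Example" identity are assumptions invented for the example.

```shell
set -e

# Scratch repository with three empty commits (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
for msg in first second third; do
    git commit -q --allow-empty -m "$msg"
done

# Name of the current branch; the default name depends on the git version.
branch=$(git symbolic-ref --short HEAD)

# "$branch~2" counts two first-parent steps back from the tip of $branch,
# so the same expression names a different commit once the branch grows.
oldest=$(git rev-parse "$branch~2")

# An abbreviated object name keeps pointing at the same commit.
short=$(git rev-parse --short "$branch~2")
echo "$branch~2 is $oldest (short: $short)"

# Run from the top of a non-bare working tree, this prints ".git".
git_dir=$(git rev-parse --git-dir)
echo "$git_dir"
```

Adding a fourth commit and repeating the rev-parse calls would show "$branch~2" moving to the commit labelled "second" while the object names stay fixed, which is the stability trade-off being argued about in this subthread.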
* Re: [doc] User Manual Suggestion 2009-04-24 22:45 ` Björn Steinbrink @ 2009-04-25 0:39 ` David Abrahams 2009-04-26 23:35 ` Björn Steinbrink 0 siblings, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-25 0:39 UTC (permalink / raw) To: Björn Steinbrink; +Cc: Jeff King, Michael Witten, J. Bruce Fields, git On Apr 24, 2009, at 6:45 PM, Björn Steinbrink wrote: >> I think UI/API works way better than porcelain/plumbing. We are, >> after >> all, programmers. > > We are programmers, but not all git users are programmers. I'm sure you will admit that the vast majority are programmers. This is about speaking effectively to your primary audience. >> It would also be good to link to a definition any time you use a term >> of art in the docs. I would even do that in the case of UI/API since >> the distinction could appear to be subtle. >> >> I should also say, most of the docs and interfaces I see in Git (and >> its wrappers, web interfaces, etc.) give the SHA1 hashes way too much >> exposure. The times when it's actually more convenient to use a hash >> instead of one of the other notations are rare, > > How often do you need a name for a commit shown by a command and can > accept that it is not stable? I can accept it as long as it's stable inside my own repo. Maybe I need the SHA1 to talk about it wherever it may roam. I think you could count in the other direction (i.e. from the roots instead of the leaves) to get fairly stable symbolic names. Also, I don't think I need to see the hashes for trees and blobs most of the time. > I usually need a name because I > want to reference that commit later on, either because I need to > talk to > other users, or because I'm working on something and might need to > look > at that commit now and then, regardless on my current state of things. > One big exception in my workflow is when I use "git blame", then I > usually just need the name once to look at the full commit. 
But then I > prefer a 7-8 characters long sha-1 prefix to something like > improve_foo_speed~132^12~1^3. And "pseudo-stable" numbers have been > discussed to death. Okay, I "say uncle." >> and if hashes weren't so exposed I bet most interfaces would make >> those other names more available. One reason I think hashes retain >> their prominent exposure is that you have no other reasonably stable >> way of referring to commits, since branch~NN counts backward from >> HEAD. Adding such a thing would help. > > It counts backwards from "branch". Right, thanks. >> Oh, one other specific issue: the rev-parse manpage uses $GIT_DIR >> without saying what it is. I *think* that means the root of the >> working copy and has nothing to do with environment variables, but >> it's hard to be sure, and if I'm right about that, it's misleading >> notation. > > $GIT_DIR means the .git directory of a non-bare repo. Thanks for clarifying. But don't neglect to fix the docs so the next guy doesn't have to ask ;-) BTW, "[non-]bare repo" is yet another Git-specific jargon. I know what it means... again, only because I asked someone. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:39 ` David Abrahams @ 2009-04-26 23:35 ` Björn Steinbrink 0 siblings, 0 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 23:35 UTC (permalink / raw) To: David Abrahams; +Cc: Jeff King, Michael Witten, J. Bruce Fields, git On 2009.04.24 20:39:12 -0400, David Abrahams wrote: > > On Apr 24, 2009, at 6:45 PM, Björn Steinbrink wrote: > >>> I think UI/API works way better than porcelain/plumbing. We are, >>> after all, programmers. >> >> We are programmers, but not all git users are programmers. > > I'm sure you will admit that the vast majority are programmers. This is > about speaking effectively to your primary audience. My experience says to at least drop the "vast", but that might be biased, due to the fact that the non-programmers probably need more time when you explain things to them. But thinking about it again, I don't think I like UI/API regardless of that. High-Level/Low-Level yes, but API? No. The plumbing is meant to be stable so it can serve as an API, and it also has options that only make sense when you use it that way (e.g. the parse-opt support in rev-parse) but I also happen to just use those programs as a UI. For example ls-files, ls-remote, or apply. And git(1) also has the sections titled "HIGH-LEVEL COMMANDS (PORCELAIN)" and "LOW-LEVEL COMMANDS (PLUMBING)". So if we were to get rid of the porcelain and plumbing terms, then _I_ would go for just "high-level commands" and "low-level commands". >>> I should also say, most of the docs and interfaces I see in Git (and >>> its wrappers, web interfaces, etc.) give the SHA1 hashes way too much >>> exposure. The times when it's actually more convenient to use a hash >>> instead of one of the other notations are rare, >> >> How often do you need a name for a commit shown by a command and can >> accept that it is not stable? > > I can accept it as long as it's stable inside my own repo. 
Maybe I > need the SHA1 to talk about it wherever it may roam. I think you > could count in the other direction (i.e. from the roots instead of the > leaves) to get fairly stable symbolic names. I'm sure this has been discussed in the earlier "stable revision numbers" threads as well, so you can find more information there, but I just want to mention that one drawback of this is that those numbers still have no notion of "commit age". You could have 5000 commits in your repo, and then you fetch someone else's stuff that might have some very old stuff that you don't have yet. And that gets high numbers now. So 5051 might be older than 200. Doesn't exactly help to make those numbers "useful". Just like the "gaps" you get by using e.g. rebase -i or other means that cause commits to be garbage collected. > Also, I don't think I need to see the hashes for trees and blobs most of > the time. OK, I think I finally see what you might mean there. I'm almost exclusively using the CLI and gitk and seldom see tree/blob object names in a prominent way unless I ask for them. But I just noticed that gitweb is at least showing a "daunting" number of object names without further details when you ask for a "commit", while the "commitdiff" is closer to what "git show <commit>" would show. And yeah, I think that could be improved, moving the object name more into the "background" (I don't think it should be completely removed, just be less prominent). Any other "high-level" tool that you noticed being noisy about tree/blob hashes? >>> Oh, one other specific issue: the rev-parse manpage uses $GIT_DIR >>> without saying what it is. I *think* that means the root of the >>> working copy and has nothing to do with environment variables, but >>> it's hard to be sure, and if I'm right about that, it's misleading >>> notation. >> >> $GIT_DIR means the .git directory of a non-bare repo. > > > Thanks for clarifying. 
But don't neglect to fix the docs so the next > guy doesn't have to ask ;-) Hm, I provide the information, you provide the patch? ;-) Hm, maybe I'll find some time to provide one myself. But my git and general todo lists already grew beyond all limits... > BTW, "[non-]bare repo" is yet another Git-specific jargon. I know what > it means... again, only because I asked someone. At least "bare repository" appears as an entry in the glossary (gitglossary(7), also reachable via "git help glossary"). Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-23 20:45 ` Michael Witten 2009-04-23 21:31 ` David Abrahams @ 2009-04-24 14:11 ` Jeff King 2009-04-24 14:30 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 14:11 UTC (permalink / raw) To: Michael Witten; +Cc: J. Bruce Fields, David Abrahams, git On Thu, Apr 23, 2009 at 03:45:46PM -0500, Michael Witten wrote: > However, a discussion of blobs, trees, commits, objects, and > references isn't necessarily low-level. It seems to me that it is a > high-level understanding of the git world. Without those > *definitions*, people are left to their own wrong, inconsistent > thoughts. > > The low-level stuff is HOW those concepts have been used in the > implementation of git: Where certain files are stored, how certain > bytes are organized in memory, what are the underlying porcelain > tools, etc. That what's low-level. I think I wasn't clear in my original message. I didn't mean teaching low-level stuff like plumbing or file layouts. By "bottom-up" I really meant teaching concepts (like objects, their types, and references), from which user operations and workflows can be explained (or often deduced by the user). Whereas a top-down approach would _start_ with workflows and say "To accomplish X, do Y". So I think we are in agreement about the right "level" to start at. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 14:11 ` Jeff King @ 2009-04-24 14:30 ` Michael Witten 2009-04-24 14:33 ` Michael Witten 2009-04-24 15:04 ` Jeff King 0 siblings, 2 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 14:30 UTC (permalink / raw) To: Jeff King; +Cc: J. Bruce Fields, David Abrahams, git On Fri, Apr 24, 2009 at 09:11, Jeff King <peff@peff.net> wrote: > I think I wasn't clear in my original message. I didn't mean teaching > low-level stuff like plumbing or file layouts. By "bottom-up" I really > meant teaching concepts (like objects, their types, and references), > from which user operations and workflows can be explained (or often > deduced by the user). Whereas a top-down approach would _start_ with > workflows and say "To accomplish X, do Y". I knew you would make exactly this rebuttal ;-D However, notice that you can't reasonably be expected to understand "accomplish X" without having concepts like objects and references. The reason most people get by is that git's operation can be compatible with a number of other theories people might have already picked up from using computers. The trouble starts when their existing theories don't mesh well with the underlying git theory, leading the user to develop the equivalent of epicycles in order to explain to himself what's going on. Basically, the problem is that the documentation is currently catering to people who just want to download source files (as Bruce basically said); a quick shell synopsis for this is fine, but there needs to be documentation solely devoted to understanding git fully and precisely. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 14:30 ` Michael Witten @ 2009-04-24 14:33 ` Michael Witten 2009-04-24 15:04 ` Jeff King 1 sibling, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 14:33 UTC (permalink / raw) To: Jeff King; +Cc: J. Bruce Fields, David Abrahams, git On Fri, Apr 24, 2009 at 09:30, Michael Witten <mfwitten@gmail.com> wrote: > there needs to be > documentation solely devoted to understanding git fully and precisely. A user should be able to read it from top to bottom in one pass, with no jumping around or later clarifications. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 14:30 ` Michael Witten 2009-04-24 14:33 ` Michael Witten @ 2009-04-24 15:04 ` Jeff King 2009-04-24 15:18 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 15:04 UTC (permalink / raw) To: Michael Witten; +Cc: J. Bruce Fields, David Abrahams, git On Fri, Apr 24, 2009 at 09:30:20AM -0500, Michael Witten wrote: > On Fri, Apr 24, 2009 at 09:11, Jeff King <peff@peff.net> wrote: > > I think I wasn't clear in my original message. I didn't mean teaching > > low-level stuff like plumbing or file layouts. By "bottom-up" I really > > meant teaching concepts (like objects, their types, and references), > > from which user operations and workflows can be explained (or often > > deduced by the user). Whereas a top-down approach would _start_ with > > workflows and say "To accomplish X, do Y". > > I knew you would make exactly this rebuttle ;-D > > However, notice that you can't reasonably be expected to understand > "accomplish X" without having concepts like objects and references. Heh. I don't think you also predicted the paragraph that I ended up deleting, which made it more clear that I was not trying to rebut, but rather agree. Like you, I think that not teaching concepts first leads to confusion later. Version control (or at least git) is just complex enough that you are much better off understanding what is happening than simply following a recipe. So when your recipe doesn't go as planned, or you don't know which recipe to use, or you need some variant of a recipe, you have some basis for understanding what to do. But users in the past have really seemed to want to start with recipes, so that they can be productive as soon as possible (and I think some people have said that the top-down ordering just makes more sense to them, so it may just be a matter of learning style). 
And I think the user manual is somewhat of a response to that request, since the command manpages are very bottom-up (but are also quite confusing, just because of their size, and because concept information is scattered throughout). So I am advocating for more bottom-up documentation (which I think you are), but I don't necessarily think it should _replace_ the top-down documentation (I'm not sure whether that is your position). > The reason most people get by is that git's operation can be > compatible with a number of other theories people might have already > picked up from using computers. The trouble starts when their existing > theories don't mesh well with the underlying git theory, leading the > user to develop the equivalent of epicycles in order to explain to > himself whats going on. Epicycles? I thought commit orbits were defined by the ether through which they flowed. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 15:04 ` Jeff King @ 2009-04-24 15:18 ` Michael Witten 2009-04-24 17:38 ` J. Bruce Fields 0 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 15:18 UTC (permalink / raw) To: Jeff King; +Cc: J. Bruce Fields, David Abrahams, git On Fri, Apr 24, 2009 at 10:04, Jeff King <peff@peff.net> wrote: > On Fri, Apr 24, 2009 at 09:30:20AM -0500, Michael Witten wrote: > >> On Fri, Apr 24, 2009 at 09:11, Jeff King <peff@peff.net> wrote: >> > I think I wasn't clear in my original message. I didn't mean teaching >> > low-level stuff like plumbing or file layouts. By "bottom-up" I really >> > meant teaching concepts (like objects, their types, and references), >> > from which user operations and workflows can be explained (or often >> > deduced by the user). Whereas a top-down approach would _start_ with >> > workflows and say "To accomplish X, do Y". >> >> I knew you would make exactly this rebuttle ;-D >> >> However, notice that you can't reasonably be expected to understand >> "accomplish X" without having concepts like objects and references. > > Heh. I don't think you also predicted the paragraph that I ended up > deleting, which made it more clear that I was not trying to rebut, but > rather agree. Indeed. I saw that last sentence of yours, but I consciously ignored it, because I like to argue ;-) > Like you, I think that not teaching concepts first leads to confusion > later. Version control (or at least git) is just complex enough that > you are much better off understanding what is happening than simply > following a recipe. So when your recipe doesn't go as planned, or you > don't know which recipe to use, or you need some variant of a recipe, > you have some basis for understanding what to do. That, my friend, is the most important lesson of learning. 
> But users in the past have really seemed to want to start with recipes, > so that they can be productive as soon as possible (and I think some > people have said that the top-down ordering just makes more sense to > them, so it may just be a matter of learning style). And I think the > user manual is somewhat of a response to that request, since the > command manpages are very bottom-up (but are also quite confusing, just > because of their size, and because concept information is scattered > throughout). > > So I am advocating for more bottom-up documentation (which I think you > are), but I don't necessarily think it should _replace_ the top-down > documentation (which I'm not sure is your position or not). I think that we've already got that tutorial-esque style covered (I haven't read it in a while): http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html However, the User Manual should make a Mathematician happy. >> The reason most people get by is that git's operation can be >> compatible with a number of other theories people might have already >> picked up from using computers. The trouble starts when their existing >> theories don't mesh well with the underlying git theory, leading the >> user to develop the equivalent of epicycles in order to explain to >> himself whats going on. > > Epicycles? I thought commit orbits were defined by the ether through > they flowed. Actually, those commit orbits are defined by the giant glass sphere to which they are attached. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 15:18 ` Michael Witten @ 2009-04-24 17:38 ` J. Bruce Fields 2009-04-24 18:27 ` Jeff King ` (2 more replies) 0 siblings, 3 replies; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 17:38 UTC (permalink / raw) To: Michael Witten; +Cc: Jeff King, David Abrahams, git On Fri, Apr 24, 2009 at 10:18:15AM -0500, Michael Witten wrote: > I think that we've already got that tutorial-esque style covered (I > haven't read it in a while): > > http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html > > However, the User Manual should make a Mathematician happy. I'm all for making mathematicians happy. But, again, help?: - Specific examples? - Patches? Please, patches? - Suggested text? - Suggested outline? There's no shortage of high-level ideas. What there's always a need for more of is people willing to submit patches, respond to review, etc. --b. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 17:38 ` J. Bruce Fields @ 2009-04-24 18:27 ` Jeff King 2009-04-24 18:35 ` J. Bruce Fields [not found] ` <34BD51FF-0908-48A8-BBBC-E27B0EFB32E5@boostpro.com> 2009-04-24 19:12 ` Michael Witten 2 siblings, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 18:27 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Michael Witten, David Abrahams, git On Fri, Apr 24, 2009 at 01:38:52PM -0400, J. Bruce Fields wrote: > On Fri, Apr 24, 2009 at 10:18:15AM -0500, Michael Witten wrote: > > I think that we've already got that tutorial-esque style covered (I > > haven't read it in a while): > > > > http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html > > > > However, the User Manual should make a Mathematician happy. > > I'm all for making mathematicians happy. But, again, help?: > > - Specific examples? > - Patches? Please, patches? > - Suggested text? > - Suggested outline? > > There's no shortage of high-level ideas. What there's always a need for > more of is people willing to submit patches, respond to review, etc. I usually hate to "me too", but I really want to second this notion. We have been getting minor documentation fixups trickling in, and I think those really help, and maybe they eventually would make the documentation perfect. But I have the feeling we would benefit from somebody taking ownership and considering the big picture of how the documentation fits together, and then really pushing it forward with something concrete. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 18:27 ` Jeff King @ 2009-04-24 18:35 ` J. Bruce Fields 0 siblings, 0 replies; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 18:35 UTC (permalink / raw) To: Jeff King; +Cc: Michael Witten, David Abrahams, git On Fri, Apr 24, 2009 at 02:27:52PM -0400, Jeff King wrote: > On Fri, Apr 24, 2009 at 01:38:52PM -0400, J. Bruce Fields wrote: > > > On Fri, Apr 24, 2009 at 10:18:15AM -0500, Michael Witten wrote: > > > I think that we've already got that tutorial-esque style covered (I > > > haven't read it in a while): > > > > > > http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html > > > > > > However, the User Manual should make a Mathematician happy. > > > > I'm all for making mathematicians happy. But, again, help?: > > > > - Specific examples? > > - Patches? Please, patches? > > - Suggested text? > > - Suggested outline? > > > > There's no shortage of high-level ideas. What there's always a need for > > more of is people willing to submit patches, respond to review, etc. > > I usually hate to "me too", but I really want to second this notion. We > have been getting minor documentation fixups trickling in, and I think > those really help, and maybe they eventually would make the > documentation perfect. But I have the feeling we would benefit from > somebody taking ownership and considering the big picture of how the > documentation fits together, and then really pushing it forward with > something concrete. Yup, and dealing seriously with objections, getting consensus for the resulting solutions, etc--in other words, being a maintainer. I thought I'd be able to do that at some point, but just haven't consistently had the time. That said, several smaller suggestions have been made which could be handled now: - I don't think I've seen objections to the idea of a git-revision-specifying manpage, whatever you want to call it--so probably that just needs someone to write the patch.
- There've been complaints about terms being used before they're defined sufficiently well. I can believe it, but: specific examples would help! --b. ^ permalink raw reply [flat|nested] 90+ messages in thread
[parent not found: <34BD51FF-0908-48A8-BBBC-E27B0EFB32E5@boostpro.com>]
* Re: [doc] User Manual Suggestion [not found] ` <34BD51FF-0908-48A8-BBBC-E27B0EFB32E5@boostpro.com> @ 2009-04-24 18:52 ` J. Bruce Fields 2009-04-25 10:35 ` Felipe Contreras 0 siblings, 1 reply; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 18:52 UTC (permalink / raw) To: David Abrahams; +Cc: Michael Witten, Jeff King, git On Fri, Apr 24, 2009 at 02:32:36PM -0400, David Abrahams wrote: > > On Apr 24, 2009, at 1:38 PM, J. Bruce Fields wrote: > >> On Fri, Apr 24, 2009 at 10:18:15AM -0500, Michael Witten wrote: >>> I think that we've already got that tutorial-esque style covered (I >>> haven't read it in a while): >>> >>> http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html >>> >>> However, the User Manual should make a Mathematician happy. >> >> I'm all for making mathematicians happy. But, again, help?: >> >> - Specific examples? >> - Patches? Please, patches? >> - Suggested text? >> - Suggested outline? >> >> There's no shortage of high-level ideas. What there's always a need >> for >> more of is people willing to submit patches, respond to review, etc. > > > I'll probably try to write something myself once I figure this stuff > out. That would be great, thanks. Several people have gone off and posted their own tutorials someplace, and that's fine, but it would be especially helpful if you could contribute to the actual Documentation/ directory. That may mean arguing with people and making compromises. But it also means the results will be distributed with git, will be integrated with other git documentation, and will get first-class technical review. I'd also encourage incrementally improving existing documentation where possible instead of starting over from scratch. But having broken that rule myself a couple times I'm hardly in a position to insist. If you must start over, at least think about how to replace or fit it in with existing documentation. --b. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 18:52 ` J. Bruce Fields @ 2009-04-25 10:35 ` Felipe Contreras 0 siblings, 0 replies; 90+ messages in thread From: Felipe Contreras @ 2009-04-25 10:35 UTC (permalink / raw) To: J. Bruce Fields; +Cc: David Abrahams, Michael Witten, Jeff King, git On Fri, Apr 24, 2009 at 9:52 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > That would be great, thanks. Several people have gone off and posted > their own tutorials someplace, and that's fine, but it would be > especially helpful if you could contribute to the actual Documentation/ > directory. That may mean arguing with people and making compromises. > But it also means the results will be distributed with git, will be > integrated with other git documentation, and will get first-class > technical review. > > I'd also encourage incrementally improving existing documentation where > possible instead of starting over from scratch. But having broken that > rule myself a couple times I'm hardly in a position to insist. If you > must start over, at least think about how to replace or fit it in with > existing documentation. People will continue to write git documentation from scratch because there is a huge gap from the top-down approach to a point where you actually "get git", and people are trying to find short-cuts so that other people can really get it too. I spent years using git simply repeating the templates I had seen in multiple places until I stumbled upon "git from the bottom up" and then I finally understood the beauty and simplicity of git's design. From that point I understood why many commands didn't do what I expected. Note that "bottom" doesn't mean plumbing: "plumbing" usually refers to the git.git tools, but you can work with git's low-level objects through your own implementation, as people like Scott Chacon have indeed done (git-ruby). "bottom" then means git's basic building blocks: blobs, trees, commits, refs.
Ideally the UI should expose the basic concepts of git, but instead it is hiding them, so no wonder people *need* special documentation to 'understand git conceptually', or learn 'git from the bottom up', etc. -- Felipe Contreras ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 17:38 ` J. Bruce Fields 2009-04-24 18:27 ` Jeff King [not found] ` <34BD51FF-0908-48A8-BBBC-E27B0EFB32E5@boostpro.com> @ 2009-04-24 19:12 ` Michael Witten 2 siblings, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 19:12 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Jeff King, David Abrahams, git On Fri, Apr 24, 2009 at 12:38, J. Bruce Fields <bfields@fieldses.org> wrote: > I'm all for making mathematicians happy. But, again, help?: I intend to help, but I have a terrible tendency to shave the GNU; right now, I'm waist deep in shavings. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-23 20:16 ` Jeff King 2009-04-23 20:45 ` Michael Witten @ 2009-04-23 21:26 ` David Abrahams 2009-04-23 22:51 ` Johan Herland 1 sibling, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-23 21:26 UTC (permalink / raw) To: Jeff King; +Cc: Michael Witten, J. Bruce Fields, git On Apr 23, 2009, at 4:16 PM, Jeff King wrote: > On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: > >> Everyone talks about "before one has the conceptual foundation >> necessary to understand". Well, here's an idea: The git documentation >> should start with the concepts! >> >> Why don't the docs start out defining blobs and trees and the object >> database and references into that database? The reason everything is >> so confusing is that the understanding is brushed under the tutorial >> rug. People need to learn how to think before they can effectively >> learn to start doing. > > I agree with you, but not everyone does (and you can find prior > debates > in the list archives). The user-manual is pretty "top down". And that's a problem because so many things are badly named. It also leaves out lots of top > There are > some "bottom-up" resources available, but I haven't seen one pointed > to > as "definitive". I've been pointed at: 1. http://eagain.net/articles/git-for-computer-scientists 2. http://www.newartisans.com/2008/04/git-from-the-bottom-up.html which, IMO, should be read in that order. I've just sent John Wiegley a huge pile of editorial commentary on #2, which I think could improve things. But that said, "laying conceptual foundation" doesn't imply bottom- up! In fact, I don't think the first one is particularly bottom-up -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-23 21:26 ` David Abrahams @ 2009-04-23 22:51 ` Johan Herland 2009-04-24 0:30 ` Michael Witten 0 siblings, 1 reply; 90+ messages in thread From: Johan Herland @ 2009-04-23 22:51 UTC (permalink / raw) To: git; +Cc: David Abrahams, Jeff King, Michael Witten, J. Bruce Fields On Thursday 23 April 2009, David Abrahams wrote: > On Apr 23, 2009, at 4:16 PM, Jeff King wrote: > > There are some "bottom-up" resources available, but I haven't seen one > > pointed to as "definitive". > I've been pointed at: > > 1. http://eagain.net/articles/git-for-computer-scientists > 2. http://www.newartisans.com/2008/04/git-from-the-bottom-up.html There's also http://www.eecs.harvard.edu/~cduan/technical/git/ which I think is a great bottom-up introduction: - not too heavy on the concepts - shows how the concepts relate to common git commands - short enough to be covered in just 1-2 sessions. In fact, I'm loosely planning a presentation on Git (for $dayjob), and I'm probably going to base it on this introduction. Have fun! :) ...Johan -- Johan Herland, <johan@herland.net> www.herland.net ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-23 22:51 ` Johan Herland @ 2009-04-24 0:30 ` Michael Witten 2009-04-24 20:30 ` Johan Herland 0 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 0:30 UTC (permalink / raw) To: Johan Herland; +Cc: git, David Abrahams, Jeff King, J. Bruce Fields On Thu, Apr 23, 2009 at 17:51, Johan Herland <johan@herland.net> wrote: > On Thursday 23 April 2009, David Abrahams wrote: >> On Apr 23, 2009, at 4:16 PM, Jeff King wrote: >> > There are some "bottom-up" resources available, but I haven't seen one >> > pointed to as "definitive". >> I've been pointed at: >> >> 1. http://eagain.net/articles/git-for-computer-scientists >> 2. http://www.newartisans.com/2008/04/git-from-the-bottom-up.html > > There's also http://www.eecs.harvard.edu/~cduan/technical/git/ which I think > is a great bottom-up introduction: > - not too heavy on the concepts I really don't understand this mentality. Concepts are the only things that are important. From concepts falls all else. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 0:30 ` Michael Witten @ 2009-04-24 20:30 ` Johan Herland 2009-04-24 21:34 ` Daniel Barkalow 0 siblings, 1 reply; 90+ messages in thread From: Johan Herland @ 2009-04-24 20:30 UTC (permalink / raw) To: Michael Witten; +Cc: git, David Abrahams, Jeff King, J. Bruce Fields On Friday 24 April 2009, Michael Witten wrote: > On Thu, Apr 23, 2009 at 17:51, Johan Herland <johan@herland.net> wrote: > > There's also http://www.eecs.harvard.edu/~cduan/technical/git/ which I > > think is a great bottom-up introduction: > > - not too heavy on the concepts > > I really don't understand this mentality. Concepts are the only things > that are important. From concepts falls all else. Sorry for not being clear: Concepts are indeed (and should be) important. What I mean is that the concepts introduced are short and simple enough for novice users to understand (without much VCS experience, if any at all). If we start off _too_ detailed, we risk losing the audience, and no one is better off. Like Jeff King said elsewhere in this thread: We want to start a little higher from the bottom. The above introduction does not focus on blobs or trees, but manages to introduce Git in a useful manner by starting off with only two concepts: commits and refs. With only these two concepts, and showing how high-level commands (remember: no plumbing) work with these concepts, I believe it is possible to teach anyone to use Git well. Of course, as users progress towards becoming power-users, more concepts are needed, but I don't think these are needed from the start. As Einstein might have said: As simple as possible, but no simpler. Have fun! ...Johan -- Johan Herland, <johan@herland.net> www.herland.net ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 20:30 ` Johan Herland @ 2009-04-24 21:34 ` Daniel Barkalow 2009-04-24 21:38 ` Jeff King 0 siblings, 1 reply; 90+ messages in thread From: Daniel Barkalow @ 2009-04-24 21:34 UTC (permalink / raw) To: Johan Herland Cc: Michael Witten, git, David Abrahams, Jeff King, J. Bruce Fields On Fri, 24 Apr 2009, Johan Herland wrote: > On Friday 24 April 2009, Michael Witten wrote: > > On Thu, Apr 23, 2009 at 17:51, Johan Herland <johan@herland.net> wrote: > > > There's also http://www.eecs.harvard.edu/~cduan/technical/git/ which I > > > think is a great bottom-up introduction: > > > - not too heavy on the concepts > > > > I really don't understand this mentality. Concepts are the only things > > that are important. From concepts falls all else. > > Sorry for not being clear: Concepts are indeed (and should be) important. > What I mean is that the concepts introduced are short and simple enough for > novice users to understand (without much VCS experience, if any at all). If > we start off _too_ detailed, we risk loosing the audience, and no one is > better off. > > Like Jeff King said elsewhere in this thread: We want to start a little > higher from the bottom. The above introduction does not focus on blobs or > trees, but manages to introduce Git in a useful manner by starting off with > only two concepts: commits and refs. I'd say that blobs and trees are an implementation detail of "the full content of a version of the project", not something conceptually important. Likewise, the date representation used in commits isn't important. It might be worth saying that git purposefully discards any information in your filesystem that is just incidental and not project content, like whether other users on the system where the working directory is can access your files; but a full enumeration of what the "content" and "incidental" categories contain can go in an appendix or something. 
(FWIW, git originally didn't use tree objects for subdirectories or mask out the g+w bit from tree entries. These weren't conceptual changes, but implementation details.) -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 21:34 ` Daniel Barkalow @ 2009-04-24 21:38 ` Jeff King 2009-04-24 22:18 ` Michael Witten ` (2 more replies) 0 siblings, 3 replies; 90+ messages in thread From: Jeff King @ 2009-04-24 21:38 UTC (permalink / raw) To: Daniel Barkalow Cc: Johan Herland, Michael Witten, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 05:34:00PM -0400, Daniel Barkalow wrote: > I'd say that blobs and trees are an implementation detail of "the full > content of a version of the project", not something conceptually > important. Likewise, the date representation used in commits isn't I disagree. I think it's important to note that trees and blobs have a name, and you can refer to them. Once you know that, the fact that you can do: git show master git show master:Documentation git show master:Makefile just makes sense. You are always just specifying an object, but the type is different for each (and show "does the right thing" based on object type). No, that isn't critical for understanding how _commit_ operations work, but I think that is exactly the sort of conceptual knowledge that lets people use git more fully. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
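The point above, that "master", "master:Documentation" and "master:Makefile" each name an object of a different type, can be sketched with a toy model. This is only an illustration under stated assumptions, not how git is implemented: the classes Blob, Tree and Commit, the helper resolve(), and the file contents are all hypothetical names invented here.

```python
from dataclasses import dataclass, field


@dataclass
class Blob:
    data: bytes  # file content


@dataclass
class Tree:
    entries: dict = field(default_factory=dict)  # entry name -> Blob or Tree


@dataclass
class Commit:
    tree: Tree  # top-level directory snapshot
    message: str


def resolve(refs, spec):
    """Resolve 'rev' or 'rev:path' to an object in this toy model."""
    rev, sep, path = spec.partition(":")
    obj = refs[rev]                 # a branch name maps to a commit
    if not sep:
        return obj                  # plain 'master' names the commit itself
    obj = obj.tree                  # 'master:' starts at the commit's tree
    for part in filter(None, path.split("/")):
        obj = obj.entries[part]     # walk one path component at a time
    return obj


# Hypothetical repository content, made up for the example.
makefile = Blob(b"all:\n\techo hi\n")
docs = Tree({"user-manual.txt": Blob(b"manual text\n")})
root = Tree({"Makefile": makefile, "Documentation": docs})
refs = {"master": Commit(root, "initial commit")}

for spec in ("master", "master:Documentation", "master:Makefile"):
    print(spec, "->", type(resolve(refs, spec)).__name__.lower())
```

In this sketch the three specs resolve to a commit, a tree and a blob respectively, which is the sense in which a command can "do the right thing" by looking at the type of whatever object was named.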
* Re: [doc] User Manual Suggestion 2009-04-24 21:38 ` Jeff King @ 2009-04-24 22:18 ` Michael Witten 2009-04-24 22:25 ` Michael Witten 2009-04-24 23:16 ` Björn Steinbrink 2009-04-24 23:21 ` Daniel Barkalow 2009-04-25 0:19 ` David Abrahams 2 siblings, 2 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 22:18 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 16:38, Jeff King <peff@peff.net> wrote: > On Fri, Apr 24, 2009 at 05:34:00PM -0400, Daniel Barkalow wrote: > >> I'd say that blobs and trees are an implementation detail of "the full >> content of a version of the project", not something conceptually >> important. Likewise, the date representation used in commits isn't > ... > No, that isn't critical for understanding how _commit_ operations work, > but I think that is exactly the sort of conceptual knowledge that lets > people use git more fully. I think the key conclusion here is that the main concepts are *objects* and references to those objects. One type of object is not necessarily more low-level or high-level than another type of object; each type of object is the most important type of object for a particular task in or view of the git world. > I disagree. I think it's important to note that trees and blobs have a > name, and you can refer to them. Once you know that, the fact that you > can do: > > git show master > git show master:Documentation > git show master:Makefile > > just makes sense. You are always just specifying an object, but the type > is different for each (and show "does the right thing" based on object > type). In fact, I think it's important to note that the notation: git show master:Makefile actually involves a translation from a Unix filesystem address to a git object address that is then used to find the relevant data. In fact, I think masking this kind of thing with a catch-all word 'reference' is a bad idea.
Rather than being hidden, it should be exposed: I think it would be beneficial to use the word 'address' rather than 'reference' when talking about the SHA-1 names. Then HEAD could be called a pointer variable, etc. So, a pointer variable's value is an object address that is the location of an object in git 'memory'. I think using this approach would make things significantly more transparent. ^ permalink raw reply [flat|nested] 90+ messages in thread
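The 'memory', 'address' and 'pointer variable' analogy proposed above can be sketched as two dictionaries. This is a toy model under stated assumptions: the short ids "1111111" and "2222222" and the helper resolve_ref() are made up for the illustration. The one detail borrowed from git itself is the "ref: refs/heads/master" form, which is how a HEAD that points at a branch is written.

```python
# Toy "git memory": object address -> object content (illustrative only).
objects = {
    "1111111": "commit: first version",
    "2222222": "commit: second version",
}

# Toy "pointer variables": a ref holds either an object address or,
# for a symbolic ref, the name of another ref.
refs = {
    "refs/heads/master": "1111111",
    "HEAD": "ref: refs/heads/master",
}


def resolve_ref(name):
    """Follow symbolic refs until an object address is reached."""
    value = refs[name]
    while value.startswith("ref: "):
        value = refs[value[len("ref: "):]]
    return value


print(objects[resolve_ref("HEAD")])     # HEAD -> master -> an object

refs["refs/heads/master"] = "2222222"   # the branch pointer is moved
print(objects[resolve_ref("HEAD")])     # HEAD follows the branch

refs["HEAD"] = "1111111"                # detached HEAD: holds an address directly
print(objects[resolve_ref("HEAD")])     # the branch itself is untouched
```

Seen this way, "modify the current branch to point at v2.6.17" changes the value stored in the branch pointer, whereas a checkout to a detached HEAD only changes what HEAD holds.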
* Re: [doc] User Manual Suggestion 2009-04-24 22:18 ` Michael Witten @ 2009-04-24 22:25 ` Michael Witten 2009-04-24 23:11 ` Daniel Barkalow 2009-04-24 23:16 ` Björn Steinbrink 1 sibling, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 22:25 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 17:18, Michael Witten <mfwitten@gmail.com> wrote: > In fact, I think masking this kind of thing with a catch-all word > 'reference' is a bad idea. Rather than being hidden, it should be > exposed: I think it would be beneficial to use the word 'address' > rather than 'reference' when talking about the SHA-1 names. Then HEAD > could be called a pointer variable, etc. > > So, a pointer variable's value is an object address that is the > location of an object in git 'memory'. I think using this approach > would make things significantly more transparent. In fact, it's not particularly important that SHA-1 is used to compute the address into git memory. The only thing that's important is that the address is determined by content alone (I'm not even sure that specifying that the address is a cryptographically sound hash of the content is important; shouldn't that follow from the declaration that it must be uniquely based on content alone?); the fact that it's a SHA-1 is purely an implementation detail, and so it shouldn't appear prominently in the documentation. So, what do you say? Let's start a reformation of the git terminology to use analogies that have been around since the dawn of computing: 'memory', 'address', and 'pointer'. ^ permalink raw reply [flat|nested] 90+ messages in thread
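The claim above, that the address is determined by content alone, can be checked with a few lines of Python. This is a minimal sketch assuming the usual blob encoding (the ASCII header "blob <size>" followed by a NUL byte, then the content, hashed with SHA-1); the function name blob_id is hypothetical and chosen only for the example.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the 40-hex-digit id git would assign to a blob with this content."""
    header = b"blob %d\0" % len(content)   # object type, size, NUL separator
    return hashlib.sha1(header + content).hexdigest()


a = blob_id(b"hello world\n")
b = blob_id(b"hello world\n")   # same bytes, e.g. stored under another file name
c = blob_id(b"hello world!\n")  # a one-character change

print(a == b)   # identical content gives the identical address
print(a == c)   # different content gives a different address
print(len(a))   # 160 bits written as hexadecimal digits
```

Nothing but the bytes goes into the computation: no file name, no timestamp, no permissions. That is why the same file stored under two paths, or in two repositories, ends up at the same address, and swapping the hash function would change the ids without changing that property.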
* Re: [doc] User Manual Suggestion 2009-04-24 22:25 ` Michael Witten @ 2009-04-24 23:11 ` Daniel Barkalow 2009-04-24 23:14 ` Jeff King ` (2 more replies) 0 siblings, 3 replies; 90+ messages in thread From: Daniel Barkalow @ 2009-04-24 23:11 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, 24 Apr 2009, Michael Witten wrote: > On Fri, Apr 24, 2009 at 17:18, Michael Witten <mfwitten@gmail.com> wrote: > > In fact, I think masking this kind of thing with a catch-all word > > 'reference' is a bad idea. Rather than being hidden, it should be > > exposed: I think it would be beneficial to use the word 'address' > > rather than 'reference' when talking about the SHA-1 names. Then HEAD > > could be called a pointer variable, etc. > > > > So, a pointer variable's value is an object address that is the > > location of an object in git 'memory'. I think using this approach > > would make things significantly more transparent. > > In fact, it's not particularly important that SHA-1 is used to compute > the address into git memory. The only thing that's important is that > the address is determined by content alone (I'm not even sure that > specifying that the address is a cryptographically sound hash of the > content is important; shouldn't that follow from the declaration that > it must be uniquely based on content alone?); the fact that's a SHA-1 > is purely an implementation detail, and so it shouldn't appear > prominently in the documentation. > > So, what do you say? > > Let's start a reformation of the git terminology to use analogies that > have been around since the dawn of computing: 'memory', 'address', and > 'pointer'. I actually think calling them "sha1s" is better, simply because this bit of jargon doesn't mean anything else (git deals with email, so "address" is overloaded). 
And the term is already in use for this particular case, and it doesn't mean anything else at all (since, of course, the crypto thing is "SHA-1", not "sha1"), and it's short (which is important for making it easy to look at usage help). -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:11 ` Daniel Barkalow @ 2009-04-24 23:14 ` Jeff King 2009-04-24 23:18 ` Michael Witten ` (2 more replies) 2009-04-24 23:26 ` Michael Witten 2009-04-25 0:41 ` David Abrahams 2 siblings, 3 replies; 90+ messages in thread From: Jeff King @ 2009-04-24 23:14 UTC (permalink / raw) To: Daniel Barkalow Cc: Michael Witten, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 07:11:40PM -0400, Daniel Barkalow wrote: > > Let's start a reformation of the git terminology to use analogies that > > have been around since the dawn of computing: 'memory', 'address', and > > 'pointer'. > > I actually think calling them "sha1s" is better, simply because this bit > of jargon doesn't mean anything else (git deals with email, so "address" > is overloaded). And the term is already in use for this particular case, > and it doesn't mean anything else at all (since, of course, the crypto > thing is "SHA-1", not "sha1"), and it's short (which is important for > making it easy to look at usage help). Junio suggested "object name" in another thread, which I think is nicely descriptive. FWIW, I think the pointer nomenclature has terrible connotations. I think everyone who works on git groks pointers just fine, but aren't they generally reviled among the programming populace as the most complex and error-prone part of learning to program? Do we really need to increase git's reputation as complex and error-prone? ;) -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:14 ` Jeff King @ 2009-04-24 23:18 ` Michael Witten 2009-04-24 23:31 ` Michael Witten 2009-04-25 10:18 ` Felipe Contreras 2 siblings, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 23:18 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 18:14, Jeff King <peff@peff.net> wrote: > but aren't > they generally reviled among the programming populace as the most > complex and error-prone part of learning to program? And now you know why people struggle with git; as I said in a previous email: http://marc.info/?l=git&m=124022418313288&w=2 I think that the human brain struggles with indirection. Consider that so many programmers have a hard time understanding pointers; no wonder so many people find git's underlying concepts boggling. Of course, the difference here is that we're not asking people to do memory management; we have garbage collection. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:14 ` Jeff King 2009-04-24 23:18 ` Michael Witten @ 2009-04-24 23:31 ` Michael Witten 2009-04-24 23:35 ` Jeff King 2009-04-25 10:18 ` Felipe Contreras 2 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 23:31 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 18:14, Jeff King <peff@peff.net> wrote: > Junio suggested "object name" in another thread, which I think is nicely > descriptive. The reason I don't like "object name" is that "name" has connotations that don't go well with the idea of referencing. Isn't "address" (or "location") better in this sense? ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:31 ` Michael Witten @ 2009-04-24 23:35 ` Jeff King 2009-04-25 0:19 ` Michael Witten 0 siblings, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 23:35 UTC (permalink / raw) To: Michael Witten Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 06:31:26PM -0500, Michael Witten wrote: > On Fri, Apr 24, 2009 at 18:14, Jeff King <peff@peff.net> wrote: > > Junio suggested "object name" in another thread, which I think is nicely > > descriptive. > > The reason I don't like "object name" is that "name" has connotations > that don't go well with the idea of referencing. Isn't "address" (or > "location") better in this sense? I'm not sure I agree, but if you are concerned with "name", then I think something like "object id" or "object identifier" would probably be better. "address" and "location" imply to me that they are part of a contiguous set. And while technically they may be considered addresses of a sparse 2^160 array, I'm not sure that explanation is really helping new users understand what is going on. What the user really cares about is that it is persistent and unambiguous. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:35 ` Jeff King @ 2009-04-25 0:19 ` Michael Witten 0 siblings, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-25 0:19 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 18:35, Jeff King <peff@peff.net> wrote: > On Fri, Apr 24, 2009 at 06:31:26PM -0500, Michael Witten wrote: > >> On Fri, Apr 24, 2009 at 18:14, Jeff King <peff@peff.net> wrote: >> > Junio suggested "object name" in another thread, which I think is nicely >> > descriptive. >> >> The reason I don't like "object name" is that "name" has connotations >> that don't go well with the idea of referencing. Isn't "address" (or >> "location") better in this sense? > > I'm not sure I agree, but if you are concerned with "name", then I think > something like "object id" or "object identifier" would probably be > better. "address" and "location" imply to me that they are part of a > contiguous set. And while technically they may be considered addresses > of a sparse 2^160 array, I'm not sure that explanation is really helping > new users understand what is going on. You make an interesting point about implied contiguousness, but I don't think any git operation is in danger of evoking that thought. I mainly like the idea of "address" and "location", because they go extremely well with "pointer", "handle" and the idea of a "git store (memory)". Most importantly, this is an analogy that has been around a long time. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:14 ` Jeff King 2009-04-24 23:18 ` Michael Witten 2009-04-24 23:31 ` Michael Witten @ 2009-04-25 10:18 ` Felipe Contreras 2 siblings, 0 replies; 90+ messages in thread From: Felipe Contreras @ 2009-04-25 10:18 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Michael Witten, Johan Herland, git, David Abrahams, J. Bruce Fields On Sat, Apr 25, 2009 at 2:14 AM, Jeff King <peff@peff.net> wrote: > On Fri, Apr 24, 2009 at 07:11:40PM -0400, Daniel Barkalow wrote: > >> > Let's start a reformation of the git terminology to use analogies that >> > have been around since the dawn of computing: 'memory', 'address', and >> > 'pointer'. >> >> I actually think calling them "sha1s" is better, simply because this bit >> of jargon doesn't mean anything else (git deals with email, so "address" >> is overloaded). And the term is already in use for this particular case, >> and it doesn't mean anything else at all (since, of course, the crypto >> thing is "SHA-1", not "sha1"), and it's short (which is important for >> making it easy to look at usage help). > > Junio suggested "object name" in another thread, which I think is nicely > descriptive. It's not a name, it's an identification, so how about "id"? You have tree ids, commit ids, blob ids, and so on. -- Felipe Contreras ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:11 ` Daniel Barkalow 2009-04-24 23:14 ` Jeff King @ 2009-04-24 23:26 ` Michael Witten 2009-04-25 18:55 ` Daniel Barkalow 2009-04-25 0:41 ` David Abrahams 2 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 23:26 UTC (permalink / raw) To: Daniel Barkalow Cc: Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 18:11, Daniel Barkalow <barkalow@iabervon.org> wrote: > On Fri, 24 Apr 2009, Michael Witten wrote: > >> On Fri, Apr 24, 2009 at 17:18, Michael Witten <mfwitten@gmail.com> wrote: >> > In fact, I think masking this kind of thing with a catch-all word >> > 'reference' is a bad idea. Rather than being hidden, it should be >> > exposed: I think it would be beneficial to use the word 'address' >> > rather than 'reference' when talking about the SHA-1 names. Then HEAD >> > could be called a pointer variable, etc. >> > >> > So, a pointer variable's value is an object address that is the >> > location of an object in git 'memory'. I think using this approach >> > would make things significantly more transparent. >> >> In fact, it's not particularly important that SHA-1 is used to compute >> the address into git memory. The only thing that's important is that >> the address is determined by content alone (I'm not even sure that >> specifying that the address is a cryptographically sound hash of the >> content is important; shouldn't that follow from the declaration that >> it must be uniquely based on content alone?); the fact that's a SHA-1 >> is purely an implementation detail, and so it shouldn't appear >> prominently in the documentation. >> >> So, what do you say? >> >> Let's start a reformation of the git terminology to use analogies that >> have been around since the dawn of computing: 'memory', 'address', and >> 'pointer'. 
> > I actually think calling them "sha1s" is better, simply because this bit > of jargon doesn't mean anything else (git deals with email, so "address" > is overloaded). I don't know if I buy that reason; the human brain is pretty good with context. I would at least like 'location' better. > And the term is already in use for this particular case, > and it doesn't mean anything else at all (since, of course, the crypto > thing is "SHA-1", not "sha1"), and it's short (which is important for > making it easy to look at usage help). What happens when SHA-1 is shown to be broken or there is a better alternative? Then we'll see "sha1 for historical reasons"... bleh! ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:26 ` Michael Witten @ 2009-04-25 18:55 ` Daniel Barkalow 2009-04-25 19:16 ` Michael Witten 0 siblings, 1 reply; 90+ messages in thread From: Daniel Barkalow @ 2009-04-25 18:55 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, 24 Apr 2009, Michael Witten wrote: > > And the term is already in use for this particular case, > > and it doesn't mean anything else at all (since, of course, the crypto > > thing is "SHA-1", not "sha1"), and it's short (which is important for > > making it easy to look at usage help). > > What happens when SHA-1 is shown to be broken or there is a better > alternative? Then we'll see "sha1 for historical reasons"... bleh! Why do you think SHA-1 has anything to do with it? Git's sha1s could just as easily be 160 bits of a SHA-256 hash and there wouldn't be any user-visible difference. The term doesn't imply any particular significant connection to a particular algorithm. It could be like "pencil lead", which has never been made of lead, but is called that for no particularly important reason. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 90+ messages in thread
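Daniel's hypothetical can be made concrete with a minimal Python sketch. The function name truncated_sha256_id is invented for this illustration, and this is not how git computes object names; it only shows that 160 bits taken from a SHA-256 digest would look, to a user, just like the familiar 40-hex-digit "sha1".

```python
import hashlib


def truncated_sha256_id(data: bytes) -> str:
    """Hypothetical identifier: the first 160 bits (20 bytes) of SHA-256.

    This is an illustration only, not git's actual scheme.
    """
    digest = hashlib.sha256(data).digest()  # 32 bytes = 256 bits
    return digest[:20].hex()                # keep 20 bytes = 160 bits


if __name__ == "__main__":
    ident = truncated_sha256_id(b"some file contents\n")
    # 20 bytes render as 40 hexadecimal characters, the same width as a SHA-1.
    print(ident, len(ident))
```

Whatever the identifier is called, its width and appearance alone do not reveal which algorithm produced it.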
* Re: [doc] User Manual Suggestion 2009-04-25 18:55 ` Daniel Barkalow @ 2009-04-25 19:16 ` Michael Witten 2009-04-25 19:24 ` Felipe Contreras 0 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-25 19:16 UTC (permalink / raw) To: Daniel Barkalow Cc: Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Sat, Apr 25, 2009 at 13:55, Daniel Barkalow <barkalow@iabervon.org> wrote: > On Fri, 24 Apr 2009, Michael Witten wrote: > >> > And the term is already in use for this particular case, >> > and it doesn't mean anything else at all (since, of course, the crypto >> > thing is "SHA-1", not "sha1"), and it's short (which is important for >> > making it easy to look at usage help). >> >> What happens when SHA-1 is shown to be broken or there is a better >> alternative? Then we'll see "sha1 for historical reasons"... bleh! > > Why do you think SHA-1 has anything to do with it? Well, it's named sha1. > Git's sha1s could just > as easily be 160 bits of a SHA-256 hash and there wouldn't be any > user-visible difference. The term doesn't imply any particular significant > connection to a particular algorithm. Then give it a generic name like 'hash'. > It could be like "pencil lead", which has never been made of lead, > but is called that for no particularly important reason. Hence the perennial: "Hey! Did you know that pencil lead isn't lead at all?" to which someone might respond: "Why do you think lead has anything to do with it?" Look familiar? ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 19:16 ` Michael Witten @ 2009-04-25 19:24 ` Felipe Contreras 2009-04-25 19:36 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Felipe Contreras @ 2009-04-25 19:24 UTC (permalink / raw) To: Michael Witten Cc: Daniel Barkalow, Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Sat, Apr 25, 2009 at 10:16 PM, Michael Witten <mfwitten@gmail.com> wrote: > On Sat, Apr 25, 2009 at 13:55, Daniel Barkalow <barkalow@iabervon.org> wrote: >> On Fri, 24 Apr 2009, Michael Witten wrote: >> >>> > And the term is already in use for this particular case, >>> > and it doesn't mean anything else at all (since, of course, the crypto >>> > thing is "SHA-1", not "sha1"), and it's short (which is important for >>> > making it easy to look at usage help). >>> >>> What happens when SHA-1 is shown to be broken or there is a better >>> alternative? Then we'll see "sha1 for historical reasons"... bleh! >> >> Why do you think SHA-1 has anything to do with it? > > Well, it's named sha1. > >> Git's sha1s could just >> as easily be 160 bits of a SHA-256 hash and there wouldn't be any >> user-visible difference. The term doesn't imply any particular significant >> connection to a particular algorithm. > > Then give it a generic name like 'hash'. For most purposes in the documentation sha1's are used as ids, so why don't use "id" instead? Like 'commit id'. The fact that the id is also a hash sum is hardly relevant for the user. -- Felipe Contreras ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 19:24 ` Felipe Contreras @ 2009-04-25 19:36 ` David Abrahams 2009-04-25 20:53 ` Felipe Contreras 2009-04-26 11:28 ` Björn Steinbrink 0 siblings, 2 replies; 90+ messages in thread From: David Abrahams @ 2009-04-25 19:36 UTC (permalink / raw) To: Felipe Contreras Cc: Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 25, 2009, at 3:24 PM, Felipe Contreras wrote: > On Sat, Apr 25, 2009 at 10:16 PM, Michael Witten > <mfwitten@gmail.com> wrote: >> On Sat, Apr 25, 2009 at 13:55, Daniel Barkalow >> <barkalow@iabervon.org> wrote: >>> On Fri, 24 Apr 2009, Michael Witten wrote: >>> >>>>> And the term is already in use for this particular case, >>>>> and it doesn't mean anything else at all (since, of course, the >>>>> crypto >>>>> thing is "SHA-1", not "sha1"), and it's short (which is >>>>> important for >>>>> making it easy to look at usage help). >>>> >>>> What happens when SHA-1 is shown to be broken or there is a better >>>> alternative? Then we'll see "sha1 for historical reasons"... bleh! >>> >>> Why do you think SHA-1 has anything to do with it? >> >> Well, it's named sha1. >> >>> Git's sha1s could just >>> as easily be 160 bits of a SHA-256 hash and there wouldn't be any >>> user-visible difference. The term doesn't imply any particular >>> significant >>> connection to a particular algorithm. >> >> Then give it a generic name like 'hash'. > > For most purposes in the documentation sha1's are used as ids, so why > don't use "id" instead? Like 'commit id'. The fact that the id is also > a hash sum is hardly relevant for the user. Where it's relevant when the user notices that two distinct files have the same id (because they happen to have the same contents) and wonders what's up. It's not a foregone conclusion that objects with the same value have identical ids, but it's immediately apparent if the id is known to be a hash. 
-- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 19:36 ` David Abrahams @ 2009-04-25 20:53 ` Felipe Contreras 2009-04-26 11:28 ` Björn Steinbrink 1 sibling, 0 replies; 90+ messages in thread From: Felipe Contreras @ 2009-04-25 20:53 UTC (permalink / raw) To: David Abrahams Cc: Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Sat, Apr 25, 2009 at 10:36 PM, David Abrahams <dave@boostpro.com> wrote: > > On Apr 25, 2009, at 3:24 PM, Felipe Contreras wrote: > Where it's relevant when the user notices that two distinct files have the > same id (because they happen to have the same contents) and wonders what's > up. > > It's not a foregone conclusion that objects with the same value have > identical ids, but it's immediately apparent if the id is known to be a > hash. That's true. hash +1 -- Felipe Contreras ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 19:36 ` David Abrahams 2009-04-25 20:53 ` Felipe Contreras @ 2009-04-26 11:28 ` Björn Steinbrink 2009-04-26 13:55 ` David Abrahams 2009-04-26 16:36 ` Michael Witten 1 sibling, 2 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 11:28 UTC (permalink / raw) To: David Abrahams Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On 2009.04.25 15:36:24 -0400, David Abrahams wrote: > Where it's relevant when the user notices that two distinct files have > the same id (because they happen to have the same contents) and wonders > what's up. Why would the user have to care about the object files in the repo? And why would your implementation save the same object twice, in two distinct files? The SHA-1 hash is created from the object, that is, from its type, size and data. It's not an id of a file in the working tree, but of an object. > It's not a foregone conclusion that objects with the same value have > identical ids, but it's immediately apparent if the id is known to be a > hash. You can't have two objects with the same contents to begin with, same content => same object. You can just have that one object stored multiple times in different places (for sane implementations this likely means that you have more than one repo to look at, and each has its own copy of that object, but that's nothing you as a user should have to care about). It's an identity relation: same name/id => same object. Unlike e.g. a hash-table where you are expected to deal with collisions, and having the same hash doesn't mean that you have identical data. But that's not true of git, it expects an identity relation, which is IMHO better expressed through "object name" or "object id". 
You can still say that the name/id is generated by using a hash function, but the important part is that the name/id is used to _uniquely_ identify an object, which isn't apparent when you call it a hash. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
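Björn's remark that the SHA-1 is computed from the object's type, size and data can be sketched in a few lines of Python. This is a minimal illustration for blob objects only; the helper name git_blob_id is an assumption made for this example, not part of any git API. The hashed bytes are the header "blob <size>" followed by a NUL byte and then the raw contents.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the object name git assigns to a blob with these contents.

    The name is the SHA-1 of: b"blob " + <decimal size> + b"\\0" + <data>.
    The function name is illustrative only.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Two "files" with identical contents map to one and the same object name,
    # because no file name enters the computation.
    readme = b"hello world\n"
    copy_of_readme = b"hello world\n"
    print(git_blob_id(readme))
    print(git_blob_id(readme) == git_blob_id(copy_of_readme))
```

The result should agree with what `git hash-object <file>` reports for a file with the same bytes, which is one way to check the sketch against a real repository.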
* Re: [doc] User Manual Suggestion 2009-04-26 11:28 ` Björn Steinbrink @ 2009-04-26 13:55 ` David Abrahams 2009-04-26 17:56 ` Björn Steinbrink 2009-04-26 16:36 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-26 13:55 UTC (permalink / raw) To: Björn Steinbrink Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 26, 2009, at 7:28 AM, Björn Steinbrink wrote: > On 2009.04.25 15:36:24 -0400, David Abrahams wrote: >> Where it's relevant when the user notices that two distinct files >> have >> the same id (because they happen to have the same contents) and >> wonders >> what's up. > > Why would the user have to care about the object files in the repo? What a strange question. I have no idea how to answer. It seems self- evident to me that users of a VCS care that their files are stored in it. > And > why would your implementation save the same object twice, in two > distinct files? One could easily have the expectation that contents can be duplicated because there are numerous precedents in everyone's experience of computing, for example in filesystems and in any programming language that is not pure-functional. > The SHA-1 hash is created from the object, that means > the its type, size and data. It's not an id of a file in the working > tree, but of an object All true. All somewhat subtle distinctions that are not nearly as apparent unless you actually use the word "hash" as I have been advocating. >> It's not a foregone conclusion that objects with the same value have >> identical ids, but it's immediately apparent if the id is known to >> be a >> hash. > > You can't have two objects with the same contents to begin with, same > content => same object. In the Git world, I agree. In general, I disagree. The fact that is so in the Git world is reinforced by the notion that the id of an object is a hash of its contents. 
> You can just have that one object stored > multiple times in different places (for sane implementations this > likely > means that you have more than one repo to look at, and each has its > own > copy of that object, but that's nothing you as an user should have to > care about). > It's an identity relation: same name/id => same object. Unlike e.g. a > hash-table where you are expected to deal with collisions, and having > the same hash doesn't mean that you have identical data. But that's > not > true of git, it expects an identity relation, which is IMHO better > expressed through "object name" or "object id". Yes, that's true in the Git world (though not necessarily elsewhere), or at least you hope it is. In fact, there's no guarantee that SHA1 collisions won't occur; it's just extremely unlikely. In fact, if you google it you can find some interesting papers about SHA1 collisions. Another way to express what you wrote above: same id => same hash ?=> same contents => same object where ?=> means "almost certainly implies." What you left out was the implication in the other direction, which is a true guarantee at all steps, and "hash" is well-understood to mean same contents => same hash > You can still say that > the name/id is generated by using a hash function, but the important > part is that the name/id is used to _uniquely_ identify an object, > which > isn't apparent when you call it a hash. I think the implication is important in both directions. Neither one is self-evident to a new user. Maybe the right answer is 'hash id'. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-26 13:55 ` David Abrahams @ 2009-04-26 17:56 ` Björn Steinbrink 2009-04-26 20:17 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 17:56 UTC (permalink / raw) To: David Abrahams Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On 2009.04.26 09:55:34 -0400, David Abrahams wrote: > > On Apr 26, 2009, at 7:28 AM, Björn Steinbrink wrote: > >> On 2009.04.25 15:36:24 -0400, David Abrahams wrote: >>> Where it's relevant when the user notices that two distinct files >>> have the same id (because they happen to have the same contents) and >>> wonders what's up. >> >> Why would the user have to care about the object files in the repo? > > What a strange question. I have no idea how to answer. It seems > self- evident to me that users of a VCS care that their files are > stored in it. _Their_ files. The files that come from/end up in the working tree. I cared about those when I used SVN, too. But I never went to the SVN repo to find out if there are two equal files in it. We're talking about object names, and those belong to objects, not files in the working tree. >> And why would your implementation save the same object twice, in two >> distinct files? > > One could easily have the expectation that contents can be duplicated > because there are numerous precedents in everyone's experience of > computing, for example in filesystems and in any programming language > that is not pure-functional. That's not answering my question. I asked why you come up with an implementation that is "broken" enough to save the same object twice with different file names. If the implementation does not do that, your "when the user notices that two distinct files has the same id" is immediately invalid. The user cannot come into that situation then. And anyway, when the user notices something, that's a discovery, not an expectation. 
>> The SHA-1 hash is created from the object, that means >> the its type, size and data. It's not an id of a file in the working >> tree, but of an object > > All true. All somewhat subtle distinctions that are not nearly as > apparent unless you actually use the word "hash" as I have been > advocating. Huh? How does saying "object hash" instead of "object id" make it any more apparent that a file in the working tree is something other than a git object? >>> It's not a foregone conclusion that objects with the same value have >>> identical ids, but it's immediately apparent if the id is known to >>> be a >>> hash. >> >> You can't have two objects with the same contents to begin with, same >> content => same object. > > In the Git world, I agree. In general, I disagree. I don't think we're discussing a term to describe something that identifies an object in general. So, "in general" you can disagree as much as you want, but for git that doesn't matter at all. > The fact that is so in the Git world is reinforced by the notion that > the id of an object is a hash of its contents. > >> You can just have that one object stored multiple times in different >> places (for sane implementations this likely means that you have >> more than one repo to look at, and each has its own copy of that >> object, but that's nothing you as an user should have to care about). > >> It's an identity relation: same name/id => same object. Unlike e.g. a >> hash-table where you are expected to deal with collisions, and having >> the same hash doesn't mean that you have identical data. But that's >> not true of git, it expects an identity relation, which is IMHO >> better expressed through "object name" or "object id". > > Yes, that's true in the Git world (though not necessarily elsewhere), or > at least you hope it is. In fact, there's no guarantee that SHA1 > collisions won't occur; it's just exremely unlikely. 
In fact, if you > google it you can find some interesting papers about SHA1 collision. Sure, it's an assumption that has been made and is required to hold true for git to work. > Another way to express what you wrote above: > > same same id => same hash ?=> same contents => same object > > where ?=> means "almost certainly implies." No, that chain shows how git could be "unreliable" when you get hash collisions. You could put that into a chapter that explains the implications of the way git generates its object ids. But it's not very interesting when you use git and (implicitly) trust the assumption that no collisions happen. For that case, you need a different chain: same name/id ==> same object ==> same content That's interesting when you e.g. want to "access" some object or when you look at a tree that references the same object twice. For example when both references are for file entries, you know that those files have the same content. That it is a hash doesn't matter, the id could be anything that uniquely identifies an object. The "same object ==> same content" part should be pretty obvious, so you only need to know that the "same name/id ==> same object" part is true, i.e. that the object name/id uniquely identifies the object. And that _is_ true, simply because you cannot have two objects in the same repo that have the same hash and thus the same id. Even if you get a collision, you'll still have just one object. And that's not something that a term that contains the word "hash" is telling me, it would instead tell me that it is not something that really uniquely identifies an object, although git uses it as such. Only when you want to explain how git manages to avoid duplicated storage of fully identical contents, then you need to mention that the object names are the hashes of the full object contents. But that's not what you actually use the object names for. 
same content ==> same content hash ==> same object name/id ==> same object (Actually, you need an additional detail: "same file/symlink/directory/... contents ==> same object contents", which can't be made explicit by just saying that you use a hash). Your chain was in the wrong order and explains neither the "a tree that has the same object name/id for two entries" case (because of the uncertainty of the "same hash ?=> same content" part), nor, when read in the other direction, where all implications are true, why same content leads to the same object (as it already starts at the object level). >> You can still say that the name/id is generated by using a hash >> function, but the important part is that the name/id is used to >> _uniquely_ identify an object, which isn't apparent when you call it >> a hash. > > I think the implication is important in both directions. Neither one is > self-evident to a new user. Maybe the right answer is 'hash id'. git could work differently. Just moving the storage of the filenames from the tree objects to the blobs would mean that you'd get different objects for files that have the same content but different names. You'd still have a hash of the object contents as the object name, but suddenly you get more objects. Just saying "hash" or "hash id" doesn't magically explain all the other things. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
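Björn's closing thought experiment can be sketched in Python. The first function follows git's blob naming (SHA-1 over "blob <size>", a NUL byte, then the contents). The second, named_blob_id, is purely hypothetical: it imagines a design in which the file name is stored inside the blob, which git does not do. Both function names and the payload layout of the hypothetical variant are assumptions made for this illustration.

```python
import hashlib


def object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 over "<type> <size>\\0<payload>", the shape git uses for object names."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def blob_id(content: bytes) -> str:
    """Blob name as git computes it: the file name is not part of the object."""
    return object_id("blob", content)


def named_blob_id(filename: str, content: bytes) -> str:
    """HYPOTHETICAL design, not git: the file name is stored inside the blob."""
    payload = filename.encode("utf-8") + b"\0" + content
    return object_id("blob", payload)


if __name__ == "__main__":
    content = b"identical contents\n"
    # Names live in tree objects, so both files share one blob.
    print(blob_id(content) == blob_id(content))
    # With names inside the blob, the same contents yield two distinct objects.
    print(named_blob_id("a.txt", content) == named_blob_id("b.txt", content))
```

In both designs the object name is still "a hash of the object contents"; what differs is what counts as the object's contents, which is the part the word "hash" alone does not convey.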
* Re: [doc] User Manual Suggestion 2009-04-26 17:56 ` Björn Steinbrink @ 2009-04-26 20:17 ` David Abrahams 2009-04-26 22:25 ` Björn Steinbrink 0 siblings, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-26 20:17 UTC (permalink / raw) To: Björn Steinbrink Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 26, 2009, at 1:56 PM, Björn Steinbrink wrote: > On 2009.04.26 09:55:34 -0400, David Abrahams wrote: >> >> On Apr 26, 2009, at 7:28 AM, Björn Steinbrink wrote: >> >>> On 2009.04.25 15:36:24 -0400, David Abrahams wrote: >>>> Where it's relevant when the user notices that two distinct files >>>> have the same id (because they happen to have the same contents) >>>> and >>>> wonders what's up. >>> >>> Why would the user have to care about the object files in the repo? >> >> What a strange question. I have no idea how to answer. It seems >> self- evident to me that users of a VCS care that their files are >> stored in it. > > _Their_ files. The files that come from/end up in the working tree. I > cared about those when I used SVN, too. But I never went to the SVN > repo > to find out if there are two equal files in it. We're talking about > object names, and those belong to objects, not files in the working > tree. I'm telling you, many new users who aren't already versed in Git will naturally associate the SHA1 codes exposed by the interface with the files they've checked in without understanding that they actually identify object files (another poorly chosen Git name, if I've managed to deduce what it means) rather than directly corresponding to states of their files. And anyway, if you want to get into implementation details, SHA1s don't always identify object files because blobs get delta-compressed. >>> And why would your implementation save the same object twice, in two >>> distinct files? 
>> >> One could easily have the expectation that contents can be duplicated >> because there are numerous precedents in everyone's experience of >> computing, for example in filesystems and in any programming language >> that is not pure-functional. > > That's not answering my question. I asked why you come up with an > implementation that is "broken" enough to save the same object twice > with different file names. I don't know what you mean by "come up with an implementation." I'm not inventing an implementation. I'm saying, new users inevitably and inexorably develop a mental model of the system they're learning about, and they don't always develop the right mental model, and I'm saying that it's easy to see how they can fall into incorrect assumptions. The word "hash" helps a bit with avoiding one of those assumptions. > If the implementation does not do that, your > "when the user notices that two distinct files has the same id" is > immediately invalid. The user cannot come into that situation then. I think this is why Git remains more opaque than it should be. You can't assume that people will naturally develop the smartest possible mental model of a VCS, even when faced with some hints in the form of a partial understanding of Git. > And > anyway, when the user notices something, that's a discovery, not an > expectation. It's better to give people something to connect their discoveries to (e.g. "oh, I see, they call those things hashes, so it makes sense that these two identical things are stored once"). >>> The SHA-1 hash is created from the object, that means >>> the its type, size and data. It's not an id of a file in the working >>> tree, but of an object >> >> All true. All somewhat subtle distinctions that are not nearly as >> apparent unless you actually use the word "hash" as I have been >> advocating. > > Hu? 
How does saying "object hash" instead of "object id" make it any > more apparent that a file in the working tree is something else than a > git object? It makes it apparent that two identical things can only have one ID, and thus must correspond to one object. >>> You can't have two objects with the same contents to begin with, >>> same >>> content => same object. >> >> In the Git world, I agree. In general, I disagree. > > I don't think were discussing a term to describe something that > identifies an object in general. So, "in general" you can disagree as > much as you want, but for git that doesn't matter at all. You don't think the general rules of the computing world and existing meanings of terms have an impact on a new user's ability to grok Git? If not, we don't have much to discuss. >> The fact that is so in the Git world is reinforced by the notion that >> the id of an object is a hash of its contents. >> >>> You can just have that one object stored multiple times in different >>> places (for sane implementations this likely means that you have >>> more than one repo to look at, and each has its own copy of that >>> object, but that's nothing you as an user should have to care >>> about). >> >>> It's an identity relation: same name/id => same object. Unlike >>> e.g. a >>> hash-table where you are expected to deal with collisions, and >>> having >>> the same hash doesn't mean that you have identical data. But that's >>> not true of git, it expects an identity relation, which is IMHO >>> better expressed through "object name" or "object id". >> >> Yes, that's true in the Git world (though not necessarily >> elsewhere), or >> at least you hope it is. In fact, there's no guarantee that SHA1 >> collisions won't occur; it's just exremely unlikely. In fact, if you >> google it you can find some interesting papers about SHA1 collision. > > Sure, it's an assumption that has been made and is required to hold > true > for git to work. 
> >> Another way to express what you wrote above: >> >> same same id => same hash ?=> same contents => same object >> >> where ?=> means "almost certainly implies." > > No, that chain shows how git could be "unreliable" when you get hash > collisions. You could put that into a chapter that explains the > implications of the way git generates its object ids. But it's not > very > interesting when you use git and (implicitly) trust the assumption > that > no collisions happen. My point in mentioning that it's not certain was to point out that you left out the implication that actually /is/ certain, even across repos. > Only when you want to explain how git manages to avoid duplicated > storage of fully identical contents, then you need to mention that the > object names are the hashes of the full object contents. But that's > not > what you actually use the object names for. > > same content ==> same content hash ==> object name/id ==> same object > > (Actually, you need an additional detail: "same > file/symlink/directory/... contents ==> same object contents", which > can't be made explicit by just saying that you use a hash). > > Your chain was in the wrong order If you think there's a right order, you haven't understood that all the arrows are bidirectional. > and explains neither the "a tree that > has the same object name/id for two entries" case (because of the > uncertainity of the "same hash ?=> same content" part), nor, when read > in the other direction, where all implications are true, why same > content leads to the same object (as it already starts at the object > level). >> I think the implication is important in both directions. Neither >> one is >> self-evident to a new user. Maybe the right answer is 'hash id'. > > git could work different. Just moving the storage of the filenames > from > the tree objects to the blobs would mean that you'd get different > objects for files that have the same content but different names. 
> You'd > still have a hash of the object contents as the object name, but > suddenly you get more objects. Just saying "hash" or "hash id" doesn't > magically explain all the other things. But that's a strawman. I'm not claiming that it magically explains all the other things. I'm just claiming that it helps in avoiding some possible misunderstandings. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-26 20:17 ` David Abrahams @ 2009-04-26 22:25 ` Björn Steinbrink 2009-04-27 1:41 ` David Abrahams 2009-04-27 16:30 ` David Abrahams 0 siblings, 2 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 22:25 UTC (permalink / raw) To: David Abrahams Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On 2009.04.26 16:17:43 -0400, David Abrahams wrote: > > On Apr 26, 2009, at 1:56 PM, Björn Steinbrink wrote: > >> On 2009.04.26 09:55:34 -0400, David Abrahams wrote: >>> >>> On Apr 26, 2009, at 7:28 AM, Björn Steinbrink wrote: >>> >>>> On 2009.04.25 15:36:24 -0400, David Abrahams wrote: >>>>> Where it's relevant when the user notices that two distinct files >>>>> have the same id (because they happen to have the same contents) >>>>> and wonders what's up. >>>> >>>> Why would the user have to care about the object files in the repo? >>> >>> What a strange question. I have no idea how to answer. It seems >>> self- evident to me that users of a VCS care that their files are >>> stored in it. >> >> _Their_ files. The files that come from/end up in the working tree. I >> cared about those when I used SVN, too. But I never went to the SVN >> repo to find out if there are two equal files in it. We're talking >> about object names, and those belong to objects, not files in the >> working tree. > > I'm telling you, many new users who aren't already versed in Git will > naturally associate the SHA1 codes exposed by the interface with the > files they've checked in without understand that they actually > identify object files (another poorly chosen Git name, if I've manage > to deduce what it means) Hm, not sure if that name is really important. The way objects are stored is an implementation detail. Usually, we're just talking about "objects" not the files the loose objects are stored in (loose object = an object stored in its own file, not in a pack file). 
But as you complained about it, how would you call a file in which an object is stored? > rather than directly corresponding to states > of their files. And anyway, if you want to get into implementation > details, SHA1s don't always identify object files because blobs get > delta-compressed. True, they identify the object, it's not even necessesary to mention delta compression, just having the object in a pack file causes the object name to no longer identify the file in which the object can be found. Heck, the object might be in a different repo when you use alternates ;-). And I think I never explicitly said that they identify a file storing an object, but implied that by "accepting" your example and assuming that you meant two object files having the same id. I should have said that your "two distinct files have the same id" makes no sense and should have asked what you mean. >>>> And why would your implementation save the same object twice, in >>>> two distinct files? >>> >>> One could easily have the expectation that contents can be >>> duplicated because there are numerous precedents in everyone's >>> experience of computing, for example in filesystems and in any >>> programming language that is not pure-functional. >> >> That's not answering my question. I asked why you come up with an >> implementation that is "broken" enough to save the same object twice >> with different file names. > > I don't know what you mean by "come up with an implementation." I'm > not inventing an implementation. Sorry, "come up with" is clearly wrong. "Assume" or "expect" or so might have been more correct. But I think we could agree that you misused the "id" term by using it for files, and what ensued confused both of us? If you didn't mean the stored objects by "files", then that part of the discussion was just based on a misunderstanding and can be ignored. 
> I'm saying, new users inevitably and inexorably develop a mental model > of the system they're learning about, and they don't always develop > the right mental model, and I'm saying that it's easy to see how they > can fall into incorrect assumptions. The word "hash" helps a bit with > avoiding one of those assumptions. I've not met a lot of people that were actually confused about the fact that the same object might be "reused" for tree entries with different names. But most (all?) of those that were confused knew that the objects are identified by hashes, but expected the filenames to be part of the object and didn't know about tree objects. >> If the implementation does not do that, your "when the user notices >> that two distinct files has the same id" is immediately invalid. The >> user cannot come into that situation then. > > I think this is why Git remains more opaque than it should be. You > can't assume that people will naturally develop the smartest possible > mental model of a VCS, even with faced with some hints in the form of > a partial understanding of Git. I don't think I understand what you mean here. If users don't understand the data model, that's caused by missing/bad documentation or because the user doesn't want to read the existing documentation. (I'll make no assumptions here, it's been some time since I had a close look at the docs). But I've been talking about how the given implementation stores data in the repository. Could you explain? >> And anyway, when the user notices something, that's a discovery, not >> an expectation. > > It's better to give people something to connect their discoveries to > (e.g. "oh, I see, they call those things hashes, so it makes sense > that these two identical things are stored once") We're talking about seeing, for example, the same object name more than once, for different "files", in e.g. gitweb, right? Then the "Hu? Isn't the filename part of the object?" thing might still apply. 
The user can still very easily make a wrong guess. As Michael said in another mail, the important point is probably rather to teach people to make a distinction between files and directories in the working tree and the contents stored in the git objects. And that's not accomplished by saying that the id is a hash, when the user doesn't know what the hash is based upon. Somewhat related: I'm trying to remember if I ever had problems explaining the concept of hardlinks to someone, but I don't remember any such situation anymore. There are no hashes involved there, and I feel like that was quite easy to grasp for most people I talked to. It's pretty similar, separating content from names. >>>> The SHA-1 hash is created from the object, that means the its type, >>>> size and data. It's not an id of a file in the working tree, but of >>>> an object >>> >>> All true. All somewhat subtle distinctions that are not nearly as >>> apparent unless you actually use the word "hash" as I have been >>> advocating. >> >> Hu? How does saying "object hash" instead of "object id" make it any >> more apparent that a file in the working tree is something else than >> a git object? > > It makes it apparent that two identical things can only have one ID, > and thus must correspond to one object. See above, the user needs to know "what" is identical in the first place. >>>> You can't have two objects with the same contents to begin with, >>>> same content => same object. >>> >>> In the Git world, I agree. In general, I disagree. >> >> I don't think were discussing a term to describe something that >> identifies an object in general. So, "in general" you can disagree as >> much as you want, but for git that doesn't matter at all. > > You don't think the general rules of the computing world and existing > meanings of terms have an impact on a new user's ability to grok Git? > If not, we don't have much to discuss. 
This was probably also based on the files+id misunderstanding combined with the fact that you used the term "object" where I thought that you meant a "git object" (you probably didn't, right?). Because when talking about "git objects" you actually can't have two different ones with the same "value" (I guess you mean type, size and content when you say "value", right?) And admittedly, for this one, the "hash" term _would_ help to get the user to understand that in git you cannot have two different objects with the same contents and that this makes git different and efficient. But I still don't buy that this is important for understanding the basic data model. It's a nice hint why git can always quickly tell that two things are equal and why the repository size doesn't explode. But the important part is the separation of names and content, that trees give names to the contents stored in blobs. The "hash" name would only help to understand its efficiency once you already understood the data model. See below. >>> The fact that is so in the Git world is reinforced by the notion >>> that the id of an object is a hash of its contents. >>> >>>> You can just have that one object stored multiple times in >>>> different places (for sane implementations this likely means that >>>> you have more than one repo to look at, and each has its own copy >>>> of that object, but that's nothing you as an user should have to >>>> care about). >>> >>>> It's an identity relation: same name/id => same object. Unlike >>>> e.g. a hash-table where you are expected to deal with collisions, >>>> and having the same hash doesn't mean that you have identical >>>> data. But that's not true of git, it expects an identity relation, >>>> which is IMHO better expressed through "object name" or "object >>>> id". >>> >>> Yes, that's true in the Git world (though not necessarily >>> elsewhere), or at least you hope it is. 
In fact, there's no >>> guarantee that SHA1 collisions won't occur; it's just exremely >>> unlikely. In fact, if you google it you can find some interesting >>> papers about SHA1 collision. >> >> Sure, it's an assumption that has been made and is required to hold >> true for git to work. >> >>> Another way to express what you wrote above: >>> >>> same same id => same hash ?=> same contents => same object >>> >>> where ?=> means "almost certainly implies." >> >> No, that chain shows how git could be "unreliable" when you get hash >> collisions. You could put that into a chapter that explains the >> implications of the way git generates its object ids. But it's not >> very interesting when you use git and (implicitly) trust the >> assumption that no collisions happen. > > My point in mentioning that it's not certain was to point out that you > left out the implication that actually /is/ certain, even across > repos. And my point is that this is not important for understanding the basic data model, but only how git efficiently implements it, and which assumptions it has to make. >> Only when you want to explain how git manages to avoid duplicated >> storage of fully identical contents, then you need to mention that >> the object names are the hashes of the full object contents. But >> that's not what you actually use the object names for. >> >> same content ==> same content hash ==> object name/id ==> same object >> >> (Actually, you need an additional detail: "same >> file/symlink/directory/... contents ==> same object contents", which >> can't be made explicit by just saying that you use a hash). >> >> Your chain was in the wrong order > > If you think there's a right order, you haven't understood that all > the arrows are bidirectional. There's one that is not truly bidirectional. id <=> hash <?=> contents <=> object I can't go from id/hash to contents/object without hitting the "hash => content" assumption. I had two chains for a reason. 
id => object => content => hash and content => hash => id => object are guaranteed, at least within a single repo. While: content => hash => id => object ?=> content has a non-guaranteed part again, just an assumption, at least when the first and last "content" mean the same content. If you get a collision, you rather have a guarantee that one version of the content is "not in the repo". And as I said, that fact, that the identifier is not globally unique, along with the fact that git cannot have two different objects with the same contents or name is not required to understand how commits, tree and blobs go together to store the history of a project. It's IM(NS?)HO far more important to understand the separation of names and content. Then you can understand that multiple names can be associated with the same object holding some content (which can be done with other kinds of ids as well, even with more than one object having the same contents, just not necessarily as efficiently). And that objects have a name that is used to identify the object. And only then can you understand and appreciate how hashes help to efficiently implement that model, knowing which data is used to calculate the hash. >> and explains neither the "a tree that has the same object name/id for >> two entries" case (because of the uncertainity of the "same hash ?=> >> same content" part), nor, when read in the other direction, where all >> implications are true, why same content leads to the same object (as >> it already starts at the object level). > >>> I think the implication is important in both directions. Neither >>> one is self-evident to a new user. Maybe the right answer is 'hash >>> id'. >> >> git could work different. Just moving the storage of the filenames >> from the tree objects to the blobs would mean that you'd get >> different objects for files that have the same content but different >> names. 
You'd still have a hash of the object contents as the object >> name, but suddenly you get more objects. Just saying "hash" or "hash >> id" doesn't magically explain all the other things. > > But that's a strawman. I'm not claiming that it magically explains > all the other things. I'm just claiming that it helps in avoiding > some possible misunderstandings. And I think that it doesn't help much at all and might confuse users, because they expect the hash to be based on the wrong stuff. It's just important that the "thing" is used to identify an object. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
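[Illustration] The message above says the object name is a hash of the object's type, size and data, and that the filename is not part of it. A minimal sketch of that, assuming a SHA-1 repository and the loose-object header layout "<type> <size>\0"; the helper name git_object_id is made up for this example and is not a git API:

```python
import hashlib


def git_object_id(obj_type: str, data: bytes) -> str:
    """Hash "<type> <size>\\0<data>" with SHA-1, as git does for object names."""
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# Two working-tree files with different names but identical contents.
# The names never enter the hash, so both map to one and the same blob.
readme = b"hello\n"
copy_of_readme = b"hello\n"

print(git_object_id("blob", readme))
print(git_object_id("blob", copy_of_readme))

# The type is part of the hashed header, so an empty blob and an empty
# tree get different names even though their data is identical.
print(git_object_id("blob", b""))
print(git_object_id("tree", b""))
```

The first two printed names are equal, which is the "same content => same object" direction discussed above. The names under which the content appears would live in a tree object, not in the blob.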
* Re: [doc] User Manual Suggestion 2009-04-26 22:25 ` Björn Steinbrink @ 2009-04-27 1:41 ` David Abrahams 2009-04-27 16:30 ` David Abrahams 1 sibling, 0 replies; 90+ messages in thread From: David Abrahams @ 2009-04-27 1:41 UTC (permalink / raw) To: Björn Steinbrink Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields on Sun Apr 26 2009, Björn Steinbrink <B.Steinbrink-AT-gmx.de> wrote: >> I think this is why Git remains more opaque than it should be. You >> can't assume that people will naturally develop the smartest possible >> mental model of a VCS, even with faced with some hints in the form of >> a partial understanding of Git. > > I don't think I understand what you mean here. If users don't understand > the data model, that's caused by missing/bad documentation or because > the user doesn't want to read the existing documentation. (I'll make no > assumptions here, it's been some time since I had a close look at the > docs). But I've been talking about how the given implementation stores > data in the repository. Could you explain? You don't have to "not want to read the documentation" to have an incomplete mental model. The mental model development doesn't happen upon finishing the documentation; it happens while the person is learning. Halfway through the docs, I have an incomplete mental model. If you make it hard enough for me, maybe I never finish and I retain that incomplete model forever. The more you can help people avoid incorrect assumptions as they read along, the easier it will be for them to grok the next bit they are reading, and the less likely they are to become discouraged. -- Dave Abrahams BoostPro Computing http://www.boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-26 22:25 ` Björn Steinbrink 2009-04-27 1:41 ` David Abrahams @ 2009-04-27 16:30 ` David Abrahams 2009-04-27 16:52 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-27 16:30 UTC (permalink / raw) To: Björn Steinbrink Cc: Felipe Contreras, Michael Witten, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 26, 2009, at 6:25 PM, Björn Steinbrink wrote: > On 2009.04.26 16:17:43 -0400, David Abrahams wrote: >> >> I'm telling you, many new users who aren't already versed in Git will >> naturally associate the SHA1 codes exposed by the interface with the >> files they've checked in without understand that they actually >> identify object files (another poorly chosen Git name, if I've manage >> to deduce what it means) > > Hm, not sure if that name is really important. The way objects are > stored is an implementation detail. Usually, we're just talking about > "objects" not the files the loose objects are stored in (loose > object = > an object stored in its own file, not in a pack file). But as you > complained about it, how would you call a file in which an object is > stored? "Object" is OK. "Object file" is overloaded and confusing. I'd just say there are "Git data files" or "files in Git's object store", some of which store single objects whose id is the same as the filename, and some of which store multiple objects. >> rather than directly corresponding to states >> of their files. And anyway, if you want to get into implementation >> details, SHA1s don't always identify object files because blobs get >> delta-compressed. > > True, they identify the object, it's not even necessesary to mention > delta compression, just having the object in a pack file causes the > object name to no longer identify the file in which the object can be > found. Right. > Heck, the object might be in a different repo when you use > alternates ;-). 
And I think I never explicitly said that they > identify a file storing an object, but implied that by "accepting" > your > example and assuming that you meant two object files having the same > id. Yes, that assumption was wrong, and then when you responded using the term "object file" I didn't know what it meant. > I should have said that your "two distinct files have the same id" > makes > no sense and should have asked what you mean. > >>>>> And why would your implementation save the same object twice, in >>>>> two distinct files? >>>> >>>> One could easily have the expectation that contents can be >>>> duplicated because there are numerous precedents in everyone's >>>> experience of computing, for example in filesystems and in any >>>> programming language that is not pure-functional. >>> >>> That's not answering my question. I asked why you come up with an >>> implementation that is "broken" enough to save the same object twice >>> with different file names. >> >> I don't know what you mean by "come up with an implementation." I'm >> not inventing an implementation. > > Sorry, "come up with" is clearly wrong. "Assume" or "expect" or so > might > have been more correct. I think I explained why one might make that assumption. > But I think we could agree that you misused the > "id" term by using it for files, and what ensued confused both of > us? If > you didn't mean the stored objects by "files", then that part of the > discussion was just based on a misunderstanding and can be ignored. I meant what the user thinks of as files stored in the repository. >> I'm saying, new users inevitably and inexorably develop a mental >> model >> of the system they're learning about, and they don't always develop >> the right mental model, and I'm saying that it's easy to see how they >> can fall into incorrect assumptions. The word "hash" helps a bit >> with >> avoiding one of those assumptions. 
> > I've not met a lot of people that were actually confused about the > fact > that the same object might be "reused" for tree entries with different > names. But most (all?) of those that were confused knew that the > objects > are identified by hashes, but expected the filenames to be part of the > object and didn't know about tree objects. Well, there's certainly precedent for the idea that the filenames are distinct from file contents. >>> And anyway, when the user notices something, that's a discovery, not >>> an expectation. >> >> It's better to give people something to connect their discoveries to >> (e.g. "oh, I see, they call those things hashes, so it makes sense >> that these two identical things are stored once") > > We're talking about seeing, for example, the same object name more > than > once, for different "files", in e.g. gitweb, right? Then the "Hu? > Isn't > the filename part of the object?" thing might still apply. The user > can > still very easily make a wrong guess. > > As Michael said in another mail, the important point is probably > rather > to teach people to make a distinction between files and directories in > the working tree and the contents stored in the git objects. And > that's > not accomplished by saying that the id is a hash, when the user > doesn't > know what the hash is based upon. > > Somewhat related: I'm trying to remember if I ever had problems > explaining the concept of hardlinks to someone, but I don't remember > any > such situation anymore. There are no hashes involved there, and I feel > like that was quite easy to grasp for most people I talked to. It's > pretty similar, separating content from names. The difference is that hardlinks are only generated explicitly. You'd need something like a hash to generate them automatically and implicitly. >>>>> You can't have two objects with the same contents to begin with, >>>>> same content => same object. >>>> >>>> In the Git world, I agree. In general, I disagree. 
>>> >>> I don't think were discussing a term to describe something that >>> identifies an object in general. So, "in general" you can disagree >>> as >>> much as you want, but for git that doesn't matter at all. >> >> You don't think the general rules of the computing world and existing >> meanings of terms have an impact on a new user's ability to grok Git? >> If not, we don't have much to discuss. > > This was probably also based on the files+id misunderstanding combined > with the fact that you used the term "object" where I thought that you > meant a "git object" (you probably didn't, right?). I didn't. I meant the general notion of "object" in computing. I'm trying to talk about how the language used by Git's docs can bias people toward correct or incorrect understandings of Git as they're learning. > Because when talking > about "git objects" you actually can't have two different ones with > the > same "value" (I guess you mean type, size and content when you say > "value", right?) Yes. Size is a function of content, so that adds nothing, and whether it even makes sense to say that two things of different type have identical content is debatable. > And admittedly, for this one, the "hash" term _would_ help to get the > user to understand that in git you cannot have two different objects > with the same contents and that this makes git different and > efficient. > But I still don't buy that this is important for understanding the > basic > data model. It's a nice hint why git can always quickly tell that two > things are equal and why the repository size doesn't explode. But the > important part is the separation of names and content, that trees give > names to the contents stored in blobs. But there's nothing unique about that; it's not distinct from what filesystems do. > The "hash" name would only help > to understand its efficiency once you already understood the data > model. It would help to reinforce that an object's id is a function of its contents. 
It would help to make clear why the same object can be identified in the same way across all repos. >>>> Another way to express what you wrote above: >>>> >>>> same same id => same hash ?=> same contents => same object >>>> >>>> where ?=> means "almost certainly implies." >>> >>> No, that chain shows how git could be "unreliable" when you get hash >>> collisions. You could put that into a chapter that explains the >>> implications of the way git generates its object ids. But it's not >>> very interesting when you use git and (implicitly) trust the >>> assumption that no collisions happen. >> >> My point in mentioning that it's not certain was to point out that >> you >> left out the implication that actually /is/ certain, even across >> repos. > > And my point is that this is not important for understanding the basic > data model, but only how git efficiently implements it, and which > assumptions it has to make. Look, you're talking to someone who has just had to go through the process of learning all this stuff. What I'm telling you is based on my experiences. Just one datapoint, to be sure, but knowing that it's a hash was crucial for me. >> If you think there's a right order, you haven't understood that all >> the arrows are bidirectional. > > There's one that is not truly bidirectional. > > id <=> hash <?=> contents <=> object > > I can't go from id/hash to contents/object without hitting the "hash > => > content" assumption. Quite right. You can't derive contents from the hash. >> But that's a strawman. I'm not claiming that it magically explains >> all the other things. I'm just claiming that it helps in avoiding >> some possible misunderstandings. > > And I think that it doesn't help much at all and might confuse users, > because they expect the hash to be based on the wrong stuff. It's just > important that the "thing" is used to identify an object. OK, I give up. 
*I* now understand the system, and it's starting to look like too much of a struggle to improve things for others, so they can fend for themselves I guess. Thanks for the lively discussion, anyway. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
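[Illustration] The exchange above compares git's sharing of content to hardlinks, with the difference that a hash makes the sharing automatic and implicit. A toy sketch of that idea, assuming a plain in-memory dictionary as the store; ToyObjectStore and the tree layout are hypothetical and deliberately simpler than git's real object format:

```python
import hashlib


class ToyObjectStore:
    """Hypothetical content-addressed store: the id is a hash of the content."""

    def __init__(self):
        self.objects = {}  # id -> content

    def put(self, content: bytes) -> str:
        oid = hashlib.sha1(content).hexdigest()
        # Storing identical content again lands on the same key,
        # so nothing is duplicated and no explicit "link" step is needed.
        self.objects[oid] = content
        return oid


store = ToyObjectStore()

# A "tree" here is just a mapping from names to object ids.
tree = {
    "README": store.put(b"same text\n"),
    "docs/README.copy": store.put(b"same text\n"),
    "Makefile": store.put(b"all:\n"),
}

for name, oid in sorted(tree.items()):
    print(oid, name)
print(len(store.objects), "objects stored for", len(tree), "names")
```

Three names end up referring to two stored objects. A user who sees the same id listed next to two different names is looking at exactly this situation: the names belong to the tree, the id belongs to the content.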
* Re: [doc] User Manual Suggestion 2009-04-27 16:30 ` David Abrahams @ 2009-04-27 16:52 ` Michael Witten 0 siblings, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-27 16:52 UTC (permalink / raw) To: David Abrahams Cc: Björn Steinbrink, Felipe Contreras, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields 2009/4/27 David Abrahams <dave@boostpro.com>: > > I didn't. I meant the general notion of "object" in computing. I'm trying > to talk about how the language used by Git's docs can bias people toward > correct or incorrect understandings of Git as they're learning. Actually, I believe object was first used to describe anything stored in memory. Given that, I still think my usage of C pointer terminology is superior to everything; it's just the case that objects are content addressable in the git world and location addressable in the C world. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-26 11:28 ` Björn Steinbrink 2009-04-26 13:55 ` David Abrahams @ 2009-04-26 16:36 ` Michael Witten 2009-04-26 18:12 ` Björn Steinbrink 1 sibling, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-26 16:36 UTC (permalink / raw) To: Björn Steinbrink Cc: David Abrahams, Felipe Contreras, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields 2009/4/26 Björn Steinbrink <B.Steinbrink@gmx.de>: > On 2009.04.25 15:36:24 -0400, David Abrahams wrote: >> Where it's relevant when the user notices that two distinct files have >> the same id (because they happen to have the same contents) and wonders >> what's up. >... > And why would your implementation save the same object twice, in two > distinct files? This question makes me think that you don't understand the parent's point. He's not talking about implementation details; in fact, there's no reason to mix the git world and the file system world at all in this discussion. David is pointing out that a user might notice that two different trees list the same blob. This can be startling if you have an incomplete picture of what's going on. From a practical point of view, you might argue that not too many people are looking at trees and blobs; however, it seems to me that most people are afraid to use any of git's most useful features precisely because they don't understand the git model and they don't understand that nothing is ever lost unless you explicitly clean up unreferenced objects---they don't see how easy it is to manipulate their repos. I argue that if they are given the full knowledge of git's concepts, then they will be able to reason about their repo actions with confidence, even if they only work with commits. 
I think the key is to stress in the documentation the idea that there are 2 separate worlds (the git object world and the working directory's file system world) and that the git tools provide an interface between them; this seems like a small and unnecessarily academic point, but I believe that it's important to working with confidence. > ... > You can't have two objects with the same contents to begin with, same > content => same object. You can just have that one object stored > multiple times in different places (for sane implementations this likely > means that you have more than one repo to look at, and each has its own > copy of that object, but that's nothing you as an user should have to > care about). Indeed it's nothing you should care about. It's an implementation detail again; theoretically, every repo is in the same git world where all git objects are stored---in a sense, a particular repo state is itself an object of this world. > It's an identity relation: same name/id => same object. Unlike e.g. a > hash-table where you are expected to deal with collisions, and having > the same hash doesn't mean that you have identical data. However, having the same *cryptographic* hash does mean that you have identical data. The overall point is this: The documentation should force people to learn the right ideas, so that they can have confidence to think beyond blog-post templates for using git. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-26 16:36 ` Michael Witten @ 2009-04-26 18:12 ` Björn Steinbrink 2009-04-26 20:20 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 18:12 UTC (permalink / raw) To: Michael Witten Cc: David Abrahams, Felipe Contreras, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On 2009.04.26 11:36:04 -0500, Michael Witten wrote: > 2009/4/26 Björn Steinbrink <B.Steinbrink@gmx.de>: > > On 2009.04.25 15:36:24 -0400, David Abrahams wrote: > >> Where it's relevant when the user notices that two distinct files have > >> the same id (because they happen to have the same contents) and wonders > >> what's up. > >... > > And why would your implementation save the same object twice, in two > > distinct files? > > This question makes me think that you don't understand the parent's > point. He's not talking about implementation details; in fact, there's > no reason to mix the git world and the file system world at all in > this discussion. > > David is pointing out that a user might notice that two different > trees list the same blob. This can be startling if you have an incomplete > picture about what's going on. David said that the user encounters two distinct files with the same id. The ids are properties of the objects. So he must have meant object files, or he attributed the id to the wrong thing. I assumed that he didn't mix those things up and really meant the object files, thus my reply. > From a practical point of view, you might argue that not too many > people are looking at trees and blobs; Heh, I'd rather argue that too _few_ people have looked at commits and trees at least once, whether it's an actual object or a graph like in git for computer scientists. 
> however, it seems to me that most people are afraid to use any of > git's most useful features precisely because they don't understand the > git model and they don't understand that nothing is ever lost unless > you explicitly clean up unreferenced objects---they don't see how easy > it is to manipulate their repos. I argue that if they are given the full > knowledge of git's concepts, then they will be able to reason about > their repo actions with confidence, even if they only work with > commits. Agreed. > I think the key is to stress in the documentation the idea that there > are 2 separate worlds (the git object world and the working > directory's file system world) and that the git tools provide an > interface between them; this seems like a small and unnecessarily > academic point, but I believe that it's important to working with > confidence. Agreed. That's also why I asked David why the user would look at the object files in the repo (the .git dir). To some degree those are also an implementation detail. The user works with the working tree and uses the git tools to modify the repo. > > It's an identity relation: same name/id => same object. Unlike e.g. a > > hash-table where you are expected to deal with collisions, and having > > the same hash doesn't mean that you have identical data. > > However, having the same *cryptographic* hash does mean that you have > identical data. That's the _assumption_ that git makes. Hash collisions are always possible, just hard to create intentionally when the hash function has not yet been broken. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
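[Illustration] The reply above notes that "same hash => same data" is an assumption, since collisions are always possible in principle. A small sketch of why, assuming we deliberately cripple SHA-1 by keeping only its first byte so that a collision turns up within a few hundred tries; short_hash and find_collision are hypothetical helpers, and this says nothing about how hard a full SHA-1 collision is:

```python
import hashlib


def short_hash(data: bytes, nbytes: int = 1) -> bytes:
    """Return only the first nbytes of the SHA-1 digest (a deliberately weak hash)."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_collision(nbytes: int = 1):
    """Try the inputs b"0", b"1", b"2", ... until two share a truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    i = 0
    while True:
        data = str(i).encode("ascii")
        h = short_hash(data, nbytes)
        if h in seen:
            return seen[h], data
        seen[h] = data
        i += 1


a, b = find_collision()
print(a, b, short_hash(a).hex(), short_hash(b).hex())
```

With a one-byte hash there are only 256 possible values, so the search must stop after at most 257 inputs. The full 160-bit hash makes the same accident astronomically unlikely, which is why git can treat the object name as an identity in practice while it remains an assumption in theory.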
* Re: [doc] User Manual Suggestion 2009-04-26 18:12 ` Björn Steinbrink @ 2009-04-26 20:20 ` David Abrahams 0 siblings, 0 replies; 90+ messages in thread From: David Abrahams @ 2009-04-26 20:20 UTC (permalink / raw) To: Björn Steinbrink Cc: Michael Witten, Felipe Contreras, Daniel Barkalow, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 26, 2009, at 2:12 PM, Björn Steinbrink wrote: > That's also why I asked David why the user would look at the > object files in the repo (the .git dir). For what it's worth, I didn't understand what you meant by "object files" until now. I never claimed they would look at those files, at least not intentionally. But just look at any web interface to a Git repo and you'll see why they might encounter the object file names even before they've installed git on their own machine. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:11 ` Daniel Barkalow 2009-04-24 23:14 ` Jeff King 2009-04-24 23:26 ` Michael Witten @ 2009-04-25 0:41 ` David Abrahams 2 siblings, 0 replies; 90+ messages in thread From: David Abrahams @ 2009-04-25 0:41 UTC (permalink / raw) To: Daniel Barkalow Cc: Michael Witten, Jeff King, Johan Herland, git, J. Bruce Fields On Apr 24, 2009, at 7:11 PM, Daniel Barkalow wrote: > I actually think calling them "sha1s" is better, simply because this > bit > of jargon doesn't mean anything else (git deals with email, so > "address" > is overloaded). And the term is already in use for this particular > case, > and it doesn't mean anything else at all (since, of course, the crypto > thing is "SHA-1", not "sha1"), and it's short (which is important for > making it easy to look at usage help). The word "hash" would be an improvement. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 22:18 ` Michael Witten 2009-04-24 22:25 ` Michael Witten @ 2009-04-24 23:16 ` Björn Steinbrink 2009-04-25 0:01 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-04-24 23:16 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.04.24 17:18:44 -0500, Michael Witten wrote: > On Fri, Apr 24, 2009 at 16:38, Jeff King <peff@peff.net> wrote: > > On Fri, Apr 24, 2009 at 05:34:00PM -0400, Daniel Barkalow wrote: > > > >> I'd say that blobs and trees are an implementation detail of "the full > >> content of a version of the project", not something conceptually > >> important. Likewise, the date representation used in commits isn't > > ... > > No, that isn't critical for understanding how _commit_ operations work, > > but I think that is exactly the sort of conceptual knowledge that let > > people use git more fully. > > I think the key conclusion here is that the main concepts are *objects* > and references to those objects. One type of object is not necessarily > more low-level or high-level than another type of object; each type of > object is the most important type of object for a particular task in > or view of the git world. > > > I disagree. I think it's important to note that trees and blobs have a > > name, and you can refer to them. Once you know that, the fact that you > > can do: > > > > git show master > > git show master:Documentation > > git show master:Makefile > > > > just makes sense. You are always just specifying an object, but the type > > is different for each (and show "does the right thing" based on object > > type). > > In fact, I think it's important to note that the notation: > > git show master:Makefile > > actually involves a translation from a Unix filesystem address to a > git object address that is then used to find the relevant data. Hm? 
Resolving master:Makefile means to first find what master is, most likely the shortname for refs/heads/master. That usually references a commit object (by its name). The "<tree-ish>:<path>" syntax then causes git to lookup the tree referenced by that commit (again, by its name). And then the tree entry for "Makefile" is looked up, leading to the name for the object identified by "master:Makefile". > In fact, I think masking this kind of thing with a catch-all word > 'reference' is a bad idea. "master:Makefile" is not a reference. Just "master" is a shortname for a reference, the full name might be refs/heads/master. git has: - object names (which happen to be SHA-1 hashes). - references (which reference objects by their names) - symbolic references (which reference other references by their names) The "<tree-ish>:<path>" syntax is not called "reference". > Rather than being hidden, it should be exposed: I think it would be > beneficial to use the word 'address' rather than 'reference' when > talking about the SHA-1 names. Then HEAD could be called a pointer > variable, etc. What's wrong with just calling the object name "object name"? References are something different, and the above "master:Makefile" is yet a different thing, using the "extended SHA1" syntax to identify an object. > So, a pointer variable's value is an object address that is the > location of an object in git 'memory'. I think using this approach > would make things significantly more transparent. But then HEAD would be a pointer pointer variable (symbolic ref), unless you have a detached HEAD. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
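The lookup chain described in the message above (shortname, then reference, then commit, then tree, then tree entry) can be sketched as a toy model. This is only an illustration under loud assumptions: the object names ("c1", "t1", ...) are made-up placeholders rather than SHA-1 hashes, the dictionaries and the functions `resolve_ref` and `resolve` are hypothetical, and none of this is how git itself is implemented.

```python
# Toy model of resolving "<tree-ish>:<path>"; NOT git's real implementation.
# All object names below are made-up placeholders, not real SHA-1 hashes.

# Object database: object name -> (type, payload)
objects = {
    "c1": ("commit", {"tree": "t1"}),
    "t1": ("tree", {"Makefile": "b1", "Documentation": "t2"}),
    "t2": ("tree", {"user-manual.txt": "b2"}),
    "b1": ("blob", b"all:\n\techo hello\n"),
    "b2": ("blob", b"Git User's Manual\n"),
}

# References: full reference name -> object name
refs = {"refs/heads/master": "c1"}

# Symbolic references: name -> name of another reference
symrefs = {"HEAD": "refs/heads/master"}


def resolve_ref(name):
    """Turn a symbolic ref, a ref, or a branch shortname into an object name."""
    while name in symrefs:            # follow symbolic refs such as HEAD
        name = symrefs[name]
    if name in refs:                  # already a full reference name
        return refs[name]
    full = "refs/heads/" + name       # shortname -> full reference name
    if full in refs:
        return refs[full]
    if name in objects:               # already an object name
        return name
    raise KeyError(name)


def resolve(spec):
    """Resolve '<tree-ish>' or '<tree-ish>:<path>' to an object name."""
    treeish, sep, path = spec.partition(":")
    name = resolve_ref(treeish)
    if not sep:
        return name
    kind, payload = objects[name]
    if kind == "commit":              # a commit references its tree by name
        name = payload["tree"]
    for part in filter(None, path.split("/")):
        kind, payload = objects[name]
        if kind != "tree":
            raise KeyError(spec)
        name = payload[part]          # tree entry -> object name
    return name


print(resolve("master"))                              # the commit
print(resolve("master:Makefile"))                     # a blob
print(resolve("HEAD:Documentation/user-manual.txt"))  # a blob, via a symbolic ref
```

In this model "master:Makefile" is neither a reference nor an object name; it is an expression that is evaluated, through one reference and two objects, down to an object name.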
* Re: [doc] User Manual Suggestion 2009-04-24 23:16 ` Björn Steinbrink @ 2009-04-25 0:01 ` Michael Witten 2009-04-25 0:48 ` David Abrahams 2009-05-02 15:53 ` Björn Steinbrink 0 siblings, 2 replies; 90+ messages in thread From: Michael Witten @ 2009-04-25 0:01 UTC (permalink / raw) To: Björn Steinbrink Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields 2009/4/24 Björn Steinbrink <B.Steinbrink@gmx.de>: >> In fact, I think it's important to note that the notation: >> >> git show master:Makefile >> >> actually involves a translation from a Unix filesystem address to a >> git object address that is then used to find the relevant data. > > Hm? Resolving master:Makefile means to first find what master is, most > likely the shortname for refs/heads/master. That usually references a > commit object (by its name). The "<tree-ish>:<path>" syntax then causes > git to lookup the tree referenced by that commit (again, by its name). > And then the tree entry for "Makefile" is looked up, leading to the name > for the object identified by "master:Makefile". Firstly, your head is too bound to low-level implementation. Secondly, you've basically just expounded upon what I said. The Makefile part is for humans to write using a filesystem path (address) that is mapped into what I call a git address. The point is that the user is interfacing between two theories of content storage. >> In fact, I think masking this kind of thing with a catch-all word >> 'reference' is a bad idea. > > "master:Makefile" is not a reference. Just "master" is a shortname for a > reference, the full name might be refs/heads/master. > > git has: > - object names (which happen to be SHA-1 hashes). > - references (which reference objects by their names) > - symbolic references (which reference other references by their names) > > The "<tree-ish>:<path>" syntax is not called "reference". 
I will admit that I used this term wrongly then, and that git has a set of terminologies much closer to what I proposed: * object addresses: object names * pointers: references * handle: symbolic reference (I don't know, I just now made that one up) I was under the impression that object names were in fact called references and that things like '[refs/heads/]master' were just considered conveniences. I'm glad to have been disabused; though I like my terms better ;-D >> Rather than being hidden, it should be exposed: I think it would be >> beneficial to use the word 'address' rather than 'reference' when >> talking about the SHA-1 names. Then HEAD could be called a pointer >> variable, etc. > > What's wrong with just calling the object name "object name"? What's wrong with calling the object address "object address"? As I've stated: "address", "pointer", and "handle" are an analogy to terminology that has been around for ages. In fact, another name for "pointer" is "reference". > are something different, and the above "master:Makefile" is yet a > different thing, using the "extended SHA1" syntax to identify an object. It is certainly something different. It's an interface between theories of content storage. >> So, a pointer variable's value is an object address that is the >> location of an object in git 'memory'. I think using this approach >> would make things significantly more transparent. > > But then HEAD would be a pointer pointer variable (symbolic ref), unless > you have a detached HEAD. We call those handles. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:01 ` Michael Witten @ 2009-04-25 0:48 ` David Abrahams 2009-04-26 22:42 ` Björn Steinbrink 2009-05-02 15:53 ` Björn Steinbrink 1 sibling, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-25 0:48 UTC (permalink / raw) To: Michael Witten Cc: Björn Steinbrink, Jeff King, Daniel Barkalow, Johan Herland, git, J. Bruce Fields On Apr 24, 2009, at 8:01 PM, Michael Witten wrote: >> What's wrong with just calling the object name "object name"? > > What's wrong with calling the object address "object address"? Neither captures the connection to the object's contents. I think "value ID" would be closer, but it's probably too horrible. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:48 ` David Abrahams @ 2009-04-26 22:42 ` Björn Steinbrink 0 siblings, 0 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 22:42 UTC (permalink / raw) To: David Abrahams Cc: Michael Witten, Jeff King, Daniel Barkalow, Johan Herland, git, J. Bruce Fields On 2009.04.24 20:48:57 -0400, David Abrahams wrote: > > On Apr 24, 2009, at 8:01 PM, Michael Witten wrote: > >>> What's wrong with just calling the object name "object name"? >> >> What's wrong with calling the object address "object address"? > > Neither captures the connection to the object's contents. I think > "value ID" would be closer, but it's probably too horrible. I think I asked this in another mail, but I'm quite tired, so just to make sure: What do you mean by "value"? I might be weird (I'm not a native speaker, so I probably make funny and wrong connotations from time to time), but while I can accept "content" to include the type and size of the object, the term "value" makes me want to exclude those pieces of meta data. So "value" somehow feels wrong to me, as the hash covers those two fields. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
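The point that the hash covers the object's type and size as well as its raw bytes can be checked with a few lines. A minimal sketch, assuming the loose-object header format "<type> <size>" followed by a NUL byte, which is what `git hash-object` hashes together with the content; the function name `object_name` is made up for illustration.

```python
import hashlib


def object_name(obj_type: str, content: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' followed by the content bytes."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same (empty) "value" gets a different name depending on its type:
print(object_name("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(object_name("tree", b""))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The first value is what `git hash-object` reports for an empty file, and the second is the well-known name of the empty tree, so identical bytes do not imply identical object names once the type differs.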
* Re: [doc] User Manual Suggestion 2009-04-25 0:01 ` Michael Witten 2009-04-25 0:48 ` David Abrahams @ 2009-05-02 15:53 ` Björn Steinbrink 2009-05-02 18:36 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-05-02 15:53 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.04.24 19:01:48 -0500, Michael Witten wrote: > 2009/4/24 Björn Steinbrink <B.Steinbrink@gmx.de>: > >> In fact, I think it's important to note that the notation: > >> > >> git show master:Makefile > >> > >> actually involves a translation from a Unix filesystem address to a > >> git object address that is then used to find the relevant data. > > > > Hm? Resolving master:Makefile means to first find what master is, most > > likely the shortname for refs/heads/master. That usually references a > > commit object (by its name). The "<tree-ish>:<path>" syntax then causes > > git to lookup the tree referenced by that commit (again, by its name). > > And then the tree entry for "Makefile" is looked up, leading to the name > > for the object identified by "master:Makefile". > > Firstly, your head is too bound to low-level implementation. > > Secondly, you've basically just expounded upon what I said. The > Makefile part is for humans to write using a filesystem path (address) > that is mapped into what I call a git address. The point is that the > user is interfacing between two theories of content storage. Sorry, that part missed a few sentences I thought I had written. It was meant to show where the term "reference" is used. I just walked along your example, as that was right there, and I didn't have to come up with something else ;-) Of course there are two "parts", just like scp uses <host>:<path>. > >> Rather than being hidden, it should be exposed: I think it would be > >> beneficial to use the word 'address' rather than 'reference' when > >> talking about the SHA-1 names. 
Then HEAD could be called a pointer > >> variable, etc. > > > > What's wrong with just calling the object name "object name"? > > What's wrong with calling the object address "object address"? The term "object name" is already used in the docs, so you'll have to prove that it's bad and needs to be replaced. > As I've stated: "address", "pointer", and "handle" are an analogy to > terminology that has been around for ages. In fact, another name for > "pointer" is "reference". AFAIK a pointer is just one kind of reference. C++ references are another kind, file descriptors are yet another. A reference is one piece of data that lets me access a different piece of data. And there are probably plenty of examples where you could apply that analogy, yet nobody (I know) does. Arrays, database tables, ... And "memory" usually means "RAM" to me, not "WORM"-memory (well, actually, you can also delete and then rewrite, but not modify). So the analogy would even hurt my mental model (just like the "commit --amend" command might be considered harmful, because it actually creates a new commit, but some users actually think the original commit is modified). > >> So, a pointer variable's value is an object address that is the > >> location of an object in git 'memory'. I think using this approach > >> would make things significantly more transparent. > > > > But then HEAD would be a pointer pointer variable (symbolic ref), unless > > you have a detached HEAD. > > We call those handles. Isn't a handle basically an opaque/abstract reference, at least in "modern" usage? Symbolic references aren't. The user is free to create and manipulate them, and gets full access to the things referenced by them. And saying that HEAD is a reference that might be symbolic is IMHO by far easier to understand than saying that HEAD might be a pointer or a handle. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-02 15:53 ` Björn Steinbrink @ 2009-05-02 18:36 ` Michael Witten 2009-05-02 21:11 ` Björn Steinbrink 0 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-05-02 18:36 UTC (permalink / raw) To: Björn Steinbrink Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: >> As I've stated: "address", "pointer", and "handle" are an analogy to >> terminology that has been around for ages. In fact, another name for >> "pointer" is "reference". > > AFAIK a pointer is just one kind of reference. C++ references are > another kind... Actually, a C++ reference is a pointer with restrictions (AFAIK). > A reference is one piece of data that lets me access a different > piece of data. The key word there is 'access', which implies some kind of storage (or memory). > > And there are probably plenty of examples where you could apply that > analogy, yet nobody (I know) does. Arrays, database tables, ... Well, this terminology is certainly used with arrays in C, because array elements can be accessed with pointers. Also, databases use a much different scheme for addressing information than does memory. However, you're probably correct that pointer terminology doesn't exist much outside of C/C++ and older languages (Ada?). > > And "memory" usually means "RAM" to me, not "WORM"-memory (well, > actually, you can also delete and then rewrite, but not modify). Well, I don't see how Random Access Memory really conflicts. One certainly can access objects in the object memory/store randomly. The main difference is that the computer store is addressed by location, whereas the git store is addressed by content. Also, I would say that conceptually deletion is an implementation detail. 
Because git's object store is content addressable, one could think of it as already containing all possible objects (of course, I'm assuming that the 160-bit hash is also an implementation detail; an infinite number of objects implies infinitely large addresses, though the nonsignificant zeros could be disregarded as with real numbers or something. I don't know, I'm making this up as I go :-D). That the git tools ever complain no such object exists is an implementation detail resulting from our finite storage in reality. > So the > analogy would even hurt my mental model (just like the "commit --amend" > command might be consider harmful, because it actually creates a new > commit, but some users actually think the original commit is modified). Actually, this is why it's so important to have the underlying concepts at hand. Understanding that objects are simply addressed by content (that is, objects are immutable) completely extirpates this kind of confusion. >> >> So, a pointer variable's value is an object address that is the >> >> location of an object in git 'memory'. I think using this approach >> >> would make things significantly more transparent. >> > >> > But then HEAD would be a pointer pointer variable (symbolic ref), unless >> > you have a detached HEAD. >> >> We call those handles. > > Isn't a handle basically an opaque/abstract reference, at least in > "modern" usage? Symvolic references aren't. The user is free to create > and manipulate them, and gets full access to the things referenced by > them. And saying that HEAD is a reference, that might be symbolic is > IMHO by far easier to understand than saying that HEAD might be a > pointer or a handle. Fair enough. Call them symbolic pointers; however, I don't really see the problem with pointer pointers. 
In any case, I *think* my point is that it's important to understand that git uses content addressing; at first I was emphatic about the idea of 'addressing', so I went with pointer terminology (which works quite well, in my opinion). However, I think the 'content' part is more important, which is why 'object hash' is loads better than 'object name' or 'object id'. Also, at least the documentation could say that 'objects are addressed by their hashes', which says a whole lot in one quick sentence about how git works. ^ permalink raw reply [flat|nested] 90+ messages in thread
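A small sketch of what "objects are addressed by their hashes" implies for the "commit --amend" confusion discussed above. Everything here is a simplified assumption for illustration: the `ObjectStore` class, the `commit` helper, and the one-line "commit" format are hypothetical, and the hash is taken over the raw bytes rather than over git's real object encoding.

```python
import hashlib


class ObjectStore:
    """Write-once store: the key is always derived from the content."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        name = hashlib.sha1(content).hexdigest()
        # Storing the same content twice is a no-op; nothing is overwritten.
        self._objects.setdefault(name, content)
        return name

    def get(self, name: str) -> bytes:
        return self._objects[name]


store = ObjectStore()


def commit(message: str, parent: str = "") -> str:
    """Store a toy 'commit' (parent name plus message) and return its name."""
    return store.put(f"parent {parent}\n\n{message}\n".encode("utf-8"))


first = commit("Initial import")
typo = commit("Fix tyop in Makefile", parent=first)
branch = {"master": typo}            # the reference is the only mutable part

# "Amending" cannot edit the stored object; it stores a new one and
# moves the reference. The old object is still there under its old name.
amended = commit("Fix typo in Makefile", parent=first)
branch["master"] = amended

print(typo != amended)                        # True: different content, different name
print(store.get(typo).decode().splitlines()[-1])  # the old message is untouched
```

Only the `branch` dictionary changes in place; the store itself never modifies an entry, which is the sense in which the objects are immutable.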
* Re: [doc] User Manual Suggestion 2009-05-02 18:36 ` Michael Witten @ 2009-05-02 21:11 ` Björn Steinbrink 2009-05-02 23:13 ` Michael Witten 0 siblings, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-05-02 21:11 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.05.02 13:36:35 -0500, Michael Witten wrote: > 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: > >> As I've stated: "address", "pointer", and "handle" are an analogy to > >> terminology that has been around for ages. In fact, another name for > >> "pointer" is "reference". > > > > AFAIK a pointer is just one kind of reference. C++ references are > > another kind... > > Actually, a C++ reference is a pointer with restrictions (AFAIK). I'm not really aware of what the C++ standard says about it, but from a usage point of view, they're IMHO different enough to consider them as truly different types of references. > > And there are probably plenty of examples where you could apply that > > analogy, yet nobody (I know) does. Arrays, database tables, ... > > Well, this terminology is certainly used with arrays in C, because > array elements can be accessed with pointers. But when you apply the analogy, then the array is the memory, and an integer is an address and an index variable is a pointer. > Also, databases use a much different scheme for addressing information > than does memory. I don't see any inherent problem in saying that the primary key determines the address of a row. (It just gets funny when you have a table schema without a primary key *g*) > > And "memory" usually means "RAM" to me, not "WORM"-memory (well, > > actually, you can also delete and then rewrite, but not modify). > > Well, I don't see how Random Access Memory really conflicts. One > certainly can access objects in the object memory/store randomly. 
The > main difference is that the computer store is addressed by location, > wheras the git store is addressed by content. When I have a (non const) pointer in C I can write to the memory location it references. With git, I can't do that. ("RWRAM" would have been more correct, I'm damaged by the common usage of RAM as meaning RWRAM). > Also, I would say that conceptually deletion is an implementation > detail. Yeah, thus I put it in parentheses, just to show that, in practise, we don't even have WORM-memory (but still taking the hash collision problem into account, so we need to write once). > Because git's object store is content addressable, one could > think of it as already containing all possible objects (of course, I'm > assuming that the 160-bit hash is also an implementation detail; an > infinite number of objects implies infinitely large addresses, though > the nonsignificant zeros could be disregarded as with real numbers or > something. I don't know, I'm making this up as I go :-D). That the git > tools ever complain no such object exists is an implementation detail > resulting from our finite storage in reality. I prefer to take the hash collision into account when looking at things like that, but yeah, one could look at it like that. > > So the analogy would even hurt my mental model (just like the > > "commit --amend" command might be consider harmful, because it > > actually creates a new commit, but some users actually think the > > original commit is modified). > > Actually, this is why it's so important to have the underlying > concepts at hand. Understanding that objects are simply addressed by > content (that is, objects are immutable) completely extirpates this > kind of confusion. I never disagreed with that, though I put more emphasis on the plain object relationships and their immutability than on the fact that hashes are used. 
Having that part right (how objects work together to form history) is a large part of what you need to understand all the rest. > >> >> So, a pointer variable's value is an object address that is the > >> >> location of an object in git 'memory'. I think using this approach > >> >> would make things significantly more transparent. > >> > > >> > But then HEAD would be a pointer pointer variable (symbolic ref), unless > >> > you have a detached HEAD. > >> > >> We call those handles. > > > > Isn't a handle basically an opaque/abstract reference, at least in > > "modern" usage? Symvolic references aren't. The user is free to create > > and manipulate them, and gets full access to the things referenced by > > them. And saying that HEAD is a reference, that might be symbolic is > > IMHO by far easier to understand than saying that HEAD might be a > > pointer or a handle. > > Fair enough. Call them symbolic pointers; however, I don't really see > the problem with pointer pointers. You called them handles anyway ;-) But seriously, it's that "pointer" triggers C for me. And having an entity that can switch between being a pointer and a pointer pointer needs casting or a union (or a struct if you want to), not something I'd like to have to think about in my mental model of git. > In any case, I *think* my point is that it's important to understand > that git uses content addressing; at first I was emphatic about the > idea of 'addressing', so I went with pointer terminology (which works > quite well, in my opinion). However, I think the 'content' part is > more important, which is why 'object hash' is loads better than > 'object name' or 'object id'. Also, at least the documentation could > say that 'objects are addressed by their hashes', which says a whole > lot in one quick sentence about how git works. Hm, like chapter 7 "Git concepts"? 
>>>>>> The Object Database We already saw in the section called “Understanding History: Commits” that all commits are stored under a 40-digit "object name". In fact, all the information needed to represent the history of a project is stored in objects with such names. In each case the name is calculated by taking the SHA-1 hash of the contents of the object. The SHA-1 hash is a cryptographic hash function. What that means to us is that it is impossible to find two different objects with the same name. This has a number of advantages; among others: * Git can quickly determine whether two objects are identical or not, just by comparing names. * Since object names are computed the same way in every repository, the same content stored in two repositories will always be stored under the same name. * Git can detect errors when it reads an object, by checking that the object's name is still the SHA-1 hash of its contents. <<<<<< Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
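The three advantages listed in the quoted manual text can be sketched in a few lines. This is a toy model, not git's storage code: the two dictionaries standing in for repositories and the helpers `blob_name`, `store_blob`, and `read_verified` are made up for illustration, and only blobs are modelled, using the "blob <size>" plus NUL header that git hashes in front of the content.

```python
import hashlib


def blob_name(content: bytes) -> str:
    """Object name of a blob: SHA-1 over 'blob <size>\\0' plus the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def store_blob(repo: dict, content: bytes) -> str:
    """Store content in a toy repository (a dict) under its computed name."""
    name = blob_name(content)
    repo[name] = content
    return name


def read_verified(repo: dict, name: str) -> bytes:
    """Read an object and check that its name still matches its content."""
    content = repo[name]
    if blob_name(content) != name:
        raise ValueError(f"object {name} is corrupt")
    return content


repo_a, repo_b = {}, {}
name_a = store_blob(repo_a, b"test content\n")
name_b = store_blob(repo_b, b"test content\n")

# 1. Identity check by comparing names only.
# 2. The same content gets the same name in every repository.
print(name_a == name_b)

# 3. Corruption is detected on read, because the name no longer matches.
repo_b[name_b] = b"test c0ntent\n"
try:
    read_verified(repo_b, name_b)
except ValueError as err:
    print("detected:", err)
```

The comparison in the first two points never touches the content itself, which is why it is cheap no matter how large the objects are.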
* Re: [doc] User Manual Suggestion 2009-05-02 21:11 ` Björn Steinbrink @ 2009-05-02 23:13 ` Michael Witten 2009-05-02 23:32 ` Björn Steinbrink 2009-05-03 1:18 ` Mark Lodato 0 siblings, 2 replies; 90+ messages in thread From: Michael Witten @ 2009-05-02 23:13 UTC (permalink / raw) To: Björn Steinbrink Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: >> In any case, I *think* my point is that it's important to understand >> that git uses content addressing; at first I was emphatic about the >> idea of 'addressing', so I went with pointer terminology (which works >> quite well, in my opinion). However, I think the 'content' part is >> more important, which is why 'object hash' is loads better than >> 'object name' or 'object id'. Also, at least the documentation could >> say that 'objects are addressed by their hashes', which says a whole >> lot in one quick sentence about how git works. > > Hm, like chapter 7 "Git concepts"? That's exactly the problem. It should be in chapter 0. I also dislike the use of 'name' rather than 'hash'; a name is something provided by the user, but a hash is something computed. The use of sha[-]1 is even more egregious. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-02 23:13 ` Michael Witten @ 2009-05-02 23:32 ` Björn Steinbrink 2009-05-03 1:10 ` Michael Witten 2009-05-03 1:18 ` Mark Lodato 1 sibling, 1 reply; 90+ messages in thread From: Björn Steinbrink @ 2009-05-02 23:32 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.05.02 18:13:24 -0500, Michael Witten wrote: > 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: > >> In any case, I *think* my point is that it's important to understand > >> that git uses content addressing; at first I was emphatic about the > >> idea of 'addressing', so I went with pointer terminology (which works > >> quite well, in my opinion). However, I think the 'content' part is > >> more important, which is why 'object hash' is loads better than > >> 'object name' or 'object id'. Also, at least the documentation could > >> say that 'objects are addressed by their hashes', which says a whole > >> lot in one quick sentence about how git works. > > > > Hm, like chapter 7 "Git concepts"? > > That's exactly the problem. It should be in chapter 0. I'm not opposed to re-ordering stuff. Though I often think that having commands and concepts "together" is better. Maybe we just need that twice? Once the plain data model, and once a "hands on" version where the effects of the commands are described in terms of the data model. The former "sucks" for those that want to just "dive in" (but might still be happy to get told what their actions do), the latter sucks when you just want to look something up. Hm? Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-02 23:32 ` Björn Steinbrink @ 2009-05-03 1:10 ` Michael Witten 2009-05-03 1:48 ` Björn Steinbrink 0 siblings, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-05-03 1:10 UTC (permalink / raw) To: Björn Steinbrink Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: >> > Hm, like chapter 7 "Git concepts"? >> >> That's exactly the problem. It should be in chapter 0. > > I'm not opposed to re-ordering stuff. Though I often think that having > commands and concepts "together" is better. Maybe we just need that > twice? Once the plain data model, and once a "hands on" version where > the effects of the commands are described in terms of the data model. > > The former "sucks" for those that want to just "dive in" (but might > still be happy to get told what their actions do), the latter sucks when > you just want to look something up. Indeed. I think the key is to split up the documentation for these 2 paths. http://marc.info/?l=git&m=124058631814726&w=2 The mixing of the 2 is what makes everyone unhappy. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-03 1:10 ` Michael Witten @ 2009-05-03 1:48 ` Björn Steinbrink 0 siblings, 0 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-05-03 1:48 UTC (permalink / raw) To: Michael Witten Cc: Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.05.02 20:10:14 -0500, Michael Witten wrote: > 2009/5/2 Björn Steinbrink <B.Steinbrink@gmx.de>: > >> > Hm, like chapter 7 "Git concepts"? > >> > >> That's exactly the problem. It should be in chapter 0. > > > > I'm not opposed to re-ordering stuff. Though I often think that having > > commands and concepts "together" is better. Maybe we just need that > > twice? Once the plain data model, and once a "hands on" version where > > the effects of the commands are described in terms of the data model. > > > > The former "sucks" for those that want to just "dive in" (but might > > still be happy to get told what their actions do), the latter sucks when > > you just want to look something up. > > Indeed. I think the key is to split up the documentation for these 2 paths. > > http://marc.info/?l=git&m=124058631814726&w=2 > > The mixing of the 2 is what makes everyone unhappy. I'm not sure which part of that email you're referring to (and I'm getting tired, 3:20am...). I'm just seeing the paragraph where Jeff has said that we have a split, between the tutorial and the manual. And what I tried to say is that we might need the tutorial to be less of a "recipe collection", but more of a hands-on introduction that actively explains the data model and how data is manipulated by using the commands. And the user manual might become less example oriented, focussing more on concepts, giving examples in addition. So that we have both approaches, hands-on and theoretical, but both keeping the data model in mind, at least to some extent. 
For example the "hands on" version might rather create a "toy" repository than importing an existing project right away, to get a smaller scope of things to describe at once, and to be able to show e.g. full "graphs" of the early repo as it evolves. Users that simply don't want to care can still skip over the explanations and suffer^Wjust pick up the commands. You could e.g. say "To create a lightweight tag you use ..., which adds a new reference, while ... adds an annotated tag, which is a real tag object, with a message and a tagger and which can possibly be signed using your GPG key." And maybe explain the tag object a bit further. While the manual might, for example, have a section "Tags" instead of the current "Creating tags"(*), where the different types of tags are described, how they fit into the data model, what the different types of tags mean, and only then give examples how to create them. Lots of possible work... Björn (*) Why's that in the "Exploring git history" chapter? Let's see if I can sort out my local asciidoc problems and find some time to provide some basic patches for that. Though I still haven't managed to get the one for the git-push man page done... *sigh* ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-02 23:13 ` Michael Witten 2009-05-02 23:32 ` Björn Steinbrink @ 2009-05-03 1:18 ` Mark Lodato 2009-05-03 1:26 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Mark Lodato @ 2009-05-03 1:18 UTC (permalink / raw) To: Michael Witten Cc: Björn Steinbrink, Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields 2009/5/2 Michael Witten <mfwitten@gmail.com>: > I also dislike the use of 'name' rather than 'hash'; a name is > something provided by the user, but a hash is something computed. The > use of sha[-]1 is even more egregious. What about "identifier" as a compromise between "hash" and "name"? This is really what we're talking about - a way of identifying objects. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-05-03 1:18 ` Mark Lodato @ 2009-05-03 1:26 ` Michael Witten 0 siblings, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-05-03 1:26 UTC (permalink / raw) To: Mark Lodato Cc: Björn Steinbrink, Jeff King, Daniel Barkalow, Johan Herland, git, David Abrahams, J. Bruce Fields On Sat, May 2, 2009 at 20:18, Mark Lodato <lodatom@gmail.com> wrote: > 2009/5/2 Michael Witten <mfwitten@gmail.com>: >> I also dislike the use of 'name' rather than 'hash'; a name is >> something provided by the user, but a hash is something computed. The >> use of sha[-]1 is even more egregious. > > What about "identifier" as a compromise between "hash" and "name"? > This is really what we're talking about - a way of identifying > objects. It's the same problem, in my opinion. '[Cryptographic] hash' says so much more and still remains quite generic. Also, continuing with 'sha1' doesn't seem satisfactory: http://marc.info/?l=git&m=124068702303042&w=2 ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 21:38 ` Jeff King 2009-04-24 22:18 ` Michael Witten @ 2009-04-24 23:21 ` Daniel Barkalow 2009-04-24 23:25 ` Jeff King 2009-04-24 23:29 ` Michael Witten 2009-04-25 0:19 ` David Abrahams 2 siblings, 2 replies; 90+ messages in thread From: Daniel Barkalow @ 2009-04-24 23:21 UTC (permalink / raw) To: Jeff King Cc: Johan Herland, Michael Witten, git, David Abrahams, J. Bruce Fields On Fri, 24 Apr 2009, Jeff King wrote: > On Fri, Apr 24, 2009 at 05:34:00PM -0400, Daniel Barkalow wrote: > > > I'd say that blobs and trees are an implementation detail of "the full > > content of a version of the project", not something conceptually > > important. Likewise, the date representation used in commits isn't > > I disagree. I think it's important to note that trees and blobs have a > name, and you can refer to them. Once you know that, the fact that you > can do: > > git show master > git show master:Documentation > git show master:Makefile > > just makes sense. You are always just specifying an object, but the type > is different for each (and show "does the right thing" based on object > type). > > No, that isn't critical for understanding how _commit_ operations work, > but I think that is exactly the sort of conceptual knowledge that let > people use git more fully. Yeah, I'll agree with that. They're good to explain as "these are things git can tell you about", but they're not relevant to the discussion of "what is history". (And, actually, I think git has a few usability warts due to relying too much on command line arguments being objects; it would be quite nice if "git blame 1a2b3c:Makefile" worked despite this technically being incoherent.) -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 90+ messages in thread
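The point quoted above, that `master`, `master:Documentation` and `master:Makefile` each name an object of a different type and that `show` picks its behaviour from that type, can be sketched as follows. Everything here is a hypothetical toy: the short ids such as `c1` and `t1` are made-up labels rather than hashes, and `resolve` and `show` are illustrative helpers, not Git functions.

```python
# Toy object database: id -> (type, payload). The ids are invented labels.
objects = {
    "c1": ("commit", {"tree": "t1", "message": "Initial import"}),
    "t1": ("tree", {"Documentation": "t2", "Makefile": "b1"}),
    "t2": ("tree", {"user-manual.txt": "b2"}),
    "b1": ("blob", "all:\n\techo build\n"),
    "b2": ("blob", "Git User's Manual\n"),
}
refs = {"master": "c1"}

def resolve(spec: str) -> str:
    """Resolve "<rev>" or "<rev>:<path>" to an object id."""
    rev, _, path = spec.partition(":")
    oid = refs[rev]
    if ":" not in spec:
        return oid                      # a bare revision names the commit itself
    oid = objects[oid][1]["tree"]       # "<rev>:" starts at the commit's root tree
    for part in filter(None, path.split("/")):
        oid = objects[oid][1][part]     # descend one tree entry per path component
    return oid

def show(spec: str) -> str:
    """Choose the output from the object's type, whatever spec named it."""
    kind, payload = objects[resolve(spec)]
    if kind == "commit":
        return "commit: " + payload["message"]
    if kind == "tree":
        return "tree: " + ", ".join(sorted(payload))
    return payload                      # blob: the file contents

for spec in ("master", "master:Documentation", "master:Makefile"):
    print(spec, "=>", repr(show(spec)))
```

Note that `resolve` returns only an object id; the path used to reach it is discarded, which is the property the following messages discuss.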
* Re: [doc] User Manual Suggestion 2009-04-24 23:21 ` Daniel Barkalow @ 2009-04-24 23:25 ` Jeff King 2009-04-26 23:41 ` Björn Steinbrink 2009-04-24 23:29 ` Michael Witten 1 sibling, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-24 23:25 UTC (permalink / raw) To: Daniel Barkalow Cc: Johan Herland, Michael Witten, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 07:21:26PM -0400, Daniel Barkalow wrote: > (And, actually, I think git has a few usability warts due to relying too > much on command line arguments being objects; it would be quite nice if > "git blame 1a2b3c:Makefile" worked despite this technically being > incoherent.) Yeah, I think another is that "git show master:file" will not do CRLF or other filters, and "git diff master:file other:file" will not respect diff settings. I think all of those could be solved by path lookup attaching a "here is a pathname I used to get to this object" string, which can then be accessed as appropriate. It is not all that different conceptually than what "git rev-list --objects" does. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:25 ` Jeff King @ 2009-04-26 23:41 ` Björn Steinbrink 0 siblings, 0 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-26 23:41 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, Michael Witten, git, David Abrahams, J. Bruce Fields On 2009.04.24 19:25:31 -0400, Jeff King wrote: > On Fri, Apr 24, 2009 at 07:21:26PM -0400, Daniel Barkalow wrote: > > > (And, actually, I think git has a few usability warts due to relying too > > much on command line arguments being objects; it would be quite nice if > > "git blame 1a2b3c:Makefile" worked despite this technically being > > incoherent.) > > Yeah, I think another is that "git show master:file" will not do CRLF or > other filters, and "git diff master:file other:file" will not respect > diff settings. I think all of those could be solved by path lookup > attaching a "here is a pathname I used to get to this object" string, > which can then be accessed as appropriate. > > It is not all that different conceptually than what "git rev-list > --objects" does. It's also something that hash-object already does in some way. To apply e.g. attributes to content that you supply via stdin. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:21 ` Daniel Barkalow 2009-04-24 23:25 ` Jeff King @ 2009-04-24 23:29 ` Michael Witten 2009-04-27 0:00 ` Björn Steinbrink 1 sibling, 1 reply; 90+ messages in thread From: Michael Witten @ 2009-04-24 23:29 UTC (permalink / raw) To: Daniel Barkalow Cc: Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On Fri, Apr 24, 2009 at 18:21, Daniel Barkalow <barkalow@iabervon.org> wrote: > "git blame 1a2b3c:Makefile" worked despite this technically being > incoherent. It seems to work on my end, and it's perfectly coherent if you consider git-blame to be overloaded to handle both pointers and addresses (or references and object names, if you prefer). ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 23:29 ` Michael Witten @ 2009-04-27 0:00 ` Björn Steinbrink 0 siblings, 0 replies; 90+ messages in thread From: Björn Steinbrink @ 2009-04-27 0:00 UTC (permalink / raw) To: Michael Witten Cc: Daniel Barkalow, Jeff King, Johan Herland, git, David Abrahams, J. Bruce Fields On 2009.04.24 18:29:22 -0500, Michael Witten wrote: > On Fri, Apr 24, 2009 at 18:21, Daniel Barkalow <barkalow@iabervon.org> wrote: > > "git blame 1a2b3c:Makefile" worked despite this technically being > > incoherent. > > It seems to work on my end, and it's perfectly coherent if you > consider git-blame to be overloaded to handle both pointers and > addresses (or references and object names, if you prefer). Fails for me. And it's technically incoherent in that it makes no sense to use blame with a blob object. 1a2b3c:Makefile identifies "just" a blob object. And that has no parents and no history, just contents. Only the commit objects have the references that connect them to form a history. For example, you could have a history like this: A---B---C---D---E And a file "foo" that has the same contents for A and E. Then "A:foo" and "E:foo" lead to the same blob object, and you can't uniquely go from that blob object to any commit object. So technically, you can't tell if "git blame E:foo" means "git blame E foo" or "git blame A foo" (and you can add a bunch of complexity by having, for example, a second file with a different name that had the same content at some point). To make that coherent, you must change the definition of the <tree-ish>:<path> syntax so that the context in which the path is resolved is kept, it must no longer just identify an object, but something more complex. Björn ^ permalink raw reply [flat|nested] 90+ messages in thread
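The ambiguity described above can be reproduced with a toy history. This is a sketch under stated assumptions: the `history` mapping, the file contents, and the helpers `resolve` and `commits_containing` are all hypothetical, and each commit is reduced to a flat mapping from path to blob id.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Hash content the way Git names a blob: sha1("blob <size>\\0" + content)."""
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()

# Toy history A---B---C---D---E. "foo" has the same content in A and E, and
# C carries a second file that happens to have that content too.
history = {
    "A": {"foo": blob_id(b"version 1\n")},
    "B": {"foo": blob_id(b"version 2\n")},
    "C": {"foo": blob_id(b"version 3\n"), "copy-of-foo": blob_id(b"version 1\n")},
    "D": {"foo": blob_id(b"version 4\n")},
    "E": {"foo": blob_id(b"version 1\n")},
}

def resolve(spec: str) -> str:
    """Resolve "<commit>:<path>" to a blob id; the context is thrown away."""
    commit, path = spec.split(":", 1)
    return history[commit][path]

def commits_containing(blob: str) -> list:
    """Every (commit, path) pair that leads to the given blob."""
    return sorted(
        (commit, path)
        for commit, tree in history.items()
        for path, entry in tree.items()
        if entry == blob
    )

print(resolve("A:foo") == resolve("E:foo"))
print(commits_containing(resolve("E:foo")))
```

Because `resolve` hands back nothing but the blob id, a caller cannot tell which of the three (commit, path) pairs was meant, so there is no single history to blame.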
* Re: [doc] User Manual Suggestion 2009-04-24 21:38 ` Jeff King 2009-04-24 22:18 ` Michael Witten 2009-04-24 23:21 ` Daniel Barkalow @ 2009-04-25 0:19 ` David Abrahams 2009-04-25 0:26 ` Michael Witten 2009-04-25 0:35 ` Jeff King 2 siblings, 2 replies; 90+ messages in thread From: David Abrahams @ 2009-04-25 0:19 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, Michael Witten, git, J. Bruce Fields On Apr 24, 2009, at 5:38 PM, Jeff King wrote: > On Fri, Apr 24, 2009 at 05:34:00PM -0400, Daniel Barkalow wrote: > >> I'd say that blobs and trees are an implementation detail of "the >> full >> content of a version of the project", not something conceptually >> important. Likewise, the date representation used in commits isn't > > I disagree. I think it's important to note that trees and blobs have a > name, and you can refer to them. Once you know that, the fact that you > can do: > > git show master > git show master:Documentation > git show master:Makefile > > just makes sense. You are always just specifying an object, but the > type > is different for each (and show "does the right thing" based on object > type). I don't believe you need to know about trees and blobs to make sense of that. Those are just directories and files. The whole idea that trees are a more-general thing that could be used to represent something other than directory structure and blobs could be used to represent something other than file contents is way below most people's need-to-know threshold. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:19 ` David Abrahams @ 2009-04-25 0:26 ` Michael Witten 2009-04-25 0:35 ` Jeff King 1 sibling, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-25 0:26 UTC (permalink / raw) To: David Abrahams Cc: Jeff King, Daniel Barkalow, Johan Herland, git, J. Bruce Fields On Fri, Apr 24, 2009 at 19:19, David Abrahams <dave@boostpro.com> wrote: >> git show master >> git show master:Documentation >> git show master:Makefile >> >> just makes sense. You are always just specifying an object, but the type >> is different for each (and show "does the right thing" based on object >> type). > > I don't believe you need to know about trees and blobs to make sense of > that. Those are just directories and files. I still think the key is that commits and blobs and trees are all objects, and the important things are the concepts of objects, object addresses, object pointers, and handles (or, what everyone else calls objects, object names, references, and symbolic references). Also, you've mixed in the theory of file system addressing in with the theory of git addressing. I think it's important to realize that the tool 'git show' is actually providing a translation between the two worlds. There's not really any need for paths to be considered a fundamental git concept; simply, git tools know how to translate between both worlds. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:19 ` David Abrahams 2009-04-25 0:26 ` Michael Witten @ 2009-04-25 0:35 ` Jeff King 2009-04-25 0:53 ` David Abrahams 1 sibling, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-25 0:35 UTC (permalink / raw) To: David Abrahams Cc: Daniel Barkalow, Johan Herland, Michael Witten, git, J. Bruce Fields On Fri, Apr 24, 2009 at 08:19:18PM -0400, David Abrahams wrote: >> git show master >> git show master:Documentation >> git show master:Makefile >> > I don't believe you need to know about trees and blobs to make sense of > that. Those are just directories and files. The whole idea that trees > are a more-general thing that could be used to represent something other > than directory structure and blobs could be used to represent something > other than file contents is way below most peoples' need-to-know > threshold. Actually, it is not the generality of trees that I think is interesting there, but the generality of _objects_. That is, each of those things is a first-class object, and has a unique name by which it can be referred. The examples above are just _one_ of the ways you can refer to the same objects. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:35 ` Jeff King @ 2009-04-25 0:53 ` David Abrahams 2009-04-29 6:34 ` Jeff King 0 siblings, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-25 0:53 UTC (permalink / raw) To: Jeff King Cc: Daniel Barkalow, Johan Herland, Michael Witten, git, J. Bruce Fields On Apr 24, 2009, at 8:35 PM, Jeff King wrote: > On Fri, Apr 24, 2009 at 08:19:18PM -0400, David Abrahams wrote: > >>> git show master >>> git show master:Documentation >>> git show master:Makefile >>> >> I don't believe you need to know about trees and blobs to make >> sense of >> that. Those are just directories and files. The whole idea that >> trees >> are a more-general thing that could be used to represent something >> other >> than directory structure and blobs could be used to represent >> something >> other than file contents is way below most peoples' need-to-know >> threshold. > > Actually, it is not the generally of trees that I think is interesting > there, but the generality of _objects_. That is, each of those > things is > a first-class object, and has a unique name by which it can be > referred. I'm sorry, but I think most people would find that so unremarkable that making a big deal about it would lead to "what am I missing here" confusion. Maybe a person who's exclusively used CVS (or older) technologies before coming to Git would be happy to know that, but it's sort of obvious. In CVS the lack of first-class directories sticks out like a sore thumb. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-25 0:53 ` David Abrahams @ 2009-04-29 6:34 ` Jeff King 2009-04-29 13:27 ` David Abrahams 0 siblings, 1 reply; 90+ messages in thread From: Jeff King @ 2009-04-29 6:34 UTC (permalink / raw) To: David Abrahams; +Cc: git On Fri, Apr 24, 2009 at 08:53:37PM -0400, David Abrahams wrote: >> Actually, it is not the generally of trees that I think is interesting >> there, but the generality of _objects_. That is, each of those things is >> a first-class object, and has a unique name by which it can be >> referred. > > I'm sorry, but I think most people would find that so unremarkable that > making a big deal about it would lead to "what am I missing here" > confusion. Maybe a person who's exclusively used CVS (or older) > technologies before coming to Git would be happy to know that, but it's > sort of obvious. In CVS the lack of first-class directories sticks out > like a sore thumb. Sadly, I was away from email all weekend and so missed the ensuing storm in this thread. :) However, I did want to respond to this one point. To me (and I am talking from personal experience, so it really may be _just_ me), an important part of understanding git was understanding the object storage. That is, half of the idea of git is a big database of content-addressable objects. The _other_ half is the actual VCS built on top of it. ;) And by understanding that, and the places where objects refer to each other (commits point to other commits and to trees, trees point to blobs, blobs are always leaves), I find it easier to understand what each operation is doing. And that if I'm unsure of something, I can always inspect it at many levels. I don't know. Maybe that is too low-level for most people. I did end up working on git, so perhaps I am inordinately interested. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
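The object graph described above (commits point to other commits and to trees, trees point to blobs, blobs are leaves) can be sketched as a tiny content-addressed store. This is a toy under stated assumptions: `put`, `references` and `reachable` are hypothetical helpers, objects are serialized as JSON rather than in Git's real format, and trees here hold only blobs, whereas real trees can also contain other trees.

```python
import hashlib
import json

store = {}  # object id -> (kind, payload)

def put(kind: str, payload) -> str:
    """Store an object under the hash of its serialized content."""
    data = json.dumps({"kind": kind, "payload": payload}, sort_keys=True).encode()
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = (kind, payload)
    return oid

def references(oid: str) -> list:
    """Return the ids that an object points to."""
    kind, payload = store[oid]
    if kind == "commit":
        return [payload["tree"], *payload["parents"]]
    if kind == "tree":
        return list(payload.values())
    return []  # blobs are always leaves

def reachable(oid: str) -> set:
    """Collect every object id reachable from the given one."""
    seen, stack = set(), [oid]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(references(current))
    return seen

readme = put("blob", "hello\n")
tree1 = put("tree", {"README": readme})
c1 = put("commit", {"tree": tree1, "parents": [], "message": "first"})

makefile = put("blob", "all:\n")
tree2 = put("tree", {"README": readme, "Makefile": makefile})
c2 = put("commit", {"tree": tree2, "parents": [c1], "message": "second"})

for oid in sorted(reachable(c2)):
    print(oid[:8], store[oid][0])
```

The unchanged README is stored once and shared by both trees, and walking from the second commit reaches the whole history, which is one way to inspect a repository "at many levels".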
* Re: [doc] User Manual Suggestion 2009-04-29 6:34 ` Jeff King @ 2009-04-29 13:27 ` David Abrahams 2009-04-29 14:05 ` Jeff King 0 siblings, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-29 13:27 UTC (permalink / raw) To: Jeff King; +Cc: git On Apr 29, 2009, at 2:34 AM, Jeff King wrote: > On Fri, Apr 24, 2009 at 08:53:37PM -0400, David Abrahams wrote: > >>> Actually, it is not the generally of trees that I think is >>> interesting >>> there, but the generality of _objects_. That is, each of those >>> things is >>> a first-class object, and has a unique name by which it can be >>> referred. >> >> I'm sorry, but I think most people would find that so unremarkable >> that >> making a big deal about it would lead to "what am I missing here" >> confusion. Maybe a person who's exclusively used CVS (or older) >> technologies before coming to Git would be happy to know that, but >> it's >> sort of obvious. In CVS the lack of first-class directories sticks >> out >> like a sore thumb. > > Sadly, I was away from email all weekend and so missed the ensuing > storm > in this thread. :) However, I did want to respond to this one point. > > To me (and I am talking from personal experience, so it really may be > _just_ me), an important part of understanding git was understanding > the > object storage. That is, half of the idea of git is a big database of > content-addressable objects. Absolutely, it's important to know that everything is content- addressable (which essentially communicates the same important information as "the object's id is a hash of its contents"). I was trying to say that the fact that each one is a "first-class" object and has a unique name is not particularly remarkable. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
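The statement that "the object's id is a hash of its contents" can be made concrete for blobs. The sketch below assumes the SHA-1 object format Git used at the time of this thread; `git_blob_id` is a hypothetical helper name, not part of Git.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the object name Git assigns to a blob holding these bytes."""
    # Git hashes a short header, "blob <size>" plus a NUL byte, then the content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The empty blob has a well-known id.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Identical content always yields the identical id, whatever the file is called.
print(git_blob_id(b"hello\n") == git_blob_id(b"hello\n"))
```

For comparison, `git hash-object --stdin` computes the same value for content supplied on standard input when no conversion filters apply.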
* Re: [doc] User Manual Suggestion 2009-04-29 13:27 ` David Abrahams @ 2009-04-29 14:05 ` Jeff King 0 siblings, 0 replies; 90+ messages in thread From: Jeff King @ 2009-04-29 14:05 UTC (permalink / raw) To: David Abrahams; +Cc: git On Wed, Apr 29, 2009 at 09:27:11AM -0400, David Abrahams wrote: >> object storage. That is, half of the idea of git is a big database of >> content-addressable objects. > > Absolutely, it's important to know that everything is content-addressable > (which essentially communicates the same important information as "the > object's id is a hash of its contents"). I was trying to say that the > fact that each one is a "first-class" object and has a unique name is not > particularly remarkable. I see. I consider those concepts inextricably linked. But I suppose you could explain one without the other. Anyway, thanks for the perspective. -Peff ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-23 18:37 ` Michael Witten 2009-04-23 20:16 ` Jeff King @ 2009-04-24 2:29 ` J. Bruce Fields 2009-04-24 2:34 ` Michael Witten 2009-04-24 4:06 ` David Abrahams 1 sibling, 2 replies; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 2:29 UTC (permalink / raw) To: Michael Witten; +Cc: David Abrahams, git On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: > On Thu, Apr 23, 2009 at 12:57, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Wed, Apr 22, 2009 at 03:38:52PM -0400, David Abrahams wrote: > >> > >> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#how-to-check-out > >> covers "git reset" way too early, IMO, before one has the conceptual > >> foundation necessary to understand what it means to "modify the current > >> branch to point at v2.6.17". If this operation must be covered this > >> early in the manual, it should probably not be until > >> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#manipulating-branches > > > > I agree; we should suggest just a git-checkout (to a detached HEAD) > > instead, though that needs a little explanation so people aren't scared > > by the warning message it gives. > > Everyone talks about "before one has the conceptual foundation > necessary to understand". Well, here's an idea: The git documentation > should start with the concepts! > > Why don't the docs start out defining blobs and trees and the object > database and references into that database? The reason everything is > so confusing is that the understanding is brushed under the tutorial > rug. People need to learn how to think before they can effectively > learn to start doing. OK, but let's not over-generalize: the person that just wants to figure out whether the driver for their network card was fixed in today's network devel tree shouldn't have to sit through a discussion of the object database. 
And even among readers that are in it for the long haul, I think many people will react better to something that gives them at least a little concrete how-to information up front. So the goal was always to find a tutorial route through the material that would allow us to introduce the concepts as we go along. And I agree that I haven't succeeded at that--patches welcomed, including patches that, say, move more of the current chapter 7 to an earlier place. (But this has to be done carefully, and I'd still rather it not be the *very* first thing.) I've unfortunately had a lot less time to work on this, but am happy to at least help review patches. --b. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 2:29 ` J. Bruce Fields @ 2009-04-24 2:34 ` Michael Witten 2009-04-24 4:06 ` David Abrahams 1 sibling, 0 replies; 90+ messages in thread From: Michael Witten @ 2009-04-24 2:34 UTC (permalink / raw) To: J. Bruce Fields; +Cc: David Abrahams, git On Thu, Apr 23, 2009 at 21:29, J. Bruce Fields <bfields@fieldses.org> wrote: > OK, but let's not over-generalize: the person that just wants to figure > out whether the driver for their network card was fixed in today's > network devel tree shouldn't have to sit through a discussion of the > object database. And even among readers that are in it for the long > haul, I think many people will react better to something that gives them > at least a little concrete how-to information up front. A quick shell synopsis is probably what you want then. Beyond that, casual users should be ignored; quick instructions are usually provided by each project anyway. ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 2:29 ` J. Bruce Fields 2009-04-24 2:34 ` Michael Witten @ 2009-04-24 4:06 ` David Abrahams 2009-04-24 14:10 ` J. Bruce Fields 1 sibling, 1 reply; 90+ messages in thread From: David Abrahams @ 2009-04-24 4:06 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Michael Witten, git On Apr 23, 2009, at 10:29 PM, J. Bruce Fields wrote: > On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: >> On Thu, Apr 23, 2009 at 12:57, J. Bruce Fields >> <bfields@fieldses.org> wrote: >>> On Wed, Apr 22, 2009 at 03:38:52PM -0400, David Abrahams wrote: >>>> >>>> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#how-to-check-out >>>> covers "git reset" way too early, IMO, before one has the >>>> conceptual >>>> foundation necessary to understand what it means to "modify the >>>> current >>>> branch to point at v2.6.17". If this operation must be covered >>>> this >>>> early in the manual, it should probably not be until >>>> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#manipulating-branches >>> >>> I agree; we should suggest just a git-checkout (to a detached HEAD) >>> instead, though that needs a little explanation so people aren't >>> scared >>> by the warning message it gives. >> >> Everyone talks about "before one has the conceptual foundation >> necessary to understand". Well, here's an idea: The git documentation >> should start with the concepts! >> >> Why don't the docs start out defining blobs and trees and the object >> database and references into that database? The reason everything is >> so confusing is that the understanding is brushed under the tutorial >> rug. People need to learn how to think before they can effectively >> learn to start doing. > > OK, but let's not over-generalize: the person that just wants to > figure > out whether the driver for their network card was fixed in today's > network devel tree shouldn't have to sit through a discussion of the > object database. 
Those people don't need a VCS. They should download a snapshot or use a web interface. Seriously. There's no way you can make even the best-designed VCS simple enough to justify the time it takes to learn enough just to use it for that. > And even among readers that are in it for the long > haul, I think many people will react better to something that gives > them > at least a little concrete how-to information up front. People (well, people like me) should get a brief "hello, world" demo up front, to give them a feel for the flavor of the system, but [important:] it shouldn't attempt to be instructive. Fundamental concepts are next. How-to information can come after that, or after the reference information. > So the goal was always to find a tutorial route through the material > that would allow us to introduce the concepts as we go along. Maybe that will work for some people, but it *really* won't work for me. You can't start throwing around terms of art without defining them unless you want to raise more questions than you're answering. I would be surprised if it wasn't the same for many tech people. -- David Abrahams BoostPro Computing http://boostpro.com ^ permalink raw reply [flat|nested] 90+ messages in thread
* Re: [doc] User Manual Suggestion 2009-04-24 4:06 ` David Abrahams @ 2009-04-24 14:10 ` J. Bruce Fields 0 siblings, 0 replies; 90+ messages in thread From: J. Bruce Fields @ 2009-04-24 14:10 UTC (permalink / raw) To: David Abrahams; +Cc: Michael Witten, git On Fri, Apr 24, 2009 at 12:06:12AM -0400, David Abrahams wrote: > > On Apr 23, 2009, at 10:29 PM, J. Bruce Fields wrote: > >> On Thu, Apr 23, 2009 at 01:37:05PM -0500, Michael Witten wrote: >>> On Thu, Apr 23, 2009 at 12:57, J. Bruce Fields >>> <bfields@fieldses.org> wrote: >>> Why don't the docs start out defining blobs and trees and the object >>> database and references into that database? The reason everything is >>> so confusing is that the understanding is brushed under the tutorial >>> rug. People need to learn how to think before they can effectively >>> learn to start doing. >> >> OK, but let's not over-generalize: the person that just wants to >> figure >> out whether the driver for their network card was fixed in today's >> network devel tree shouldn't have to sit through a discussion of the >> object database. > > Those people don't need a VCS. They should download a snapshot or use a > web interface. Seriously. There's no way you can make even the > best-designed VCS simple enough to justify the time it takes to learn > enough just to use it for that. > >> And even among readers that are in it for the long >> haul, I think many people will react better to something that gives >> them >> at least a little concrete how-to information up front. > > People (well, people like me) should get a brief "hello, world" demo up > front, to give them a feel for the flavor of the system, but > [important:] it shouldn't attempt to be instructive. Fundamental > concepts are next. How-to information can come after that, or after the > reference information. > >> So the goal was always to find a tutorial route through the material >> that would allow us to introduce the concepts as we go along. 
> > Maybe that will work for some people, but it *really* won't work for me. > You can't start throwing around terms of art without defining them unless > you want to raise more questions than you're answering. I would be > surprised if it wasn't the same for many tech people. I agree that (with rare exceptions) terms shouldn't be used before they're defined. I don't agree with all of the above, but I think we could come to a satisfactory compromise. I'll see if I can find a few hours this weekend to at least sketch a new organization. But, as I've said, I'm short on time and could really use some help. --b. ^ permalink raw reply [flat|nested] 90+ messages in thread
Thread overview: 90+ messages
2009-04-22 19:38 [doc] User Manual Suggestion David Abrahams
2009-04-23 17:57 ` J. Bruce Fields
2009-04-23 18:37 ` Michael Witten
2009-04-23 20:16 ` Jeff King
2009-04-23 20:45 ` Michael Witten
2009-04-23 21:31 ` David Abrahams
2009-04-24 0:31 ` Michael Witten
2009-04-24 14:18 ` Jeff King
2009-04-24 14:20 ` J. Bruce Fields
2009-04-24 17:28 ` David Abrahams
2009-04-24 18:15 ` Jeff King
2009-04-24 19:00 ` David Abrahams
2009-04-24 20:24 ` Jeff King
2009-04-24 21:06 ` David Abrahams
2009-04-24 22:45 ` Björn Steinbrink
2009-04-25 0:39 ` David Abrahams
2009-04-26 23:35 ` Björn Steinbrink
2009-04-24 14:11 ` Jeff King
2009-04-24 14:30 ` Michael Witten
2009-04-24 14:33 ` Michael Witten
2009-04-24 15:04 ` Jeff King
2009-04-24 15:18 ` Michael Witten
2009-04-24 17:38 ` J. Bruce Fields
2009-04-24 18:27 ` Jeff King
2009-04-24 18:35 ` J. Bruce Fields
[not found] ` <34BD51FF-0908-48A8-BBBC-E27B0EFB32E5@boostpro.com>
2009-04-24 18:52 ` J. Bruce Fields
2009-04-25 10:35 ` Felipe Contreras
2009-04-24 19:12 ` Michael Witten
2009-04-23 21:26 ` David Abrahams
2009-04-23 22:51 ` Johan Herland
2009-04-24 0:30 ` Michael Witten
2009-04-24 20:30 ` Johan Herland
2009-04-24 21:34 ` Daniel Barkalow
2009-04-24 21:38 ` Jeff King
2009-04-24 22:18 ` Michael Witten
2009-04-24 22:25 ` Michael Witten
2009-04-24 23:11 ` Daniel Barkalow
2009-04-24 23:14 ` Jeff King
2009-04-24 23:18 ` Michael Witten
2009-04-24 23:31 ` Michael Witten
2009-04-24 23:35 ` Jeff King
2009-04-25 0:19 ` Michael Witten
2009-04-25 10:18 ` Felipe Contreras
2009-04-24 23:26 ` Michael Witten
2009-04-25 18:55 ` Daniel Barkalow
2009-04-25 19:16 ` Michael Witten
2009-04-25 19:24 ` Felipe Contreras
2009-04-25 19:36 ` David Abrahams
2009-04-25 20:53 ` Felipe Contreras
2009-04-26 11:28 ` Björn Steinbrink
2009-04-26 13:55 ` David Abrahams
2009-04-26 17:56 ` Björn Steinbrink
2009-04-26 20:17 ` David Abrahams
2009-04-26 22:25 ` Björn Steinbrink
2009-04-27 1:41 ` David Abrahams
2009-04-27 16:30 ` David Abrahams
2009-04-27 16:52 ` Michael Witten
2009-04-26 16:36 ` Michael Witten
2009-04-26 18:12 ` Björn Steinbrink
2009-04-26 20:20 ` David Abrahams
2009-04-25 0:41 ` David Abrahams
2009-04-24 23:16 ` Björn Steinbrink
2009-04-25 0:01 ` Michael Witten
2009-04-25 0:48 ` David Abrahams
2009-04-26 22:42 ` Björn Steinbrink
2009-05-02 15:53 ` Björn Steinbrink
2009-05-02 18:36 ` Michael Witten
2009-05-02 21:11 ` Björn Steinbrink
2009-05-02 23:13 ` Michael Witten
2009-05-02 23:32 ` Björn Steinbrink
2009-05-03 1:10 ` Michael Witten
2009-05-03 1:48 ` Björn Steinbrink
2009-05-03 1:18 ` Mark Lodato
2009-05-03 1:26 ` Michael Witten
2009-04-24 23:21 ` Daniel Barkalow
2009-04-24 23:25 ` Jeff King
2009-04-26 23:41 ` Björn Steinbrink
2009-04-24 23:29 ` Michael Witten
2009-04-27 0:00 ` Björn Steinbrink
2009-04-25 0:19 ` David Abrahams
2009-04-25 0:26 ` Michael Witten
2009-04-25 0:35 ` Jeff King
2009-04-25 0:53 ` David Abrahams
2009-04-29 6:34 ` Jeff King
2009-04-29 13:27 ` David Abrahams
2009-04-29 14:05 ` Jeff King
2009-04-24 2:29 ` J. Bruce Fields
2009-04-24 2:34 ` Michael Witten
2009-04-24 4:06 ` David Abrahams
2009-04-24 14:10 ` J. Bruce Fields