From: Dirk Gouders <dirk@gouders.net>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Dirk Gouders <dirk@gouders.net>
Subject: [PATCH v2 0/1] Documentation/user-manual.txt: try to clarify on object hashes
Date: Tue, 12 Mar 2024 11:41:55 +0100
Message-ID: <20240312104238.4920-1-dirk@gouders.net>
In-Reply-To: <cover.1709240261.git.dirk@gouders.net>
This is the second round of adding a hashing example to user-manual.txt.
---
Changes in v2:
- Do not go into detail about hashing in the history.
- Change code according to coding guidelines.
- Fix a typo (s/asume/assume/) and change the wording of that sentence.
- Write Git instead of `git`.
- To fit the whole document, change sample content to "Hello world", length 12.
- Add verification of hash using `git hash-object`.
- Provide for empty lines around code blocks.
---
Dirk Gouders (1):
Documentation/user-manual.txt: example for generating object hashes
Documentation/user-manual.txt | 36 +++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)
Range-diff against v1:
1: 6995f866e7 ! 1: 568c59d69f Documentation/user-manual.txt: example for generating object hashes
@@ Metadata
## Commit message ##
Documentation/user-manual.txt: example for generating object hashes
- If someone spends the time to work through the documentation, the
- subject "hashes" can lead to contradictions:
+ Add a simple example on how object hashes can be generated manually.
- The README of the initial commit states hashes are generated from
- compressed data (which changed very soon), whereas
- Documentation/user-manual.txt says they are generated from original
- data.
-
- Don't give doubts a chance: clarify this and present a simple example
- on how object hashes can be generated manually.
+ Further, because the document suggests to have a look at the initial
+ commit, clarify that some details changed since that time.
Signed-off-by: Dirk Gouders <dirk@gouders.net>
## Documentation/user-manual.txt ##
-@@ Documentation/user-manual.txt: that is used to name the object is the hash of the original data
+@@ Documentation/user-manual.txt: that not only specifies their type, but also provides size information
+ about the data in the object. It's worth noting that the SHA-1 hash
+ that is used to name the object is the hash of the original data
plus this header, so `sha1sum` 'file' does not match the object name
- for 'file'.
-
-+Starting with the initial commit, hashing was done on the compressed
-+data and the file README of that commit explicitely states this:
-+
-+"The SHA1 hash is always the hash of the _compressed_ object, not the
-+original one."
+-for 'file'.
++for 'file' (the earliest versions of Git hashed slightly differently
++but the conclusion is still the same).
+
-+This changed soon after that with commit
-+d98b46f8d9a3 (Do SHA1 hash _before_ compression.). Unfortunately, the
-+commit message doesn't provide the detailed reasoning.
++The following is a short example that demonstrates how these hashes
++can be generated manually:
+
-+The following is a short example that demonstrates how hashes can be
-+generated manually:
++Let's assume a small text file with some simple content:
+
-+Let's asume a small text file with the content "Hello git.\n"
+-------------------------------------------------
-+$ cat > hello.txt <<EOF
-+Hello git.
-+EOF
++$ echo "Hello world" >hello.txt
+-------------------------------------------------
+
-+We can now manually generate the hash `git` would use for this file:
++We can now manually generate the hash Git would use for this file:
+
+- The object we want the hash for is of type "blob" and its size is
-+ 11 bytes.
++ 12 bytes.
+
+- Prepend the object header to the file content and feed this to
-+ sha1sum(1):
++ `sha1sum`:
+
+-------------------------------------------------
-+$ printf "blob 11\0" | cat - hello.txt | sha1sum
-+7217614ba6e5f4e7db2edaa2cdf5fb5ee4358b57 .
++$ { printf "blob 12\0"; cat hello.txt; } | sha1sum
++802992c4220de19a90767f3000a79a31b98d0df7 -
+-------------------------------------------------
+
++This manually constructed hash can be verified using `git hash-object`
++which of course hides the addition of the header:
++
++-------------------------------------------------
++$ git hash-object hello.txt
++802992c4220de19a90767f3000a79a31b98d0df7
++-------------------------------------------------
+
As a result, the general consistency of an object can always be tested
independently of the contents or the type of the object: all objects can
- be validated by verifying that (a) their hashes match the content of the
+@@ Documentation/user-manual.txt: $ git switch --detach e83c5163
+ ----------------------------------------------------
+
+ The initial revision lays the foundation for almost everything Git has
+-today, but is small enough to read in one sitting.
++today (even though details may differ in a few places), but is small
++enough to read in one sitting.
+
+ Note that terminology has changed since that revision. For example, the
+ README in that revision uses the word "changeset" to describe what we
--
2.43.0
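
For readers who want to try the example on other files, the steps in the
patch can be wrapped into a small function that determines the size
itself. This is only a sketch under a few assumptions: `blob_hash` is a
hypothetical helper name and not part of Git, `sha1sum` from coreutils is
available, and the repository uses SHA-1 object names. Note that
`git hash-object` may apply configured content filters (for example
line-ending conversion) before hashing, in which case its result can
differ from a hash of the raw file.

```shell
# Hypothetical helper (not part of Git): print the SHA-1 blob object
# name for the file given as $1, following the steps in the patch.
blob_hash () {
	# Content size in bytes; tr removes padding that some wc
	# implementations add around the number.
	size=$(wc -c <"$1" | tr -d ' ')

	# Hash the header "blob <size>\0" followed by the unmodified
	# content, then keep only the hash column of the sha1sum output.
	{ printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

echo "Hello world" >hello.txt
blob_hash hello.txt
```

For the sample file this should print the same value as shown in the
patch, 802992c4220de19a90767f3000a79a31b98d0df7.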