* [RFE] Add JSON output to git log commands
@ 2025-08-17 20:17 Ron Ziroby Romero
2025-08-17 21:09 ` Kristoffer Haugsbakk
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Ron Ziroby Romero @ 2025-08-17 20:17 UTC (permalink / raw)
To: git
I would like to add JSON output to the git log command.
## Motivation
Machine parsing of git log output is prevalent, but git only provides
human-readable output. Having git output JSON directly solves problems
with the format option or third-party tools. Git has the information
in a machine-readable format. It should output in a machine-readable
format. JSON is ubiquitous and easy to generate, and therefore, it
makes sense to output JSON.
The author of one of the third-party tools says that JSON output is
the natural evolution of the Unix philosophy and should be done
natively for all tools[4].
## Current behaviour
Git log can output human-readable output in several ways. However,
outputting in JSON requires third-party tools or hacking pretty
output.
## Proposed enhancement
Add a –pretty=json flag to output logs in JSON format.
## Alternatives
### Why natively?
The `jc` command parses git log output and converts it to JSON[3].
However, it post-processes the text and has some difficulty with time
zones in dates. Also, its author considers it a stopgap until Unix
tools can be adapted to output JSON natively[4], which is what I'm
proposing.
In a TIL post[2], Simon Willison showcases a method using pretty
output with nulls piped to jq. This method uses the pretty command to
get delimited output. However, this script doesn't handle all the
output from git log.
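For reference, a delimiter-based workaround of this kind looks roughly
like the sketch below (Python here rather than jq). It assumes the log
was produced with a format string such as
`--format='%H%x1f%an%x1f%ae%x1f%aI%x1f%B%x1e'`, i.e. a unit-separator
byte between fields and a record-separator byte after each commit. The
function name `parse_delimited_log` and the choice of fields are
illustrative and not taken from the TIL post.

```python
import json

FIELD_SEP = "\x1f"   # emitted by %x1f in the format string
RECORD_SEP = "\x1e"  # emitted by %x1e after each commit
FIELDS = ("commit", "author_name", "author_email", "author_date", "message")


def parse_delimited_log(text: str) -> list[dict]:
    """Split delimiter-separated `git log` output into one dict per commit."""
    commits = []
    for record in text.split(RECORD_SEP):
        # git prints a newline after each record; drop it and skip the empty tail.
        record = record.lstrip("\n")
        if not record:
            continue
        values = record.split(FIELD_SEP)
        if len(values) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
        commits.append(dict(zip(FIELDS, values)))
    return commits


# Hand-written sample standing in for real `git log` output (abbreviated hashes).
sample = (
    "3857aae5\x1fSomebody J. Example\x1fsomebody@example.com\x1f"
    "2024-09-25T18:23:49-07:00\x1fDo a thing\n\x1e\n"
    "1522467d\x1fSomebody J. Example\x1fsomebody@example.com\x1f"
    "2024-09-25T18:24:52-07:00\x1fMerge something\n\x1e\n"
)
print(json.dumps({"commits": parse_delimited_log(sample)}, indent=2))
```

The weak points are visible in the sketch: it breaks if a commit
message happens to contain one of the separator bytes, and it takes
already-decoded text, so it assumes every field is valid UTF-8.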
Tools like git-log2json[1] parse git output post-hoc, rather than
producing structured output from git.
Providing git log output in JSON lets us go to the source, where the
log data already exists in machine-readable form, and emit it directly
without going through an intermediate format.
### Why JSON?
JSON is a sufficient and popular output format. It is sufficient in
that it can represent all the fields of git log in a way that allows
for special characters like quotes, newlines, and control characters.
It is also popular. Nearly every language has a library to parse JSON,
and the command-line utility jq can read and transform it.
## Use cases
The JSON can be used by tools or piped into jq to extract and
manipulate the data. Scripts can be written to work with the JSON
output.
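As a rough illustration of such a script, the Python sketch below reads
a document in the shape shown under "Example output" and lists the
subject line of each merge commit. The key names ("commits", "merge",
"message") follow that example and are part of the proposal, not an
existing git interface; `merge_subjects` is a made-up helper name.

```python
import json

# Abbreviated copy of the proposed output shape; not produced by any real git.
proposed_output = """
{
  "commits": [
    {"commit": "3857aae53f3633b7de63ad640737c657387ae0c6",
     "author": {"name": "Somebody J. Example", "timestamp": "1727313829"},
     "message": "Do a thing\\n"},
    {"commit": "1522467d13a8fe29eb32209f175722df41e224b6",
     "merge": ["f92c61aef0190641e01294dad3b891b28113e1d5",
               "7ffcbafbf32185da7dccb4b3f49b871f24ab58c4"],
     "author": {"name": "Somebody J. Example", "timestamp": "1727313892"},
     "message": "Merge something\\n\\n* This,\\n* That, and\\n* The other\\n"}
  ]
}
"""


def merge_subjects(doc: dict) -> list[str]:
    """Return the first message line of every commit that has a "merge" key."""
    return [
        commit["message"].splitlines()[0]
        for commit in doc["commits"]
        if "merge" in commit and commit["message"]
    ]


log = json.loads(proposed_output)
print(merge_subjects(log))
```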
## Design outline
* Add a `PRETTY_JSON` constant.
* Create a pretty-json.c file to output JSON log information
* Modify pretty.c to call pretty-json to output JSON when the flag is set.
* Use existing utility functions written in the existing source to
output the JSON.
## Example output
Here’s a sample with two commits:
```JSON
{
  "commits": [
    {
      "commit": "3857aae53f3633b7de63ad640737c657387ae0c6",
      "refs": [
        "HEAD",
        "refs/remotes/origin/main",
        "refs/remotes/origin/HEAD"
      ],
      "author": {
        "name": "Somebody J. Example",
        "email": "somebody@example.com",
        "date": "2024-09-25T18:23:49-07:00",
        "timestamp": "1727313829"
      },
      "committer": {
        "name": "Somebody Else",
        "email": "somebody.else@example.org",
        "date": "2024-09-25T18:24:52-07:00",
        "timestamp": "1727313892"
      },
      "message": "Do a thing\n"
    },
    {
      "commit": "1522467d13a8fe29eb32209f175722df41e224b6",
      "merge": [
        "f92c61aef0190641e01294dad3b891b28113e1d5",
        "7ffcbafbf32185da7dccb4b3f49b871f24ab58c4"
      ],
      "author": {
        "name": "Somebody J. Example",
        "email": "somebody@example.com",
        "date": "2024-09-25T18:24:52-07:00",
        "timestamp": "1727313892"
      },
      "committer": {
        "name": "Somebody Else",
        "email": "somebody.else@example.org",
        "date": "2024-09-25T18:24:52-07:00",
        "timestamp": "1727313892"
      },
      "message": "Merge something\n\n* This,\n* That, and\n* The other\n"
    }
  ]
}
```
## References
> [1] Context-Driven Testing Toolkit, git-log2json: Convert git log to JSON, GitHub repository, https://github.com/context-driven-testing-toolkit/git-log2json
> [2] Simon Willison, “Convert git log output to JSON using jq,” til.simonwillison.net, March 22 2023. https://til.simonwillison.net/jq/git-log-json
> [3] Kelly Brazil, jc.parsers.git_log: JSON parser for git log, jc documentation, version 1.5. Retrieved via GitHub Pages, https://kellyjonbrazil.github.io/jc/docs/parsers/git_log.html
> [4] Kelly Brazil, Bringing the Unix Philosophy to the 21st Century, Brazil’s Blog, November 26 2019. https://blog.kellybrazil.com/2019/11/26/bringing-the-unix-philosophy-to-the-21st-century/
* Re: [RFE] Add JSON output to git log commands
From: Kristoffer Haugsbakk @ 2025-08-17 21:09 UTC (permalink / raw)
To: Ron Ziroby Romero, git
On Sun, Aug 17, 2025, at 22:17, Ron Ziroby Romero wrote:
> I would like to add JSON output to the git log command.
Previously: https://lore.kernel.org/git/CAGW8g7=21pPAgCixjpayEvmw_ns-hcB4e59NP476TKtCRXHPXQ@mail.gmail.com/
> ...
> ## Design outline
>
> * Add a `PRETTY_JSON` constant.
> * Create a pretty-json.c file to output JSON log information
> * Modify pretty.c to call pretty-json to output JSON when the flag is set.
> * Use existing utility functions written in the existing source to
> output the JSON.
I’m guessing that the existing `json-writer.h` is relevant.
--
Kristoffer Haugsbakk
* Re: [RFE] Add JSON output to git log commands
From: D. Ben Knoble @ 2025-08-17 21:28 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Ron Ziroby Romero, git
On Sun, Aug 17, 2025 at 5:09 PM Kristoffer Haugsbakk
<kristofferhaugsbakk@fastmail.com> wrote:
>
> On Sun, Aug 17, 2025, at 22:17, Ron Ziroby Romero wrote:
> > I would like to add JSON output to the git log command.
>
> Previously: https://lore.kernel.org/git/CAGW8g7=21pPAgCixjpayEvmw_ns-hcB4e59NP476TKtCRXHPXQ@mail.gmail.com/
In particular, I would guess that deciding how to handle "raw bytes"
(e.g., paths that are not necessarily UTF-8 strings) is important.
JSON uses Unicode characters:
    A string is a sequence of zero or more Unicode characters,
    wrapped in double quotes, using backslash escapes. A character is
    represented as a single character string. A string is very much like a
    C or Java string.
    https://www.json.org/json-en.html
(even though the railroad diagram says "codepoints"; I find the whole
thing a bit muddy. What exactly is representable in JSON strings? The
same page says:
    Excepting a few encoding details, that completely describes the language.
Which details?)
--
D. Ben Knoble
* Re: [RFE] Add JSON output to git log commands
From: brian m. carlson @ 2025-08-17 22:11 UTC (permalink / raw)
To: Ron Ziroby Romero; +Cc: git
On 2025-08-17 at 20:17:46, Ron Ziroby Romero wrote:
> I would like to add JSON output to the git log command.
>
> ## Motivation
>
> Machine parsing of git log output is prevalent, but git only provides
> human-readable output. Having git output JSON directly solves problems
> with the format option or third-party tools. Git has the information
> in a machine-readable format. It should output in a machine-readable
> format. JSON is ubiquitous and easy to generate, and therefore, it
> makes sense to output JSON.
Git provides plenty of machine-readable formats, to be clear. They're
not typically structured in a standard way like JSON or CBOR, but many
forges and other tools do successfully parse Git output with a variety
of tools.
> The author of one of the third-party tools says that JSON output is
> the natural evolution of the Unix philosophy and should be done
> natively for all tools[4].
>
> ## Current behaviour
>
> Git log can output human-readable output in several ways. However,
> outputting in JSON requires third-party tools or hacking pretty
> output.
>
> ## Proposed enhancement
>
> Add a –pretty=json flag to output logs in JSON format.
I'd like to hear how you plan to deal with non-UTF-8 byte strings since
JSON must always be valid Unicode. Most data in Git is only by
convention UTF-8 and can actually be in other encodings or no encoding
at all: refs, commit messages[0], and author and committer idents.
What would be a good idea is to add a byte string entry to the JSON
writer and use it for these formats. If the data is not valid UTF-8, or
if it contains a % sign, then you URL-encode it. Other encodings are
possible as well, but not JSON escapes[1].
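To make that concrete, here is a minimal Python sketch of the rule
(pass clean UTF-8 through, percent-encode everything else). The names
`encode_field` and `decode_field` are made up for illustration; a real
version would live in the C JSON writer, and how a reader is told that
a field uses this scheme is left open here.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def encode_field(raw: bytes) -> str:
    """Return raw as text when it is clean UTF-8, else percent-encode it."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: percent-encode the bytes.
        return quote_from_bytes(raw)
    if "%" in text:
        # A literal % must be escaped too, so that decoding is unambiguous.
        return quote_from_bytes(raw)
    return text


def decode_field(value: str) -> bytes:
    """Inverse of encode_field; values passed through as-is contain no %."""
    return unquote_to_bytes(value)


for raw in (b"caf\xc3\xa9", b"100%", b"J\xfcrgen"):
    encoded = encode_field(raw)
    print(raw, "->", encoded, "->", decode_field(encoded))
```

Because an unencoded value never contains a % sign, the reader can
always run the percent-decoder and get the original bytes back.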
Another good option would be to use CBOR instead, since it provides
native byte strings.
Bad options would be to use U+FFFD, since that makes the output useless
when you hit one of these cases (and I can tell you from $DAYJOB that
they're not that uncommon) and to just shovel bytes into the output and
let the reader be sad (which will definitely make the output useless as
well as result in angry bug reports to the list).
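A small Python illustration of why U+FFFD substitution loses
information (the two names are made-up sample data): distinct
non-UTF-8 inputs collapse to the same output, and the original bytes
cannot be recovered from it.

```python
# Two different Latin-1 encoded names; neither is valid UTF-8.
name_a = "J\u00fcrgen".encode("latin-1")  # b"J\xfcrgen"
name_b = "J\u00f6rgen".encode("latin-1")  # b"J\xf6rgen"

# Decoding with replacement turns each invalid byte into U+FFFD.
shown_a = name_a.decode("utf-8", errors="replace")
shown_b = name_b.decode("utf-8", errors="replace")

print(shown_a == shown_b)                  # the two names are now identical
print(shown_a.encode("utf-8") == name_a)   # and the original bytes are gone
```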
As a note, I think you want `--pretty`, not `-pretty` (we use two dashes
for long options).
[0] Yes, they declare an encoding, but it isn't always correct and the
encoding someone used is not always available on every system. I saw
someone in the Linux kernel history write "latin1", which is not a valid
encoding according to Ruby, which I was using to parse it.
[1] `\u00ff` represents U+00FF, which is equivalent to the byte sequence
0xc3 0xbf, not 0xff.
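The same point as a quick Python check, using only the standard `json`
module:

```python
import json

# The JSON text "\u00ff" decodes to a single code point, U+00FF.
decoded = json.loads('"\\u00ff"')
assert decoded == "\u00ff"

# Serialised as UTF-8 that code point is two bytes, not the raw byte 0xff.
utf8 = decoded.encode("utf-8")
print(utf8)  # b'\xc3\xbf'
```

So a reader that sees `\u00ff` in a JSON string has no way to tell it
was meant to stand for the single byte 0xff.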
--
brian m. carlson (they/them)
Toronto, Ontario, CA
* Re: [RFE] Add JSON output to git log commands
From: Junio C Hamano @ 2025-08-17 22:54 UTC (permalink / raw)
To: Ron Ziroby Romero; +Cc: git
Ron Ziroby Romero <ziroby@gmail.com> writes:
> ## Design outline
>
> * Add a `PRETTY_JSON` constant.
> * Create a pretty-json.c file to output JSON log information
> * Modify pretty.c to call pretty-json to output JSON when the flag is set.
> * Use existing utility functions written in the existing source to
> output the JSON.
Is this limited to giving another serialization format to what is in
`git cat-file commit` output for a sequence of commits, which is what
I see in the example below?
Within that limited scope, I am curious what your plan is to deal
with header elements like "encoding", "gpgsig", "mergetag", etc.
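For readers less familiar with the raw object format, the Python
sketch below shows what a serializer has to cope with: header lines
can repeat ("parent"), and multi-line values such as "gpgsig" continue
on lines that start with a single space. The sample object is
hand-written for illustration (the signature body is elided), and
`parse_commit` is a hypothetical helper, not part of git.

```python
def parse_commit(raw: bytes) -> tuple[list[tuple[bytes, bytes]], bytes]:
    """Split a raw commit object into (list of header pairs, message)."""
    # Headers end at the first empty line; the rest is the commit message.
    header_block, _, message = raw.partition(b"\n\n")
    headers: list[tuple[bytes, bytes]] = []
    for line in header_block.split(b"\n"):
        if line.startswith(b" ") and headers:
            # Continuation line: append to the previous header's value.
            key, value = headers[-1]
            headers[-1] = (key, value + b"\n" + line[1:])
        else:
            key, _, value = line.partition(b" ")
            headers.append((key, value))
    return headers, message


sample = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"parent f92c61aef0190641e01294dad3b891b28113e1d5\n"
    b"parent 7ffcbafbf32185da7dccb4b3f49b871f24ab58c4\n"
    b"author Somebody J. Example <somebody@example.com> 1727313892 -0700\n"
    b"committer Somebody Else <somebody.else@example.org> 1727313892 -0700\n"
    b"encoding ISO-8859-1\n"
    b"gpgsig -----BEGIN PGP SIGNATURE-----\n"
    b" \n"
    b" (signature data elided)\n"
    b" -----END PGP SIGNATURE-----\n"
    b"\n"
    b"Merge something\n"
)

headers, message = parse_commit(sample)
for key, value in headers:
    print(key.decode(), "=>", value.splitlines()[0].decode())
```

Headers are kept as an ordered list of pairs rather than a dict
because keys can repeat, and values stay as bytes because the
"encoding" header means the message is not necessarily UTF-8.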
And outside that scope, I am not sure what the most useful output
would be for things beyond what is in each commit object, e.g. the
various "diff" outputs such as --stat, -p, --name-status...
Leaving that outside the scope would be a very clean way out to
avoid confusing design issues ;-)