All of lore.kernel.org
 help / color / mirror / Atom feed
From: phillip.wood123@gmail.com
To: Tachera W sasi <tacherasasi@gmail.com>, phillip.wood@dunelm.org.uk
Cc: Tach via GitGitGadget <gitgitgadget@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] status: add --json output format to git status
Date: Wed, 30 Jul 2025 11:08:43 +0100	[thread overview]
Message-ID: <2da722a3-2ed0-4f2b-a623-df0dc4aaad77@gmail.com> (raw)
In-Reply-To: <CA+m0rScismG8VVf8mYS=+_AjwgrgX8GNZrFD+8-0EaeCDYPgbg@mail.gmail.com>

Hi Tachera

On 30/07/2025 10:15, Tachera W sasi wrote:
> 
> For v2, I will:
>    * Rework the patch to use `json-writer.h` so escaping and JSON 
> compliance are handled
>      properly.

That will handle the escaping problems but we still have the problem 
that filenames and refnames are not guaranteed to be utf-8 encoded. I 
think the only way to handle that is to base64 encode them so that the 
caller can retrieve the raw bytes. See 
https://lore.kernel.org/git/ZvXMSKaUWWA-MG9J@tapette.crustytoothpaste.net/ 
for a previous discussion about this.

>    * Add tests covering filenames with quotes, newlines, and non-UTF-8 
> cases.
>    * Include tests under `t/` to prevent future regressions.

That's great, you should update Documentation/git-status.adoc as well to 
document the new option and output format

Thanks

Phillip

> Thanks again for your review and for pointing me towards the proper helpers.
> 
> Best regards,
> Tachera Sasi
> 
> On Wed, Jul 30, 2025 at 12:01 PM Phillip Wood <phillip.wood123@gmail.com 
> <mailto:phillip.wood123@gmail.com>> wrote:
> 
>     On 30/07/2025 07:27, Tach via GitGitGadget wrote:
>      > From: tacherasasi <tacherasasi@gmail.com
>     <mailto:tacherasasi@gmail.com>>
>      >
>      > Add a new --json flag to 'git status' that outputs repository state
>      > in a structured JSON format. This enables reliable machine parsing
>      > of status information for tools and automation.
>      >
>      > The JSON output includes:
>      > - Branch information (name, detached state, ahead/behind counts)
>      > - Staged files array
>      > - Unstaged files array
>      > - Untracked files array
>      > - Ignored files array (with --ignored flag)
>      >
>      > Implementation details:
>      > - Add STATUS_FORMAT_JSON to wt_status_format enum
>      > - Add JSON output option to git status and git commit
>      > - Implement JSON formatting helpers for arrays and branch info
>      > - Structure output for easy parsing and future extensibility
>      >
>      > Example:
>      >    $ git status --json
>      >    {
>      >      "branch": {
>      >        "current": "main",
>      >        "detached": false
>      >      },
>      >      "staged": ["file1.txt"],
>      >      "unstaged": ["file2.txt"],
>      >      "untracked": ["file3.txt"]
>      >    }
>      >
>      > This provides a robust alternative to parsing traditional output
>      > formats, making it easier to build reliable tools and automation
>      > around Git status information.
> 
>     What problem are you having paring the current output? You say this
>     patch provides a robust alternative but I'm afraid it does not seem to
>     handle filenames containing newlines or double quotes and it assumes
>     they are utf-8 encoded. It also seems to print a trailing comma if an
>     array has more than one item which is not allowed by the json spec. We
>     already have support for generating json output as documented
>     json-writer.h which will help with most of that but not the filename
>     encoding problem. New features should also be accompanied by tests to
>     prevent future regressions. I'm not opposed to adding json output if it
>     is robust and handles non utf-8 filenames but it would be helpful to
>     understand what problems you are having parsing the output of "git
>     status --porcelain[=v2]".
> 
>     Thanks
> 
>     Phillip
> 
>      > Signed-off-by: Tachera Sasi <tachera@ekilie.com
>     <mailto:tachera@ekilie.com>>
>      > Signed-off-by: tacherasasi <tacherasasi@gmail.com
>     <mailto:tacherasasi@gmail.com>>
>      > ---
>      >      status: add --json output format to git status
>      >
>      > Published-As: https://github.com/gitgitgadget/git/releases/tag/
>     pr-1937%2FtacheraSasi%2Ffeature%2Fjson-status-output-v1 <https://
>     github.com/gitgitgadget/git/releases/tag/
>     pr-1937%2FtacheraSasi%2Ffeature%2Fjson-status-output-v1>
>      > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git
>     <https://github.com/gitgitgadget/git> pr-1937/tacheraSasi/feature/
>     json-status-output-v1
>      > Pull-Request: https://github.com/gitgitgadget/git/pull/1937
>     <https://github.com/gitgitgadget/git/pull/1937>
>      >
>      >   builtin/commit.c |  9 ++++-
>      >   wt-status.c      | 98 +++++++++++++++++++++++++++++++++++++++++
>     +++++++
>      >   wt-status.h      |  1 +
>      >   3 files changed, 107 insertions(+), 1 deletion(-)
>      >
>      > diff --git a/builtin/commit.c b/builtin/commit.c
>      > index fba0dded64a..f1db4fdfd9a 100644
>      > --- a/builtin/commit.c
>      > +++ b/builtin/commit.c
>      > @@ -1540,6 +1540,9 @@ struct repository *repo UNUSED)
>      >               OPT_SET_INT(0, "long", &status_format,
>      >                           N_("show status in long format (default)"),
>      >                           STATUS_FORMAT_LONG),
>      > +             OPT_SET_INT(0, "json", &status_format,
>      > +                 N_("show status in JSON format"),
>      > +                 STATUS_FORMAT_JSON),
>      >               OPT_BOOL('z', "null", &s.null_termination,
>      >                        N_("terminate entries with NUL")),
>      >               {
>      > @@ -1603,7 +1606,8 @@ struct repository *repo UNUSED)
>      >                      prefix, argv);
>      >
>      >       if (status_format != STATUS_FORMAT_PORCELAIN &&
>      > -         status_format != STATUS_FORMAT_PORCELAIN_V2)
>      > +         status_format != STATUS_FORMAT_PORCELAIN_V2 &&
>      > +         status_format != STATUS_FORMAT_JSON)
>      >               progress_flag = REFRESH_PROGRESS;
>      >       repo_read_index(the_repository);
>      >       refresh_index(the_repository->index,
>      > @@ -1735,6 +1739,9 @@ int cmd_commit(int argc,
>      >               OPT_SET_INT(0, "long", &status_format,
>      >                           N_("show status in long format (default)"),
>      >                           STATUS_FORMAT_LONG),
>      > +             OPT_SET_INT(0, "json", &status_format,
>      > +                 N_("show status in JSON format"),
>      > +                 STATUS_FORMAT_JSON),
>      >               OPT_BOOL('z', "null", &s.null_termination,
>      >                        N_("terminate entries with NUL")),
>      >               OPT_BOOL(0, "amend", &amend, N_("amend previous
>     commit")),
>      > diff --git a/wt-status.c b/wt-status.c
>      > index 454601afa15..7192fb4d057 100644
>      > --- a/wt-status.c
>      > +++ b/wt-status.c
>      > @@ -2564,6 +2564,101 @@ static void wt_porcelain_v2_print(struct
>     wt_status *s)
>      >       }
>      >   }
>      >
>      > +
>      > +static void wt_json_print_string_array(struct wt_status *s,
>     const char *name, struct string_list *list)
>      > +{
>      > +     int i;
>      > +     fprintf(s->fp, "  \"%s\": [", name);
>      > +     for (i = 0; i < list->nr; i++) {
>      > +             if (i > 0)
>      > +                     fprintf(s->fp, ", ");
>      > +             fprintf(s->fp, "\"%s\"", list->items[i].string);
>      > +     }
>      > +     fprintf(s->fp, "]");
>      > +}
>      > +
>      > +static void wt_json_print_change_array(struct wt_status *s,
>     const char *name, int change_type)
>      > +{
>      > +     int i;
>      > +     struct string_list files = STRING_LIST_INIT_DUP;
>      > +
>      > +     for (i = 0; i < s->change.nr <http://change.nr>; i++) {
>      > +             struct wt_status_change_data *d;
>      > +             struct string_list_item *it;
>      > +             it = &(s->change.items[i]);
>      > +             d = it->util;
>      > +
>      > +             if ((change_type == WT_STATUS_UPDATED && d-
>      >index_status &&
>      > +                  d->index_status != DIFF_STATUS_UNMERGED) ||
>      > +                 (change_type == WT_STATUS_CHANGED && d-
>      >worktree_status &&
>      > +                  d->worktree_status != DIFF_STATUS_UNMERGED)) {
>      > +                     string_list_append(&files, it->string);
>      > +             }
>      > +     }
>      > +
>      > +     wt_json_print_string_array(s, name, &files);
>      > +     string_list_clear(&files, 0);
>      > +}
>      > +
>      > +static void wt_json_print_branch_info(struct wt_status *s)
>      > +{
>      > +     struct branch *branch;
>      > +     const char *branch_name;
>      > +     int ahead = 0, behind = 0;
>      > +
>      > +     fprintf(s->fp, "  \"branch\": {\n");
>      > +
>      > +     if (s->branch && !s->is_initial) {
>      > +             if (!strcmp(s->branch, "HEAD")) {
>      > +                     fprintf(s->fp, "    \"current\": \"HEAD\",\n");
>      > +                     fprintf(s->fp, "    \"detached\": true");
>      > +             } else {
>      > +                     if (skip_prefix(s->branch, "refs/heads/",
>     &branch_name)) {
>      > +                             fprintf(s->fp, "    \"current\":
>     \"%s\",\n", branch_name);
>      > +                             fprintf(s->fp, "    \"detached\":
>     false");
>      > +
>      > +                             branch = branch_get(branch_name);
>      > +                             if (branch && branch->merge &&
>     branch->merge[0] && branch->merge[0]->dst) {
>      > +                                     if (!
>     stat_tracking_info(branch, &ahead, &behind, NULL, 0, 0)) {
>      > +                                             fprintf(s->fp, ",
>     \n    \"ahead\": %d,\n    \"behind\": %d", ahead, behind);
>      > +                                     }
>      > +                             }
>      > +                     } else {
>      > +                             fprintf(s->fp, "    \"current\":
>     \"%s\",\n", s->branch);
>      > +                             fprintf(s->fp, "    \"detached\":
>     false");
>      > +                     }
>      > +             }
>      > +     } else {
>      > +             fprintf(s->fp, "    \"current\": null,\n");
>      > +             fprintf(s->fp, "    \"detached\": false");
>      > +     }
>      > +
>      > +     fprintf(s->fp, "\n  }");
>      > +}
>      > +
>      > +static void wt_json_status_print(struct wt_status *s)
>      > +{
>      > +     fprintf(s->fp, "{\n");
>      > +
>      > +     wt_json_print_branch_info(s);
>      > +     fprintf(s->fp, ",\n");
>      > +
>      > +     wt_json_print_change_array(s, "staged", WT_STATUS_UPDATED);
>      > +     fprintf(s->fp, ",\n");
>      > +
>      > +     wt_json_print_change_array(s, "unstaged", WT_STATUS_CHANGED);
>      > +     fprintf(s->fp, ",\n");
>      > +
>      > +     wt_json_print_string_array(s, "untracked", &s->untracked);
>      > +
>      > +     if (s->ignored.nr <http://ignored.nr> > 0) {
>      > +             fprintf(s->fp, ",\n");
>      > +             wt_json_print_string_array(s, "ignored", &s->ignored);
>      > +     }
>      > +
>      > +     fprintf(s->fp, "\n}\n");
>      > +}
>      > +
>      >   void wt_status_print(struct wt_status *s)
>      >   {
>      >       trace2_data_intmax("status", s->repo, "count/changed", s-
>      >change.nr <http://change.nr>);
>      > @@ -2583,6 +2678,9 @@ void wt_status_print(struct wt_status *s)
>      >       case STATUS_FORMAT_PORCELAIN_V2:
>      >               wt_porcelain_v2_print(s);
>      >               break;
>      > +     case STATUS_FORMAT_JSON:
>      > +             wt_json_status_print(s);
>      > +             break;
>      >       case STATUS_FORMAT_UNSPECIFIED:
>      >               BUG("finalize_deferred_config() should have been
>     called");
>      >               break;
>      > diff --git a/wt-status.h b/wt-status.h
>      > index 4e377ce62b8..e929af832b2 100644
>      > --- a/wt-status.h
>      > +++ b/wt-status.h
>      > @@ -74,6 +74,7 @@ enum wt_status_format {
>      >       STATUS_FORMAT_SHORT,
>      >       STATUS_FORMAT_PORCELAIN,
>      >       STATUS_FORMAT_PORCELAIN_V2,
>      > +     STATUS_FORMAT_JSON,
>      >
>      >       STATUS_FORMAT_UNSPECIFIED
>      >   };
>      >
>      > base-commit: cb3b40381e1d5ee32dde96521ad7cfd68eb308a6
> 


  parent reply	other threads:[~2025-07-30 10:08 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-30  6:27 [PATCH] status: add --json output format to git status Tach via GitGitGadget
2025-07-30  9:01 ` Phillip Wood
     [not found]   ` <CA+m0rScismG8VVf8mYS=+_AjwgrgX8GNZrFD+8-0EaeCDYPgbg@mail.gmail.com>
2025-07-30 10:08     ` phillip.wood123 [this message]
2025-07-30 14:58 ` Junio C Hamano
2025-08-04 13:49   ` Phillip Wood
2025-08-04 13:54     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2da722a3-2ed0-4f2b-a623-df0dc4aaad77@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=tacherasasi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.