git.vger.kernel.org archive mirror
* [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
@ 2025-03-31 14:51 JAYATHEERTH K
  2025-03-31 14:59 ` JAYATHEERTH K
  2025-04-03 10:23 ` Patrick Steinhardt
  0 siblings, 2 replies; 12+ messages in thread
From: JAYATHEERTH K @ 2025-03-31 14:51 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, karthik nayak, Ghanshyam Thakkar,
	JAYATHEERTH K

# Proposal for GSOC 2025 to Git
**Machine-Readable Repository Information Query Tool**

## Contact Details
* **Name**: K Jayatheerth
* **Email**: jayatheerthkulkarni2005@gmail.com
* **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
* **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)

## **Synopsis**
This project aims to develop a dedicated Git command that interfaces
with Git’s internal APIs to produce structured JSON output,
particularly for repository metadata. By offering a clean,
machine-readable format, this tool will improve automation, scripting,
and integration with other developer tools.

## **Benefits to the Community**
### **1. Simplifies Automation and Scripting**
- Many Git commands output **human-readable text**, making automation
**error-prone** and **dependent on fragile parsing**.
- This project introduces **structured JSON output**, allowing scripts
and tools to consume repository metadata **directly and reliably**.
- No more **awkward text parsing**, `grep` hacks, or brittle `awk/sed`
pipelines—just **clean, structured data**.

### **2. Eliminates the Overuse of `git rev-parse`**
- `git rev-parse` is widely misused for extracting metadata, despite
being intended primarily for **parsing revisions**.
- Developers often **repurpose** it because there’s **no dedicated
alternative** for metadata queries.
- This project **corrects that gap** by introducing a **purpose-built
command** that is **cleaner, more intuitive, and extensible**.

### **3. Optimizes CI/CD Pipelines**
- CI/CD systems currently need **multiple Git commands** and
associated parsing logic to fetch basic metadata:

```bash
# Example: Gathering just a few common pieces of info
BRANCH=$(git symbolic-ref --short -q HEAD 2>/dev/null || echo "DETACHED")
COMMIT=$(git rev-parse HEAD)
REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "no-origin")
# ... often requiring more commands and error handling logic.
```
- The proposed command aims to **replace these multiple calls** with a
**single, efficient query** returning comprehensive, structured JSON
data.
- This **simplifies pipeline scripts**, reduces process overhead, and
makes CI/CD configurations **cleaner and more robust**.
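
To make the comparison concrete, here is a minimal shell sketch of the
glue a pipeline needs today to turn those three values into one JSON
object. The function name `emit_repo_json` and the key names are
hypothetical and chosen only for illustration; they are not the proposed
schema. The sketch also assumes the values contain no characters that
need JSON escaping.

```bash
#!/bin/sh
# Hypothetical helper: print three already-collected values as one JSON object.
# Assumes the values contain no quotes, backslashes, or control characters.
emit_repo_json () {
	printf '{"branch":"%s","commit":"%s","remote_url":"%s"}\n' "$1" "$2" "$3"
}

# Possible usage inside a repository (requires git):
# emit_repo_json \
#	"$(git symbolic-ref --short -q HEAD 2>/dev/null || echo DETACHED)" \
#	"$(git rev-parse HEAD)" \
#	"$(git remote get-url origin 2>/dev/null || echo no-origin)"
```

Even this small wrapper spawns three Git processes and leaves escaping
to the caller, which is the overhead a single query command would remove.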

## Deliverables

This project will introduce a new Git command, tentatively named `git
metadata`, to provide reliable, machine-readable repository
information.

The key deliverables for this GSoC project include:

1. **Core `git metadata` Command:**
* A new `builtin/metadata.c` command integrated into the Git source code.
* Implementation primarily in C, utilizing existing internal Git APIs
for retrieving repository information efficiently and accurately.

2. **Default JSON Output:**
* The command will output a structured JSON object by default.
* **Initial Core Fields:**
* `repository`: Path to `.git` directory, worktree root, `is_bare` status.
* `head`: Current commit SHA (full), current reference
(`refs/heads/main`, `refs/tags/v1.0`, or detached HEAD commit), short
symbolic name (`main`, `v1.0`, or `DETACHED`).
* `remotes`: A map of remote names to their fetch and push URLs.
* *(Stretch Goal):* Basic `is_dirty` flag based on a quick index/HEAD
check (not full worktree scan).

3. **Basic Output Control:**
* *(If time permits / Stretch Goal)* Implement simple flags to control
output, e.g.:
* `--remotes-only`: Output only the `remotes` section of the JSON.
* `--head-only`: Output only the `head` section.
* `--json-errors`: Ensure that errors encountered during execution
(e.g., not in a Git repository) are reported in a structured JSON
format.

4. **Extensible Design:**
* The internal structure and JSON schema will be designed with future
extensions in mind (e.g., adding submodule info, specific config
values, tags later).

5. **Comprehensive Documentation:**
* A clear man page (`git-metadata.txt`) explaining the command's
purpose, usage, options, and JSON output format.
* Comments within the code explaining implementation details.

6. **Robust Test Suite:**
* A new test script (`t/tXXXX-metadata.sh`) using Git's test framework.
* Tests covering various repository states: standard repo, bare repo,
detached HEAD, unborn branch, repo with no remotes, etc.
* Tests validating the JSON output structure and content.

**Out of Scope for GSoC (Potential Future Work):**
* Complex status reporting (full `git status` equivalent, detailed
submodule status).
* Real-time monitoring (`--watch`).
* Comparing metadata between revisions (`--diff`).
* Alternative output formats (`--format=shell`).
* Querying arbitrary configuration values or extensive commit details
beyond HEAD.

## Technical Details

This section outlines the proposed technical approach for implementing
the core deliverables:

1. **Core `git metadata` Command & Default JSON Output:**
* **Entry Point:** Implement the command logic within a new
`builtin/metadata.c` file, defining the `cmd_metadata(...)` function
as the entry point, following Git's builtin command structure.
* **Repository Access:** The `cmd_metadata` function will operate on
the `struct repository*` provided by the command invocation
infrastructure.
* **Repository Info:**
* Retrieve the path to the `.git` directory using `repo->gitdir` (or
`get_git_dir()` if needed).
* Determine if the repository is bare using `repo->is_bare`.
* **HEAD Info:**
* Resolve the `HEAD` reference using
`refs_resolve_ref_unsafe(get_main_ref_store(repo), "HEAD",
RESOLVE_REF_READING, &head_oid, &head_flags)`. The call fills in the
full commit OID (`head_oid`) and returns the full reference name
(`head_ref_name`, e.g., `"refs/heads/main"`).
* Determine the conventional short symbolic name (e.g., `"main"`,
`"v1.0"`, or `"(HEAD detached at <sha>)"`) by investigating and
utilizing existing Git functions like `refs_shorten_unambiguous_ref()`
or similar logic found in commands like `git status` or `git branch`.
Using low-level string functions like `strchr` will be avoided for
robustness.
* **Remotes Info:**
* Utilize functions from `remote.h`/`remote.c` (e.g., `remote_get`,
iterate through configured remotes) to get the list of remote names.
* For each remote, query its fetch and push URLs using Git's
configuration API (e.g., `git_config_get_string` for keys like
`remote.<name>.url` and `remote.<name>.pushurl`). Handle cases where
push URL is not explicitly set.
* **JSON Generation:**
* *(Primary Strategy):* Investigate integrating a minimal,
dependency-free, GPLv2-compatible C JSON library (e.g., cJSON, subject
to community approval) for robust JSON construction and escaping.
* *(Fallback Strategy):* If a library is not feasible, manually
construct the JSON string using Git's `strbuf` API (`strbuf_addf`,
`strbuf_addch`, `strbuf_add_json_string`, etc.), paying careful
attention to correct JSON syntax and proper escaping of string values.

2. **Documentation:**
* Create `Documentation/git-metadata.txt` following the structure and
style of existing Git man pages (e.g., `git-rev-parse.txt`,
`git-branch.txt`).
* Clearly document the command's purpose, all options (including
stretch goals if implemented), and provide a detailed description of
the default JSON output schema with examples.

3. **Testing:**
* Create a new test script `t/tXXXX-metadata.sh` using Git's
shell-based test framework (`test-lib.sh`).
* Include test cases covering:
* Standard repositories.
* Bare repositories.
* Repositories with detached HEAD state.
* Repositories on an unborn branch.
* Repositories with no remotes, one remote, multiple remotes.
* Remotes with different fetch/push URL configurations.
* Validation of the JSON output structure and specific field values
using tools like `jq` or simple `grep` checks within the tests.
* Testing of error conditions and the `--json-errors` flag output (if
implemented).
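
The escaping concern from the fallback strategy and the `grep`-style
validation mentioned above can be sketched in shell. This is only a
stand-in for the C `strbuf` approach, not the implementation itself. The
helper names `json_escape` and `json_has_key` are hypothetical, the
escaping covers only backslashes and double quotes, and the key check is
a loose pattern match rather than a real JSON parser.

```bash
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Control characters (newlines, tabs) are NOT handled in this sketch.
json_escape () {
	printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: succeed if the JSON text in $1 contains the key in $2.
json_has_key () {
	printf '%s' "$1" | grep -q "\"$2\"[[:space:]]*:"
}

# Build a small document with a URL that contains quotes, then check a key.
url=$(json_escape 'file:///tmp/my "repo"')
doc=$(printf '{"remotes":{"origin":{"fetch":"%s"}}}' "$url")
json_has_key "$doc" fetch && printf '%s\n' "$doc"
```

A check like `json_has_key` keeps tests free of a `jq` dependency, at the
cost of not catching malformed nesting; stricter tests would still need a
real parser.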

## Detailed Project Timeline


**Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**

* **Focus:** Demonstrate continued interest and deepen understanding
while awaiting results.
* **Official GSoC Milestone:** April 8, 2025 - Proposal Deadline.
* **Activities:**
* **(April 9 - April 21):** Deep dive into Git's source code
structure, focusing specifically on areas identified in the proposal's
Technical Details:
* `builtin/` directory structure and command handling.
* `repository.h`, `refs.h`, `remote.h`, `config.c`, `strbuf.h`.
* How existing commands like `git status`, `git branch`, `git
rev-parse`, `git remote -v` access underlying data.
* **(April 22 - May 7):**
* Monitor the Git mailing list for discussions related to repository
information, command output formats, or JSON usage.
* Deepen my understanding of Git's testing framework
(`t/test-lib.sh`), since I have not yet studied the tests in depth. Try
running and understanding existing tests relevant to refs, remotes, or
configuration.
* Review Git's contribution guidelines (`SubmittingPatches`, coding
style) again since most of my microproject time was related to
documentation.
* Take on more microprojects or actively participate in discussions on
other patches.

**Phase 1: Finalize the requirements (May 8 - May 26, 2025 Approx.)**

* **Focus:** Finalize plans with mentors, setup, deep dive into specifics.
* **Official GSoC Milestone:** May 8, 2025 - Accepted Projects Announced.
* **Activities:**
* **(Week 1: May 8 - May 12):**
* Discuss the project proposal in detail, clarifying scope,
priorities, and mentor expectations.
* Finalize the decision on the JSON generation strategy (library vs.
`strbuf`) based on mentor feedback and feasibility assessment.
* Confirm the initial target JSON schema.
* **(Week 2: May 13 - May 19):**
* Perform a deep dive into the *specific* functions identified for use
(e.g., `refs_resolve_ref_unsafe`, `refs_shorten_unambiguous_ref`,
remote access functions, config API, chosen JSON method).
* Start outlining the structure of `builtin/metadata.c`.
* **(Week 3: May 20 - May 26):**
* Begin writing the basic skeleton of `builtin/metadata.c` and the
initial test file `t/tXXXX-metadata.sh`.
* Post the first blog update summarizing the initial plan.

**Phase 2: Core Implementation & Setup (Coding Weeks 1-4: May 27 -
June 23, 2025 Approx.)**

* **Focus:** Implement the basic command structure and retrieve core
repository/HEAD information.
* **Activities:**
* **(Week 1: May 27 - June 2):** Implement `cmd_metadata` skeleton,
argument parsing (if any initially), repository struct access.
Implement retrieval of `.git` path and `is_bare` status. Integrate
chosen JSON generation approach (setup library or `strbuf` helpers).
* **(Week 2: June 3 - June 9):** Implement HEAD resolution (commit
SHA, full ref name). Implement logic for determining the short
symbolic name using appropriate Git functions. Integrate HEAD info
into JSON output.
* **(Week 3: June 10 - June 16):** Write initial test cases in
`t/tXXXX-metadata.sh` covering basic invocation, bare repos, and
detached HEAD states. Refine JSON output structure.
* **(Week 4: June 17 - June 23):** Prepare and submit the first set of
patches covering core repo/HEAD functionality to the mailing list.
Address initial feedback. Write blog post update.

**Phase 3: Adding Remotes & Refinement (Coding Weeks 5-8: June 24 -
July 21, 2025 Approx.)**

* **Focus:** Add remote information retrieval and expand testing
significantly. Aim for demonstrable core functionality by Midterm.
* **GSoC Milestone:** Midterm Evaluations.
* **Activities:**
* **(Week 5: June 24 - June 30):** Research and implement logic to
list remote names. Implement logic to query fetch/push URLs for each
remote using the config API.
* **(Week 6: July 1 - July 7):** Integrate remote information into the
JSON output structure. Handle edge cases (no remotes, missing push
URL).
* **(Week 7: July 8 - July 14):** Significantly expand the test suite:
add tests for various remote configurations, unborn branches. Refine
existing tests based on feedback. Start drafting the man page
(`Documentation/git-metadata.txt`).
* **(Week 8: July 15 - July 21):** Prepare and submit patches for
remote functionality. Ensure core command (`repo`, `head`, `remotes`
info) is stable and well-tested for Midterm Evaluation. Code cleanup
based on reviews. Write blog post update and prepare Midterm
Evaluation submission.

**Phase 4: Documentation, Polish & Stretch Goals (Coding Weeks 9-12:
July 22 - Aug 18, 2025 Approx.)**

* **Focus:** Finalize documentation, implement error handling, address
feedback, attempt stretch goals if feasible.
* **Activities:**
* **(Week 9: July 22 - July 28):** Complete the first draft of the man
page, detailing usage, JSON schema, and options. Implement the
`--json-errors` functionality for structured error reporting. Add
tests for error cases.
* **(Week 10: July 29 - Aug 4):** *Begin Stretch Goals (Conditional):*
If core work is stable and time permits, start implementing
`--head-only` / `--remotes-only` flags or the basic `is_dirty` check.
Add tests for any implemented stretch goals.
* **(Week 11: Aug 5 - Aug 11):** Thorough code cleanup, address all
outstanding review comments on submitted patches. Ensure documentation
is comprehensive and accurate. Final pass on test suite coverage.
* **(Week 12: Aug 12 - Aug 18):** Prepare and submit final patches
incorporating documentation, error handling, and any completed stretch
goals. Final code freeze for GSoC evaluation purposes. Write blog post
update summarizing final phase.

**Phase 5: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**

* **Focus:** Final submissions, respond to late feedback, ensure
project completion.
* **GSoC Milestone:** Final Evaluations likely occur early in this period.
* **Official GSoC Milestone:** November 19, 2025 - Program End Date.
* **Activities:**
* **(Late Aug - Sept):** Finish any incomplete work and follow up with
the next set of tasks (stretch goals).
* **(Oct - Nov 19):** Monitor mailing list for patch status. Write
final GSoC project summary blog post. Continue engaging with the
community if interested in further contributions beyond GSoC.

## Past Communication and Microproject
* **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
The blog documents my mailing list communication and my microproject
experience in detail.
* First Introduction to the Git Mailing list: [first
Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
* First patch to the git mailing list: [First
Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
* Most recent patch series and the back-and-forth feedback on it:
[Main mail thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)

I have been maintaining the blog and will keep documenting all of my
communication with the Git mailing list there.


Thank You,
Jayatheerth

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-03-31 14:51 [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool JAYATHEERTH K
@ 2025-03-31 14:59 ` JAYATHEERTH K
  2025-04-03 10:23 ` Patrick Steinhardt
  1 sibling, 0 replies; 12+ messages in thread
From: JAYATHEERTH K @ 2025-03-31 14:59 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, karthik nayak, Ghanshyam Thakkar

Here is a doc version of the above proposal [1]
Looking forward to any feedback!!

[1] https://docs.google.com/document/d/1q06OHRo0fQluoZuSN5j_wgezT74JgMWYOk8O21ixyUY/edit?usp=sharing

Thank you,
Jay


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-03-31 14:51 [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool JAYATHEERTH K
  2025-03-31 14:59 ` JAYATHEERTH K
@ 2025-04-03 10:23 ` Patrick Steinhardt
  2025-04-03 14:10   ` JAYATHEERTH K
  1 sibling, 1 reply; 12+ messages in thread
From: Patrick Steinhardt @ 2025-04-03 10:23 UTC (permalink / raw)
  To: JAYATHEERTH K; +Cc: git, karthik nayak, Ghanshyam Thakkar

On Mon, Mar 31, 2025 at 08:21:27PM +0530, JAYATHEERTH K wrote:
> ## **Synopsis**
> This project aims to develop a dedicated Git command that interfaces
> with Git’s internal APIs to produce structured JSON output,
> particularly for repository metadata. By offering a clean,
> machine-readable format, this tool will improve automation, scripting,
> and integration with other developer tools.
> 
> ## **Benefits to the Community**
> ### **1. Simplifies Automation and Scripting**
> - Many Git commands output **human-readable text**, making automation
> **error-prone** and **dependent on fragile parsing**.
> - This project introduces **structured JSON output**, allowing scripts
> and tools to consume repository metadata **directly and reliably**.
> - No more **awkward text parsing**, `grep` hacks, or brittle `awk/sed`
> pipelines—just **clean, structured data**.
> 
> ### **2. Eliminates the Overuse of `git rev-parse`**
> - `git rev-parse` is widely misused for extracting metadata, despite
> being intended primarily for **parsing revisions**.
> - Developers often **repurpose** it because there’s **no dedicated
> alternative** for metadata queries.
> - This project **corrects that gap** by introducing a **purpose-built
> command** that is **cleaner, more intuitive, and extensible**.
> 
> ### **3. Optimizes CI/CD Pipelines**
> - CI/CD systems currently need **multiple Git commands** and
> associated parsing logic to fetch basic metadata:
> 
> ```bash
> # Example: Gathering just a few common pieces of info
> BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "DETACHED")
> COMMIT=$(git rev-parse HEAD)
> REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "no-origin")
> # ... often requiring more commands and error handling logic.
> ```
> - The proposed command aims to **replace these multiple calls** with a
> **single, efficient query** returning comprehensive, structured JSON
> data.
> - This **simplifies pipeline scripts**, reduces process overhead, and
> makes CI/CD configurations **cleaner and more robust**.

I already saw this in another proposal, which indicates that the project
idea might be a bit underspecced. In any case, the goal of the project
isn't to write a single tool that is able to surface _all_ information
for a Git repository. It's rather that we want to surface low-level
information around the repository itself.

The basic intent is to give the options listed in git-rev-parse(1) under
the section "Options for Files" a better home. We have a bunch of
command line options there that allow us to parse environment variables,
paths, repository formats and other low-level stuff. But these aren't
really a good fit for git-rev-parse(1) itself because that tool was
intended to be about parsing revisions. So this is one of those
organically grown commands that has started to accumulate all kinds of
unrelated options that didn't have a better home elsewhere.

So the scope of the project is somewhat more limited compared to what
you propose here. As that impacts a lot of the implementation details as
well as the project timeline I'm not going to comment on these now.

> ## Detailed Project Timeline
> 
> 
> **Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
> 
> * **Focus:** Demonstrate continued interest and deepen understanding
> while awaiting results.
> * **Official GSoC Milestone:** April 8, 2025 - Proposal Deadline.
> * **Activities:**
> * **(April 9 - April 21):** Deep dive into Git's source code
> structure, focusing specifically on areas identified in the proposal's
> Technical Details:
> * `builtin/` directory structure and command handling.
> * `repository.h`, `refs.h`, `remote.h`, `config.c`, `strbuf.h`.
> * How existing commands like `git status`, `git branch`, `git
> rev-parse`, `git remote -v` access underlying data.
> * **(April 22 - May 7):**
> * Monitor the Git mailing list for discussions related to repository
> information, command output formats, or JSON usage.
> * Refine understanding of Git's testing framework as I've not done a
> deep dive into tests(`t/test-lib.sh`). Try running and understanding
> existing tests relevant to refs, remotes, or configuration.
> * Review Git's contribution guidelines (`SubmittingPatches`, coding
> style) again since most of my microproject time was related to
> documentation.
> * Try to start some more microprojects or actively converse in other patches.

Note that microprojects are supposed to be finished before submitting
your proposal. They are used for us mentors to figure out whether
candidates would be a good fit or not. So ideally, you would prominently
link to one or more of your finished microprojects in the proposal
itself already.

> **Phase 4: Documentation, Polish & Stretch Goals (Coding Weeks 9-12:
> July 22 - Aug 18, 2025 Approx.)**
> 
> * **Focus:** Finalize documentation, implement error handling, address
> feedback, attempt stretch goals if feasible.
> * **Activities:**
> * **(Week 9: July 22 - July 28):** Complete the first draft of the man
> page, detailing usage, JSON schema, and options. Implement the
> `--json-errors` functionality for structured error reporting. Add
> tests for error cases.
> * **(Week 10: July 29 - Aug 4):** *Begin Stretch Goals (Conditional):*
> If core work is stable and time permits, start implementing
> `--head-only` / `--remotes-only` flags or the basic `is_dirty` check.
> Add tests for any implemented stretch goals.
> * **(Week 11: Aug 5 - Aug 11):** Thorough code cleanup, address all
> outstanding review comments on submitted patches. Ensure documentation
> is comprehensive and accurate. Final pass on test suite coverage.
> * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final patches
> incorporating documentation, error handling, and any completed stretch
> goals. Final code freeze for GSoC evaluation purposes. Write blog post
> update summarizing final phase.

One thing that I also mentioned to others: instead of planning for one
big batch of work, I would strongly recommend to plan your work in
smaller batches. You should ideally have multiple self-contained batches
of work that you can submit as early as possible while still bringing
some value to the project. This ensures that you can get feedback from
the bigger community early on.

> ## Past Communication and Microproject
> * **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> This blog contains a detailed communication description and blog of my
> microproject experience.
> * First Introduction to the Git Mailing list: [first
> Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
> * First patch to the git mailing list: [First
> Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
> * Most recent series of patches and back and forth with feedbacks:
> [Main mail thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
> 
> I've been maintaing the blog and will maintain the blogs of all the
> communication of mine to the git mailing list.

Ah, you do have a microproject. As this is part of the prerequisites I
would like to propose to have this more prominently visible.

Thanks!

Patrick


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-03 10:23 ` Patrick Steinhardt
@ 2025-04-03 14:10   ` JAYATHEERTH K
  2025-04-03 14:35     ` JAYATHEERTH K
  2025-04-04  9:13     ` Patrick Steinhardt
  0 siblings, 2 replies; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-03 14:10 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, karthik nayak, Ghanshyam Thakkar

On Thu, Apr 3, 2025 at 3:53 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Mar 31, 2025 at 08:21:27PM +0530, JAYATHEERTH K wrote:
> > ## **Synopsis**
> > This project aims to develop a dedicated Git command that interfaces
> > with Git’s internal APIs to produce structured JSON output,
> > particularly for repository metadata. By offering a clean,
> > machine-readable format, this tool will improve automation, scripting,
> > and integration with other developer tools.
> >
> > ## **Benefits to the Community**
> > ### **1. Simplifies Automation and Scripting**
> > - Many Git commands output **human-readable text**, making automation
> > **error-prone** and **dependent on fragile parsing**.
> > - This project introduces **structured JSON output**, allowing scripts
> > and tools to consume repository metadata **directly and reliably**.
> > - No more **awkward text parsing**, `grep` hacks, or brittle `awk/sed`
> > pipelines—just **clean, structured data**.
> >
> > ### **2. Eliminates the Overuse of `git rev-parse`**
> > - `git rev-parse` is widely misused for extracting metadata, despite
> > being intended primarily for **parsing revisions**.
> > - Developers often **repurpose** it because there’s **no dedicated
> > alternative** for metadata queries.
> > - This project **corrects that gap** by introducing a **purpose-built
> > command** that is **cleaner, more intuitive, and extensible**.
> >
> > ### **3. Optimizes CI/CD Pipelines**
> > - CI/CD systems currently need **multiple Git commands** and
> > associated parsing logic to fetch basic metadata:
> >
> > ```bash
> > # Example: Gathering just a few common pieces of info
> > BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "DETACHED")
> > COMMIT=$(git rev-parse HEAD)
> > REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "no-origin")
> > # ... often requiring more commands and error handling logic.
> > ```
> > - The proposed command aims to **replace these multiple calls** with a
> > **single, efficient query** returning comprehensive, structured JSON
> > data.
> > - This **simplifies pipeline scripts**, reduces process overhead, and
> > makes CI/CD configurations **cleaner and more robust**.
>
> I already saw this in another proposal, which indicates that the project
> idea might be a bit underspecced. In any case, the goal of the project

Hey Patrick, thank you for letting me know.
I have actually been working on this proposal for a while now.
I also sent an e-mail regarding this specific project right before
GSoC proposals started. As far as I could see, this project had not
been discussed previously, so I picked it.

https://lore.kernel.org/git/CA+rGoLdvY+JdgdzgE04EJoF9KGUpd39+2S_AgpFyucP38mdFgA@mail.gmail.com/

I'm not sure how to proceed in this situation. I think I need some
advice from your side on this.

> isn't to write a single tool that is able to surface _all_ information
> for a Git repository. It's rather that we want to surface low-level
> information around the repository itself.
>
> The basic intent is to give the options listed in git-rev-parse(1) under
> the section "Options for Files" a better home. We have a bunch of
> command line options there that allow us to parse environment variables,
> paths, repository formats and other low-level stuff. But these aren't
> really a good fit for git-rev-parse(1) itself because that tool was
> intended to be about parsing revisions. So this is one of those
> organically grown commands that has started to accumulate all kinds of
> unrelated options that didn't have a better home elsewhere.
>

OK, that clears up a lot of things.

> So the scope of the project is somewhat more limited compared to what
> you propose here. As that impacts a lot of the implementation details as
> well as the project timeline I'm not going to comment on these now.
>

I think some parts of this proposal still fit the revised scope, such
as the cJSON discussion and the repository details, but I will send a
revised proposal covering the changes in detail.

> > ## Detailed Project Timeline
> >
> >
> > **Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
> >
> > * **Focus:** Demonstrate continued interest and deepen understanding
> > while awaiting results.
> > * **Official GSoC Milestone:** April 8, 2025 - Proposal Deadline.
> > * **Activities:**
> > * **(April 9 - April 21):** Deep dive into Git's source code
> > structure, focusing specifically on areas identified in the proposal's
> > Technical Details:
> > * `builtin/` directory structure and command handling.
> > * `repository.h`, `refs.h`, `remote.h`, `config.c`, `strbuf.h`.
> > * How existing commands like `git status`, `git branch`, `git
> > rev-parse`, `git remote -v` access underlying data.
> > * **(April 22 - May 7):**
> > * Monitor the Git mailing list for discussions related to repository
> > information, command output formats, or JSON usage.
> > * Refine understanding of Git's testing framework as I've not done a
> > deep dive into tests(`t/test-lib.sh`). Try running and understanding
> > existing tests relevant to refs, remotes, or configuration.
> > * Review Git's contribution guidelines (`SubmittingPatches`, coding
> > style) again since most of my microproject time was related to
> > documentation.
> > * Try to start some more microprojects or actively converse in other patches.
>
> Note that microprojects are supposed to be finished before submitting
> your proposal. They are used for us mentors to figure out whether
> candidates would be a good fit or not. So ideally, you would prominently
> link to one or more of your finished microprojects in the proposal
> itself already.
>

I see you've noticed below that I've been active in a microproject. I
will move it up and make it more noticeable. Thank you for pointing it
out!

> > **Phase 4: Documentation, Polish & Stretch Goals (Coding Weeks 9-12:
> > July 22 - Aug 18, 2025 Approx.)**
> >
> > * **Focus:** Finalize documentation, implement error handling, address
> > feedback, attempt stretch goals if feasible.
> > * **Activities:**
> > * **(Week 9: July 22 - July 28):** Complete the first draft of the man
> > page, detailing usage, JSON schema, and options. Implement the
> > `--json-errors` functionality for structured error reporting. Add
> > tests for error cases.
> > * **(Week 10: July 29 - Aug 4):** *Begin Stretch Goals (Conditional):*
> > If core work is stable and time permits, start implementing
> > `--head-only` / `--remotes-only` flags or the basic `is_dirty` check.
> > Add tests for any implemented stretch goals.
> > * **(Week 11: Aug 5 - Aug 11):** Thorough code cleanup, address all
> > outstanding review comments on submitted patches. Ensure documentation
> > is comprehensive and accurate. Final pass on test suite coverage.
> > * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final patches
> > incorporating documentation, error handling, and any completed stretch
> > goals. Final code freeze for GSoC evaluation purposes. Write blog post
> > update summarizing final phase.
>
> One thing that I also mentioned to others: instead of planning for one
> big batch of load, I would strongly recommend to plan your work in
> smaller batches. You should ideally have multiple self-contained batches
> of work that you can submit as early as possible while still bringing
> some value to the project. This ensures that you can get feedback from
> the bigger community early on.
>

OK, so I will reshape my timeline around smaller patch series that
each stand on their own and converge into the bigger project at the end.

> > ## Past Communication and Microproject
> > * **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> > This blog contains a detailed communication description and blog of my
> > microproject experience.
> > * First Introduction to the Git Mailing list: [first
> > Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
> > * First patch to the git mailing list: [First
> > Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
> > * Most recent series of patches and back and forth with feedbacks:
> > [Main mail thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
> >
> > I've been maintaining the blog and will keep documenting all of my
> > communication with the Git mailing list there.
>
> Ah, you do have a microproject. As this is part of the prerequisites, I
> would suggest making it more prominently visible.
>
> Thanks!
>

Thanks again, this helps a lot.

> Patrick

Thank you,
Jay


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-03 14:10   ` JAYATHEERTH K
@ 2025-04-03 14:35     ` JAYATHEERTH K
  2025-04-05 19:42       ` Karthik Nayak
  2025-04-04  9:13     ` Patrick Steinhardt
  1 sibling, 1 reply; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-03 14:35 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, karthik nayak, Ghanshyam Thakkar

# Proposal for GSOC 2025 to Git
**Refactoring `git rev-parse`: A Dedicated Command for Repository Information**

## Contact Details
* **Name**: K Jayatheerth
* **Email**: jayatheerthkulkarni2005@gmail.com
* **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
* **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)

## Prerequisites & Experience


As part of the GSoC application prerequisites, I have engaged with the
Git community with a microproject involving documentation changes.
This provided valuable experience with Git's codebase, contribution
workflow (patch submission, feedback cycles), and communication via
the mailing list.

* **Microproject Patch Series:** [Main mail
thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
(Link to the most relevant thread demonstrating interaction and
successful patch refinement)
* **Initial Patch:** [First
Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
* **Mailing List Introduction:** [First
Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
* **Blog:** My GSoC blog details these interactions:
[Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)

## **Synopsis**

This project focuses on **refactoring Git by creating a dedicated
command (tentatively named `git repo-info`) to house the low-level
repository, path, and format-related query options currently misplaced
under the "OPTIONS FOR FILES" section of `git-rev-parse(1)`**. This
new command will provide a more logical and maintainable location for
this functionality, allowing `git rev-parse` to better focus on its
core purpose of parsing revisions, thus improving Git's internal
organization and command structure clarity.

## **Benefits to the Community**

### **1. Improves `git rev-parse` Clarity and Maintainability**
- `git rev-parse` has accumulated various options unrelated to its
primary purpose of parsing revisions, particularly those for querying
low-level repository state and paths.
- This project **directly addresses this issue** by migrating these
options to a dedicated command, making `git rev-parse` cleaner and
easier to understand and maintain.
- Provides a **clearer separation of concerns** within Git's command suite.

### **2. Provides Reliable Access for Automation and Scripting**
- Scripts often need fundamental repository information like the
top-level directory path (`--show-toplevel`), the `.git` directory
location (`--git-dir`), or repository state (`--is-bare-repository`).
- Currently, scripts rely on `git rev-parse` for this, mixing
low-level repo queries with revision parsing calls.
- The new `git repo-info` command will offer a **stable, dedicated
interface** for retrieving this specific low-level information, making
scripts **cleaner and more robust** by calling the command designed
explicitly for these tasks.
- The default output will mimic the **existing, simple text format**
of the `rev-parse` options, ensuring compatibility for scripts
migrating to the new command.

### **3. Enhances CI/CD Pipeline Foundations**
- CI/CD pipelines frequently need to establish context by determining
the repository root or `.git` directory location early in their
execution.
- Using the dedicated `git repo-info` command for these foundational
queries **simplifies the initial setup steps** in pipeline scripts
compared to using the overloaded `git rev-parse`.

## Deliverables

Given that the project scope is focused on refactoring
`git rev-parse`, this project will introduce a new Git command,
tentatively named `git repo-info`, serving as the designated home for
specific low-level query options.

The key deliverables for this GSoC project include:

1. **New Core Command: `git repo-info`**
* A new `builtin/repo-info.c` command integrated into the Git source code.
* Implementation primarily in C, leveraging existing internal Git APIs
and logic currently within `rev-parse.c` to implement the relocated
options.

2. **Relocated `rev-parse` Options:**
* Implementation of the core functionality behind the following
options from `git-rev-parse(1)`'s "OPTIONS FOR FILES" section within
the new `git repo-info` command:
* **Path Queries:** `--show-cdup`, `--show-prefix`, `--show-toplevel`,
`--show-superproject-working-tree`
* **Directory Queries:** `--git-dir`, `--git-common-dir`,
`--resolve-git-dir <path>`
* **State/Format Queries:** `--is-inside-git-dir`,
`--is-inside-work-tree`, `--is-bare-repository`,
`--is-shallow-repository`
* **Index File Query:** `--shared-index-path`

3. **Default Output Format (Text-Based):**
* The command's default output for each implemented option will
**match the current plain text output** produced by `git rev-parse`
for that same option, ensuring backward compatibility for scripts
migrating to the new command. Output will primarily be via standard C
functions like `printf` or `puts`.

4. **Comprehensive Documentation:**
* A clear man page (`git-repo-info.adoc`) explaining the new command's
purpose and detailing the usage and output of each implemented option.
* Updates to `git-rev-parse.adoc` to clearly **deprecate** the
relocated options (or mark them as aliases for compatibility) and
point users to the new `git repo-info` command.

5. **Robust Test Suite:**
* A new test script (`t/tXXXX-repo-info.sh`) using Git's test
framework (`test-lib.sh`).
* Tests specifically validating the output of `git repo-info --option`
against the output of `git rev-parse --option` across various
repository states (standard repo, bare repo, inside `.git`, inside
worktree, submodules, shallow clone etc.) to ensure functional parity.

6. **(Stretch Goal / Potential Future Work): Structured Output**
* If time permits after successfully implementing, documenting, and
testing the core text-based functionality, investigate adding a
`--format=json` option to provide a structured JSON output containing
the results of the requested queries. This is explicitly a secondary
goal, contingent on completing the primary refactoring task.
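
As a rough illustration of deliverable 3, the boolean queries could be
formatted as below. This is only a sketch under assumptions:
`struct repo_state` and `format_bool_query()` are hypothetical names
invented for this example, standing in for state that Git computes
during repository setup. The one detail taken from existing behavior
is that `git rev-parse` prints `true` or `false` followed by a newline
for these options.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for repository state computed during setup. */
struct repo_state {
	int is_bare;
	int is_shallow;
	int inside_git_dir;
	int inside_work_tree;
};

/*
 * Format one boolean query as "true\n" or "false\n" into `out`.
 * Returns 0 on success, -1 for an unknown option or a buffer that
 * cannot hold "false\n" plus the terminating NUL.
 */
int format_bool_query(const struct repo_state *st, const char *opt,
		      char *out, size_t outlen)
{
	int value;

	assert(st && opt && out);
	if (outlen < sizeof("false\n"))
		return -1;

	if (!strcmp(opt, "--is-bare-repository"))
		value = st->is_bare;
	else if (!strcmp(opt, "--is-shallow-repository"))
		value = st->is_shallow;
	else if (!strcmp(opt, "--is-inside-git-dir"))
		value = st->inside_git_dir;
	else if (!strcmp(opt, "--is-inside-work-tree"))
		value = st->inside_work_tree;
	else
		return -1;

	snprintf(out, outlen, "%s\n", value ? "true" : "false");
	return 0;
}
```

Keeping the formatting in one helper would make it easy to compare the
new command's output byte for byte against `git rev-parse` in tests.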

**Out of Scope for GSoC (Based on Refined Goal):**
* Querying high-level metadata like current branch name, HEAD commit
details (beyond `--is-shallow-repository`), remote URLs, tags, or
arbitrary configuration values.
* Complex status reporting (worktree dirtiness).
* Real-time monitoring or comparing metadata between revisions.
* Implementing JSON output as the *primary* feature.

## Technical Details

This section outlines the proposed technical approach for implementing
the `git repo-info` command and relocating the specified options:

1. **Core `git repo-info` Command Implementation:**
* **Entry Point:** Create `builtin/repo-info.c` with
`cmd_repo_info(...)` function. Parse options using Git's
`parse-options` API.
* **Repository Context:** Utilize the standard `repo` structure and
`startup_info` provided by Git's infrastructure. Setup the repository
context similar to how `cmd_rev_parse` does it if needed (e.g., using
`setup_git_directory_gently`).
* **Reusing Logic:** Analyze the implementation of the target options
within `builtin/rev-parse.c`. Extract and adapt the relevant C
functions and logic (related to path manipulation using `prefix_path`,
`real_pathcmp`; repository state checks using
`is_bare_repository_cfg`, `is_inside_git_dir`, `is_inside_work_tree`;
accessing `startup_info`, `git_path`, etc.) into `builtin/repo-info.c`
or potentially shared helper functions if appropriate.
* **Specific Option Implementation:**
* `--show-toplevel`, `--show-cdup`, `--show-prefix`: Rely on the
`prefix` calculated during setup and path manipulation functions.
* `--git-dir`, `--git-common-dir`: Access `repo->gitdir`,
`repo->commondir` or use functions like `get_git_dir()`,
`get_common_dir()`. `--resolve-git-dir` will involve path resolution
relative to the provided argument.
* `--is-*` flags: Call existing helper functions like
`is_bare_repository()`, `is_inside_git_dir()`,
`is_inside_work_tree()`. `--is-shallow-repository` involves calling
`is_repository_shallow()`.
* `--shared-index-path`: Access path information related to split
indexes if enabled.
* **Output Generation:** Use standard C `printf("%s\n", ...)` or
`puts(...)` to print the resulting string (path, "true"/"false", etc.)
to standard output, matching `rev-parse`'s current behavior. The
boolean `--is-*` flags print `true` or `false` and exit `0` in either
case; this behavior should be preserved.

2. **Documentation:**
* Create `Documentation/git-repo-info.adoc` using AsciiDoc format,
modeling it after existing man pages. Detail each option, its purpose,
and expected output.
* Modify `Documentation/git-rev-parse.adoc`, adding notes to the
relevant options indicating they are better handled by `git repo-info`
and potentially marking them for deprecation in a future Git version.

3. **Testing:**
* Create `t/tXXXX-repo-info.sh` using `test-lib.sh`.
* Structure tests using `test_expect_success` blocks.
* Utilize helpers like `test_create_repo` and `test_cmp` to compare
the output of `git repo-info --option` directly against
`git rev-parse --option`, including the `true`/`false` output of the
boolean flags.
* Cover edge cases like running outside a repository, in a bare
repository, deep within a worktree, within the `.git` directory, and
in repositories with submodules or worktrees.

4. **(Stretch Goal) JSON Output Implementation:**
* If attempted, add a `--format=json` option using `parse-options`.
* Collect results from the requested options internally.
* Use Git's in-tree JSON writer (`json-writer.h`, built on the `strbuf`
API) to construct a JSON object mapping option names (or descriptive
keys) to their corresponding values. Print the final JSON string to
standard output.
Add specific tests for JSON output validation.
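
To make the stretch goal more concrete, here is a minimal standalone
sketch of what building such a JSON object involves. It is not Git
code: `json_quote()`, `json_object()`, `struct kv`, and the key names
used with them are hypothetical, and a real patch would presumably
reuse helpers already in Git's tree rather than hand-rolled buffers.
All values are emitted as strings here to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `s` into `out` as a quoted JSON string, escaping quotes,
 * backslashes and control characters. Bytes >= 0x80 are copied
 * unchanged (the input is assumed to be UTF-8).
 * Returns the number of bytes written (without the NUL), or -1 if
 * `out` is too small.
 */
int json_quote(char *out, size_t outlen, const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t n = 0;

	assert(out && s);
	if (outlen < 3)
		return -1;
	out[n++] = '"';
	for (; *p; p++) {
		char esc[8];
		size_t len;

		if (*p == '"' || *p == '\\') {
			esc[0] = '\\';
			esc[1] = (char)*p;
			len = 2;
		} else if (*p < 0x20) {
			len = (size_t)snprintf(esc, sizeof(esc), "\\u%04x",
					       (unsigned)*p);
		} else {
			esc[0] = (char)*p;
			len = 1;
		}
		/* Keep room for the closing quote and the NUL. */
		if (n + len + 2 > outlen)
			return -1;
		memcpy(out + n, esc, len);
		n += len;
	}
	out[n++] = '"';
	out[n] = '\0';
	return (int)n;
}

/* Hypothetical key/value pair collected from the requested queries. */
struct kv {
	const char *key;
	const char *value;
};

/* Serialize `nr` pairs as one JSON object; returns length or -1. */
int json_object(char *out, size_t outlen, const struct kv *kv, size_t nr)
{
	size_t n = 0, i;

	assert(out && (kv || !nr));
	if (outlen < 3)
		return -1;
	out[n++] = '{';
	for (i = 0; i < nr; i++) {
		int w;

		if (i) {
			if (n + 1 >= outlen)
				return -1;
			out[n++] = ',';
		}
		w = json_quote(out + n, outlen - n, kv[i].key);
		if (w < 0)
			return -1;
		n += (size_t)w;
		if (n + 1 >= outlen)
			return -1;
		out[n++] = ':';
		w = json_quote(out + n, outlen - n, kv[i].value);
		if (w < 0)
			return -1;
		n += (size_t)w;
	}
	if (n + 2 > outlen)
		return -1;
	out[n++] = '}';
	out[n] = '\0';
	return (int)n;
}
```

The escaping step is the part that matters for paths: a worktree path
may contain quotes or control characters, and the output must remain
valid JSON in that case.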

## Detailed Project Timeline

**Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**

* **Focus:** Demonstrate continued interest and deepen understanding
*specifically of `rev-parse`'s internals* while awaiting results.
* **Activities:**
* **(April 9 - April 21):** Deep dive into `builtin/rev-parse.c`,
identifying the exact code blocks implementing the "OPTIONS FOR
FILES". Trace how they use `startup_info`, `prefix`, path functions,
and repository flags.
* **(April 22 - May 7):** Continue monitoring the mailing list. Refine
understanding of Git's testing framework, specifically focusing on
tests for `rev-parse` options (e.g., `t1006-cat-file.sh`,
`t5601-clone.sh` might use some flags). Review contribution
guidelines.

**Phase 1: Final Planning (May 8 - May 26, 2025 Approx.)**

* **Focus:** Formal introductions, confirm final scope & plan, setup.
* **Activities:**
* **(Week 1: May 8 - May 12):** Introduction with mentor(s). Confirm
the exact list of `rev-parse` options to be migrated. Discuss the
preferred approach for handling deprecation in `rev-parse` docs/code.
Discuss potential for shared helper functions vs. direct code
migration.
* **(Week 2: May 13 - May 19):** Set up dev environment. Deep dive
into the agreed-upon functions/code blocks within `rev-parse.c`.
Outline the basic structure for `builtin/repo-info.c` and the test
script `t/tXXXX-repo-info.sh`.
* **(Week 3: May 20 - May 26):** Implement the basic `cmd_repo_info`
skeleton, option parsing setup, and repository setup boilerplate.
Write initial "no-op" tests. Post first blog update.

**Phase 2: Implementation in Batches (Coding Weeks 1-8: May 27 - July
21, 2025 Approx.)**

* **Focus:** Implement options in logical groups, test thoroughly,
submit patches early and often.
* **GSoC Milestone:** Midterm Evaluations occur around Week 8.
* **Activities:**
* **(Batch 1 / Weeks 1-2: May 27 - June 9):** Implement basic path
queries: `--show-toplevel`, `--show-prefix`, `--show-cdup`. Add tests
comparing output with `rev-parse`. **Submit Patch Series 1**.
* **(Batch 2 / Weeks 3-4: June 10 - June 23):** Implement directory
queries: `--git-dir`, `--git-common-dir`, `--resolve-git-dir <path>`.
Add tests. **Submit Patch Series 2**. Write blog post update.
* **(Batch 3 / Weeks 5-6: June 24 - July 7):** Implement boolean state
queries: `--is-bare-repository`, `--is-inside-git-dir`,
`--is-inside-work-tree`. Add tests checking exit codes and behavior in
various locations. **Submit Patch Series 3**.
* **(Batch 4 / Weeks 7-8: July 8 - July 21):** Implement remaining
queries: `--is-shallow-repository`, `--shared-index-path`,
`--show-superproject-working-tree`. Add comprehensive tests covering
interactions (e.g., in submodules, shallow clones). **Submit Patch
Series 4**. Prepare for Midterm evaluation; ensure submitted batches
demonstrate core progress. Write blog post update.

**Phase 3: Documentation & Final Polish (Coding Weeks 9-12: July 22 -
Aug 18, 2025 Approx.)**

* **Focus:** Create documentation, address feedback on all patches,
refine implementation, potentially attempt stretch goal.
* **Activities:**
* **(Week 9: July 22 - July 28):** Write the first complete draft of
the man page for `git-repo-info`. Draft the necessary updates for
`git-rev-parse.adoc` (deprecation notices). **Submit Patch Series 5
(Documentation)**.
* **(Week 10: July 29 - Aug 4):** Focus on addressing review comments
on **all** previous patch series. Refactor code based on feedback.
Ensure test suite is robust and covers feedback points.
* **(Week 11: Aug 5 - Aug 11):** *Stretch Goal (Conditional):* If core
functionality and docs are stable and reviewed positively, begin
investigating/implementing `--format=json`. Add specific JSON tests if
implemented. Otherwise, focus on further code cleanup and test
hardening.
* **(Week 12: Aug 12 - Aug 18):** Prepare and submit final versions of
all patch series, incorporating all feedback. Final testing pass.
Write blog post update summarizing progress and final state. Code
freeze for final evaluation.

**Phase 4: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**

* **Focus:** Final submissions, respond to late feedback, ensure
project completion.
* **Official GSoC Milestone:** November 19, 2025 - Program End Date.
* **Activities:**
* **(Late Aug - Sept):** Submit final GSoC evaluations. Actively
respond to any further comments on submitted patches from the
community/maintainers, aiming for merge readiness.
* **(Oct - Nov 19):** Monitor mailing list for patch status. Write
final GSoC project summary blog post. Continue engaging with the
community if interested in further contributions beyond GSoC.



Thank You,
Jayatheerth


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-03 14:10   ` JAYATHEERTH K
  2025-04-03 14:35     ` JAYATHEERTH K
@ 2025-04-04  9:13     ` Patrick Steinhardt
  2025-04-04 13:22       ` JAYATHEERTH K
  1 sibling, 1 reply; 12+ messages in thread
From: Patrick Steinhardt @ 2025-04-04  9:13 UTC (permalink / raw)
  To: JAYATHEERTH K; +Cc: git, karthik nayak, Ghanshyam Thakkar

On Thu, Apr 03, 2025 at 07:40:33PM +0530, JAYATHEERTH K wrote:
> On Thu, Apr 3, 2025 at 3:53 PM Patrick Steinhardt <ps@pks.im> wrote:
> > On Mon, Mar 31, 2025 at 08:21:27PM +0530, JAYATHEERTH K wrote:
> > I already saw this in another proposal, which indicates that the project
> > idea might be a bit underspecced. In any case, the goal of the project
> 
> Hey Patrick, thank you for letting me know
> I actually have been working on this proposal for a while now.
> I also sent an e-mail regarding this specific project right before
> GSOC proposals started. As far as I can see this project was not
> previously discussed therefore I picked this.
> 
> https://lore.kernel.org/git/CA+rGoLdvY+JdgdzgE04EJoF9KGUpd39+2S_AgpFyucP38mdFgA@mail.gmail.com/
> 
> I'm not sure how to proceed in this situation. I think I need some
> advice from your side on this.

I think I don't quite understand what "this situation" refers to. Do you
mean that there are multiple proposals for this project now? If so, that
is perfectly fine and expected. There's only a finite number of projects
and a larger number of students, so some of the projects will have
multiple applicants.

In the end we will pick the student who seems to be the best match based
on both the proposal, the microproject and any other interactions with
the community.

Patrick


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-04  9:13     ` Patrick Steinhardt
@ 2025-04-04 13:22       ` JAYATHEERTH K
  0 siblings, 0 replies; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-04 13:22 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, karthik nayak, Ghanshyam Thakkar

On Fri, Apr 4, 2025 at 2:43 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Thu, Apr 03, 2025 at 07:40:33PM +0530, JAYATHEERTH K wrote:
> > On Thu, Apr 3, 2025 at 3:53 PM Patrick Steinhardt <ps@pks.im> wrote:
> > > On Mon, Mar 31, 2025 at 08:21:27PM +0530, JAYATHEERTH K wrote:
> > > I already saw this in another proposal, which indicates that the project
> > > idea might be a bit underspecced. In any case, the goal of the project
> >
> > Hey Patrick, thank you for letting me know
> > I actually have been working on this proposal for a while now.
> > I also sent an e-mail regarding this specific project right before
> > GSOC proposals started. As far as I can see this project was not
> > previously discussed therefore I picked this.
> >
> > https://lore.kernel.org/git/CA+rGoLdvY+JdgdzgE04EJoF9KGUpd39+2S_AgpFyucP38mdFgA@mail.gmail.com/
> >
> > I'm not sure how to proceed in this situation. I think I need some
> > advice from your side on this.
>
> I think I don't quite understand what "this situation" refers to. Do you
> mean that there are multiple proposals for this project now? If so, that
> is perfectly fine and expected. There's only a finite number of projects
> and a larger number of students, so some of the projects will have
> multiple applicants.
>
> In the end we will pick the student who seems to be the best match based
> on both the proposal, the microproject and any other interactions with
> the community.
>
Understood, I will carry on as is.
I've sent the updated proposal above in this thread with the revised
scope of the project.

> Patrick

- Jayatheerth


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-03 14:35     ` JAYATHEERTH K
@ 2025-04-05 19:42       ` Karthik Nayak
  2025-04-06  5:40         ` JAYATHEERTH K
  0 siblings, 1 reply; 12+ messages in thread
From: Karthik Nayak @ 2025-04-05 19:42 UTC (permalink / raw)
  To: JAYATHEERTH K, Patrick Steinhardt; +Cc: git, Ghanshyam Thakkar

[-- Attachment #1: Type: text/plain, Size: 17341 bytes --]

JAYATHEERTH K <jayatheerthkulkarni2005@gmail.com> writes:

> # Proposal for GSOC 2025 to Git
> **Refactoring `git rev-parse`: A Dedicated Command for Repository Information**
>
> ## Contact Details
> * **Name**: K Jayatheerth
> * **Email**: jayatheerthkulkarni2005@gmail.com
> * **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> * **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)
>
> ## Prerequisites & Experience
>
>
> As part of the GSoC application prerequisites, I have engaged with the
> Git community with a microproject involving documentation changes.
> This provided valuable experience with Git's codebase, contribution
> workflow (patch submission, feedback cycles), and communication via
> the mailing list.
>
> * **Microproject Patch Series:** [Main mail
> thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
> (Link to the most relevant thread demonstrating interaction and
> successful patch refinement)
> * **Initial Patch:** [First
> Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
> * **Mailing List Introduction:** [First
> Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
> * **Blog:** My GSoC blog details these interactions:
> [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
>

It would be nice to give a brief summary of your microproject here, and
perhaps its current status.

> ## **Synopsis**
>
> This project focuses on **refactoring Git by creating a dedicated
> command (tentatively named `git repo-info`) to house the low-level

I wonder if 'git info' itself would be a good name. We could default the
command to list all prominent information about a repository. This would
be useful instead of scripts invoking 'git rev-parse
--is-bare-repository' followed by a 'git rev-parse --is-inside-git-dir'
and so on. But we can discuss this later.

> repository, path, and format-related query options currently misplaced
> under the "OPTIONS FOR FILES" section of `git-rev-parse(1)`**. This
> new command will provide a more logical and maintainable location for
> this functionality, allowing `git rev-parse` to better focus on its
> core purpose of parsing revisions, thus improving Git's internal
> organization and command structure clarity.
>
> ## **Benefits to the Community**
>
> ### **1. Improves `git rev-parse` Clarity and Maintainability**
> - `git rev-parse` has accumulated various options unrelated to its
> primary purpose of parsing revisions, particularly those for querying
> low-level repository state and paths.
> - This project **directly addresses this issue** by migrating these
> options to a dedicated command, making `git rev-parse` cleaner and
> easier to understand and maintain.
> - Provides a **clearer separation of concerns** within Git's command suite.
>
> ### **2. Provides Reliable Access for Automation and Scripting**
> - Scripts often need fundamental repository information like the
> top-level directory path (`--show-toplevel`), the `.git` directory
> location (`--git-dir`), or repository state (`--is-bare-repository`).
> - Currently, scripts rely on `git rev-parse` for this, mixing
> low-level repo queries with revision parsing calls.
> - The new `git repo-info` command will offer a **stable, dedicated
> interface** for retrieving this specific low-level information, making
> scripts **cleaner and more robust** by calling the command designed
> explicitly for these tasks.
> - The default output will mimic the **existing, simple text format**
> of the `rev-parse` options, ensuring compatibility for scripts
> migrating to the new command.
>
> ### **3. Enhances CI/CD Pipeline Foundations**
> - CI/CD pipelines frequently need to establish context by determining
> the repository root or `.git` directory location early in their
> execution.
> - Using the dedicated `git repo-info` command for these foundational
> queries **simplifies the initial setup steps** in pipeline scripts
> compared to using the overloaded `git rev-parse`.
>

We must note that using 'git rev-parse' isn't sub-optimal. As a matter
of fact, if you look at 'builtin/rev-parse.c', you'll see that each of
these flags is handled in its own branch of an if..else chain. The goal
of this project is more to provide a clean interface and a home for such
information queries.

As such, the main goals of the project are very design oriented, as
you've also mentioned. I would say:
1. What should the command be called?
2. What sub-commands should it support?
3. What options from 'git-rev-parse(1)' do we need to port, do they need
   to be renamed?
4. What other options can we possibly provide?

> ## Deliverables
>
> Given that the project scope is focused on refactoring
> `git rev-parse`, this project will introduce a new Git command,
> tentatively named `git repo-info`, serving as the designated home for
> specific low-level query options.
>
> The key deliverables for this GSoC project include:
>
> 1. **New Core Command: `git repo-info`**
> * A new `builtin/repo-info.c` command integrated into the Git source code.
> * Implementation primarily in C, leveraging existing internal Git APIs
> and logic currently within `rev-parse.c` to implement the relocated
> options.
>
> 2. **Relocated `rev-parse` Options:**
> * Implementation of the core functionality behind the following
> options from `git-rev-parse(1)`'s "OPTIONS FOR FILES" section within
> the new `git repo-info` command:
> * **Path Queries:** `--show-cdup`, `--show-prefix`, `--show-toplevel`,
> `--show-superproject-working-tree`
> * **Directory Queries:** `--git-dir`, `--git-common-dir`,
> `--resolve-git-dir <path>`
> * **State/Format Queries:** `--is-inside-git-dir`,
> `--is-inside-work-tree`, `--is-bare-repository`,
> `--is-shallow-repository`
> * **Index File Query:** `--shared-index-path`
>

Perhaps we want to break this down so we can have:

git info path [--cdup | --prefix | --toplevel ... ]
git info repo [--is-bare | --is-shallow]

and so on...
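
A subcommand layout like this is commonly implemented as a small
dispatch table. The sketch below is standalone C and only illustrates
the idea: `check_flags`, `info_path`, `info_repo`, `find_subcmd` and
`dispatch` are hypothetical names, the flag lists are just the ones
named above, and it deliberately does not use Git's `parse-options`
API, which a real patch would presumably build on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*subcmd_fn)(int argc, const char **argv);

/* Return 0 if every argument is in the NULL-terminated `allowed` list. */
int check_flags(int argc, const char **argv, const char *const *allowed)
{
	int i;

	for (i = 0; i < argc; i++) {
		const char *const *a;

		for (a = allowed; *a; a++)
			if (!strcmp(argv[i], *a))
				break;
		if (!*a)
			return -1; /* unknown flag for this subcommand */
	}
	return 0;
}

/* Hypothetical handler for "git info path ...". */
int info_path(int argc, const char **argv)
{
	static const char *const allowed[] = {
		"--cdup", "--prefix", "--toplevel", NULL
	};
	return check_flags(argc, argv, allowed);
}

/* Hypothetical handler for "git info repo ...". */
int info_repo(int argc, const char **argv)
{
	static const char *const allowed[] = {
		"--is-bare", "--is-shallow", NULL
	};
	return check_flags(argc, argv, allowed);
}

static const struct {
	const char *name;
	subcmd_fn fn;
} subcmds[] = {
	{ "path", info_path },
	{ "repo", info_repo },
};

/* Look up a subcommand by name; returns NULL if there is none. */
subcmd_fn find_subcmd(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(subcmds) / sizeof(subcmds[0]); i++)
		if (!strcmp(subcmds[i].name, name))
			return subcmds[i].fn;
	return NULL;
}

/* argv[0] is the subcommand; the rest is passed to its handler. */
int dispatch(int argc, const char **argv)
{
	subcmd_fn fn;

	if (argc < 1)
		return -1;
	fn = find_subcmd(argv[0]);
	return fn ? fn(argc - 1, argv + 1) : -1;
}
```

Each subcommand owns its own flag list, so a flag such as `--toplevel`
is accepted under `path` but rejected under `repo`.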

> 3. **Default Output Format (Text-Based):**
> * The command's default output for each implemented option will
> **match the current plain text output** produced by `git rev-parse`
> for that same option, ensuring backward compatibility for scripts
> migrating to the new command. Output will primarily be via standard C
> functions like `printf` or `puts`.

Since we'll keep the existing options within 'git-rev-parse(1)', we'll
stay backwards compatible. Scripts that want to move to the new command
would need to change their invocation anyway, so backward compatibility
is not a concern there. But having the default be human readable, like
the current implementation, does make sense.

>
> 4. **Comprehensive Documentation:**
> * A clear man page (`git-repo-info.adoc`) explaining the new command's
> purpose and detailing the usage and output of each implemented option.
> * Updates to `git-rev-parse.adoc` to clearly **deprecate** the
> relocated options (or mark them as aliases for compatibility) and
> point users to the new `git repo-info` command.
>
> 5. **Robust Test Suite:**
> * A new test script (`t/tXXXX-repo-info.sh`) using Git's test
> framework (`test-lib.sh`).
> * Tests specifically validating the output of `git repo-info --option`
> against the output of `git rev-parse --option` across various
> repository states (standard repo, bare repo, inside `.git`, inside
> worktree, submodules, shallow clone etc.) to ensure functional parity.
>
> 6. **(Stretch Goal / Potential Future Work): Structured Output**
> * If time permits after successfully implementing, documenting, and
> testing the core text-based functionality, investigate adding a
> `--format=json` option to provide a structured JSON output containing
> the results of the requested queries. This is explicitly a secondary
> goal, contingent on completing the primary refactoring task.

Many of the plumbing commands in Git provide NUL-terminated output. I'm
curious whether we should consider that over JSON.
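
To illustrate the difference, here is a small standalone sketch of
terminator-based output. `pack_records()` is a hypothetical helper
written for this example, not an existing Git function; it writes into
a buffer instead of stdout only so that the result is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `nr` values into `out`, each followed by `term`
 * ('\n' for line-oriented output, '\0' for NUL-terminated output).
 * Returns the number of bytes written, or 0 if `out` is too small
 * (0 is also returned when there is nothing to write).
 */
size_t pack_records(char *out, size_t outlen,
		    const char *const *values, size_t nr, char term)
{
	size_t n = 0, i;

	assert(out && (values || !nr));
	for (i = 0; i < nr; i++) {
		size_t len = strlen(values[i]);

		if (len + 1 > outlen - n)
			return 0;
		memcpy(out + n, values[i], len);
		n += len;
		out[n++] = term;
	}
	return n;
}
```

With `'\n'` as the terminator, a path that itself contains a newline
reads as two records. With `'\0'` the record boundaries stay
unambiguous and no escaping is needed, which is the main argument for
NUL-terminated output in scripts.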

> **Out of Scope for GSoC (Based on Refined Goal):**
> * Querying high-level metadata like current branch name, HEAD commit
> details (beyond `--is-shallow-repository`), remote URLs, tags, or
> arbitrary configuration values.

We should make sure that we don't overload this new command too.

> * Complex status reporting (worktree dirtiness).
> * Real-time monitoring or comparing metadata between revisions.
> * Implementing JSON output as the *primary* feature.
>
> ## Technical Details
>
> This section outlines the proposed technical approach for implementing
> the `git repo-info` command and relocating the specified options:
>
> 1. **Core `git repo-info` Command Implementation:**
> * **Entry Point:** Create `builtin/repo-info.c` with
> `cmd_repo_info(...)` function. Parse options using Git's
> `parse-options` API.
> * **Repository Context:** Utilize the standard `repo` structure and
> `startup_info` provided by Git's infrastructure. Setup the repository
> context similar to how `cmd_rev_parse` does it if needed (e.g., using
> `setup_git_directory_gently`).
> * **Reusing Logic:** Analyze the implementation of the target options
> within `builtin/rev-parse.c`. Extract and adapt the relevant C
> functions and logic (related to path manipulation using `prefix_path`,
> `real_pathcmp`; repository state checks using
> `is_bare_repository_cfg`, `is_inside_git_dir`, `is_inside_work_tree`;
> accessing `startup_info`, `git_path`, etc.) into `builtin/repo-info.c`
> or potentially shared helper functions if appropriate.
> * **Specific Option Implementation:**
> * `--show-toplevel`, `--show-cdup`, `--show-prefix`: Rely on the
> `prefix` calculated during setup and path manipulation functions.
> * `--git-dir`, `--git-common-dir`: Access `repo->gitdir`,
> `repo->commondir` or use functions like `get_git_dir()`,
> `get_common_dir()`. `--resolve-git-dir` will involve path resolution
> relative to the provided argument.
> * `--is-*` flags: Call existing helper functions like
> `is_bare_repository()`, `is_inside_git_dir()`,
> `is_inside_work_tree()`. `--is-shallow-repository` involves calling
> `is_repository_shallow()`.
> * `--shared-index-path`: Access path information related to split
> indexes if enabled.
> * **Output Generation:** Use standard C `printf("%s\n", ...)` or
> `puts(...)` to print the resulting string (path, "true"/"false", etc.)
> to standard output, matching `rev-parse`'s current behavior. The
> boolean `--is-*` flags print `true` or `false` and exit `0` in either
> case; this behavior should be preserved.
>
> 2. **Documentation:**
> * Create `Documentation/git-repo-info.adoc` using AsciiDoc format,
> modeling it after existing man pages. Detail each option, its purpose,
> and expected output.
> * Modify `Documentation/git-rev-parse.adoc`, adding notes to the
> relevant options indicating they are better handled by `git repo-info`
> and potentially marking them for deprecation in a future Git version.
>
> 3. **Testing:**
> * Create `t/tXXXX-repo-info.sh` using `test-lib.sh`.
> * Structure tests using `test_expect_success` blocks.
> * Utilize helpers like `test_create_repo` and `test_cmp` to compare
> the output of `git repo-info --option` directly against
> `git rev-parse --option`, including the `true`/`false` output of the
> boolean flags.
> * Cover edge cases like running outside a repository, in a bare
> repository, deep within a worktree, within the `.git` directory, and
> in repositories with submodules or worktrees.
>
> 4. **(Stretch Goal) JSON Output Implementation:**
> * If attempted, add a `--format=json` option using `parse-options`.
> * Collect results from the requested options internally.
> * Use Git's in-tree JSON writer (`json-writer.h`, built on the `strbuf`
> API) to construct a JSON object mapping option names (or descriptive
> keys) to their corresponding values. Print the final JSON string to
> standard output.
> Add specific tests for JSON output validation.
>
> ## Detailed Project Timeline
>
> **Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
>
> * **Focus:** Demonstrate continued interest and deepen understanding
> *specifically of `rev-parse`'s internals* while awaiting results.
> * **Activities:**
> * **(April 9 - April 21):** Deep dive into `builtin/rev-parse.c`,
> identifying the exact code blocks implementing the "OPTIONS FOR
> FILES". Trace how they use `startup_info`, `prefix`, path functions,
> and repository flags.
> * **(April 22 - May 7):** Continue monitoring the mailing list. Refine
> understanding of Git's testing framework, specifically focusing on
> tests for `rev-parse` options (e.g., `t1006-cat-file.sh`,
> `t5601-clone.sh` might use some flags). Review contribution
> guidelines.
>
> **Phase 1: Final Planning (May 8 - May 26, 2025 Approx.)**
>
> * **Focus:** Formal introductions, confirm final scope & plan, setup.
> * **Activities:**
> * **(Week 1: May 8 - May 12):** Introduction with mentor(s). Confirm
> the exact list of `rev-parse` options to be migrated. Discuss the
> preferred approach for handling deprecation in `rev-parse` docs/code.
> Discuss potential for shared helper functions vs. direct code
> migration.
> * **(Week 2: May 13 - May 19):** Set up dev environment. Deep dive
> into the agreed-upon functions/code blocks within `rev-parse.c`.
> Outline the basic structure for `builtin/repo-info.c` and the test
> script `t/tXXXX-repo-info.sh`.
> * **(Week 3: May 20 - May 26):** Implement the basic `cmd_repo_info`
> skeleton, option parsing setup, and repository setup boilerplate.
> Write initial "no-op" tests. Post first blog update.
>
> **Phase 2: Implementation in Batches (Coding Weeks 1-8: May 27 - July
> 21, 2025 Approx.)**
>
> * **Focus:** Implement options in logical groups, test thoroughly,
> submit patches early and often.
> * **GSoC Milestone:** Midterm Evaluations occur around Week 8.
> * **Activities:**
> * **(Batch 1 / Weeks 1-2: May 27 - June 9):** Implement basic path
> queries: `--show-toplevel`, `--show-prefix`, `--show-cdup`. Add tests
> comparing output with `rev-parse`. **Submit Patch Series 1**.
> * **(Batch 2 / Weeks 3-4: June 10 - June 23):** Implement directory
> queries: `--git-dir`, `--git-common-dir`, `--resolve-git-dir <path>`.
> Add tests. **Submit Patch Series 2**. Write blog post update.
> * **(Batch 3 / Weeks 5-6: June 24 - July 7):** Implement boolean state
> queries: `--is-bare-repository`, `--is-inside-git-dir`,
> `--is-inside-work-tree`. Add tests checking exit codes and behavior in
> various locations. **Submit Patch Series 3**.
> * **(Batch 4 / Weeks 7-8: July 8 - July 21):** Implement remaining
> queries: `--is-shallow-repository`, `--shared-index-path`,
> `--show-superproject-working-tree`. Add comprehensive tests covering
> interactions (e.g., in submodules, shallow clones). **Submit Patch
> Series 4**. Prepare for Midterm evaluation; ensure submitted batches
> demonstrate core progress. Write blog post update.
>

It is nice to see the breakdown in batches.

> **Phase 3: Documentation & Final Polish (Coding Weeks 9-12: July 22 -
> Aug 18, 2025 Approx.)**
>
> * **Focus:** Create documentation, address feedback on all patches,
> refine implementation, potentially attempt stretch goal.

I would say documentation should go hand-in-hand with each patch series
that you send. Ideally, every patch series should leave the code base in
a usable state. Pushing documentation to the end would mean that if the
project is stopped midway, we'd have a running command in Git with no
documentation about what it does.

> * **Activities:**
> * **(Week 9: July 22 - July 28):** Write the first complete draft of
> the man page for `git-repo-info`. Draft the necessary updates for
> `git-rev-parse.adoc` (deprecation notices). **Submit Patch Series 5
> (Documentation)**.
> * **(Week 10: July 29 - Aug 4):** Focus on addressing review comments
> on **all** previous patch series. Refactor code based on feedback.
> Ensure test suite is robust and covers feedback points.
> * **(Week 11: Aug 5 - Aug 11):** *Stretch Goal (Conditional):* If core
> functionality and docs are stable and reviewed positively, begin
> investigating/implementing `--format=json`. Add specific JSON tests if
> implemented. Otherwise, focus on further code cleanup and test
> hardening.
> * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final versions of
> all patch series, incorporating all feedback. Final testing pass.
> Write blog post update summarizing progress and final state. Code
> freeze for final evaluation.
>
> **Phase 4: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**
>
> * **Focus:** Final submissions, respond to late feedback, ensure
> project completion.
> * **Official GSoC Milestone:** November 19, 2025 - Program End Date.
> * **Activities:**
> * **(Late Aug - Sept):** Submit final GSoC evaluations. Actively
> respond to any further comments on submitted patches from the
> community/maintainers, aiming for merge readiness.
> * **(Oct - Nov 19):** Monitor mailing list for patch status. Write
> final GSoC project summary blog post. Continue engaging with the
> community if interested in further contributions beyond GSoC.
>
>
>
> Thank You,
> Jayatheerth

Thanks for your proposal!

- Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-05 19:42       ` Karthik Nayak
@ 2025-04-06  5:40         ` JAYATHEERTH K
  2025-04-06  6:09           ` JAYATHEERTH K
  0 siblings, 1 reply; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-06  5:40 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, git, Ghanshyam Thakkar

On Sun, Apr 6, 2025 at 1:12 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> JAYATHEERTH K <jayatheerthkulkarni2005@gmail.com> writes:
>
> > # Proposal for GSOC 2025 to Git
> > **Refactoring `git rev-parse`: A Dedicated Command for Repository Information**
> >
> > ## Contact Details
> > * **Name**: K Jayatheerth
> > * **Email**: jayatheerthkulkarni2005@gmail.com
> > * **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> > * **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)
> >
> > ## Prerequisites & Experience
> >
> >
> > As part of the GSoC application prerequisites, I have engaged with the
> > Git community with a microproject involving documentation changes.
> > This provided valuable experience with Git's codebase, contribution
> > workflow (patch submission, feedback cycles), and communication via
> > the mailing list.
> >
> > * **Microproject Patch Series:** [Main mail
> > thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
> > (Link to the most relevant thread demonstrating interaction and
> > successful patch refinement)
> > * **Initial Patch:** [First
> > Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
> > * **Mailing List Introduction:** [First
> > Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
> > * **Blog:** My GSoC blog details these interactions:
> > [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> >
>
> It would be nice to give a small brief about your microproject here and
> perhaps its current status.
>

Sure, I will add that.

> > ## **Synopsis**
> >
> > This project focuses on **refactoring Git by creating a dedicated
> > command (tentatively named `git repo-info`) to house the low-level
>
> I wonder if 'git info' itself would be a good name, we could default the
> command to list all prominent information about a repository. This would
> be useful instead of scripts invoking 'git rev-parse

Yeah, even git info doesn't overlap with any builtin or popular
third-party command, as far as I can tell.

> --is-bare-repository' followed by a 'git rev-parse --is-inside-git-dir'
> and so on. But we can discuss this later.
>

True, but for the proposal I listed some of the options and technical
details, with a batch-wise division of the timeline, as required.

> > repository, path, and format-related query options currently misplaced
> > under the "OPTIONS FOR FILES" section of `git-rev-parse(1)`**. This
> > new command will provide a more logical and maintainable location for
> > this functionality, allowing `git rev-parse` to better focus on its
> > core purpose of parsing revisions, thus improving Git's internal
> > organization and command structure clarity.
> >
> > ## **Benefits to the Community**
> >
> > ### **1. Improves `git rev-parse` Clarity and Maintainability**
> > - `git rev-parse` has accumulated various options unrelated to its
> > primary purpose of parsing revisions, particularly those for querying
> > low-level repository state and paths.
> > - This project **directly addresses this issue** by migrating these
> > options to a dedicated command, making `git rev-parse` cleaner and
> > easier to understand and maintain.
> > - Provides a **clearer separation of concerns** within Git's command suite.
> >
> > ### **2. Provides Reliable Access for Automation and Scripting**
> > - Scripts often need fundamental repository information like the
> > top-level directory path (`--show-toplevel`), the `.git` directory
> > location (`--git-dir`), or repository state (`--is-bare-repository`).
> > - Currently, scripts rely on `git rev-parse` for this, mixing
> > low-level repo queries with revision parsing calls.
> > - The new `git repo-info` command will offer a **stable, dedicated
> > interface** for retrieving this specific low-level information, making
> > scripts **cleaner and more robust** by calling the command designed
> > explicitly for these tasks.
> > - The default output will mimic the **existing, simple text format**
> > of the `rev-parse` options, ensuring compatibility for scripts
> > migrating to the new command.
> >
> > ### **3. Enhances CI/CD Pipeline Foundations**
> > - CI/CD pipelines frequently need to establish context by determining
> > the repository root or `.git` directory location early in their
> > execution.
> > - Using the dedicated `git repo-info` command for these foundational
> > queries **simplifies the initial setup steps** in pipeline scripts
> > compared to using the overloaded `git rev-parse`.
> >
>
> We must note that using 'git rev-parse' isn't sub-optimal. Matter of
> fact, if you look at 'builtin/rev-parse.c', you'll see that each of
> these flags are under an if..else clause. The goal of this project is
> more to provide a clean interface and a home for such information
> queries.
>

I've gone through the code in rev-parse.c and yes, I found the if blocks.
I take it you mean that git rev-parse is not meant to do these tasks,
but that it does them well.
If so, I think it makes the project easier, because we mainly have to
port similar code, then test and document it carefully.

> As such the main goals of the project, are very design oriented, as
> you've also mentioned, I would say:
> 1. What should the command be called?
> 2. What sub-commands should it support?
> 3. What options from 'git-rev-parse(1)' do we need to port, do they need
>    to be renamed?
> 4. What other options can we possibly provide?
>

Yup, I have not explored the second half of point 3 (do they need to be
renamed?). That is partly because the list of options to port is still
a bit ambiguous; we can discuss the internal names once we have a
detailed list of options to port into the `git info` command.

> > ## Deliverables
> >
> > Acknowledging the needs that the project scope is focused on
> > refactoring `git rev-parse`, this project will introduce a new Git
> > command, tentatively named `git repo-info`, serving as the designated
> > home for specific low-level query options.
> >
> > The key deliverables for this GSoC project include:
> >
> > 1. **New Core Command: `git repo-info`**
> > * A new `builtin/repo-info.c` command integrated into the Git source code.
> > * Implementation primarily in C, leveraging existing internal Git APIs
> > and logic currently within `rev-parse.c` to implement the relocated
> > options.
> >
> > 2. **Relocated `rev-parse` Options:**
> > * Implementation of the core functionality behind the following
> > options from `git-rev-parse(1)`'s "OPTIONS FOR FILES" section within
> > the new `git repo-info` command:
> > * **Path Queries:** `--show-cdup`, `--show-prefix`, `--show-toplevel`,
> > `--show-superproject-working-tree`
> > * **Directory Queries:** `--git-dir`, `--git-common-dir`,
> > `--resolve-git-dir <path>`
> > * **State/Format Queries:** `--is-inside-git-dir`,
> > `--is-inside-work-tree`, `--is-bare-repository`,
> > `--is-shallow-repository`
> > * **Index File Query:** `--shared-index-path`
> >
>
> Perhaps we want to breakdown so we can have:
>
> git info path [--cdup | --prefix | --toplevel ... ]
> git info repo [--is-bare | --is-shallow]
>
> and so on...

Ok, I will group these by usage.

>
> > 3. **Default Output Format (Text-Based):**
> > * The command's default output for each implemented option will
> > **match the current plain text output** produced by `git rev-parse`
> > for that same option, ensuring backward compatibility for scripts
> > migrating to the new command. Output will primarily be via standard C
> > functions like `printf` or `puts`.
>
> Since we'll keep the existing options within 'git-rev-parse(1)', we'll
> stay backwards compatible. Scripts which want to move to the new
> command, would anyway need to change the command, so there is no
> backward compatibility there. But, having the default to be human
> readable like the current implementation does make sense.
>

Ok, this raises a new question: if we are not aiming for backward
compatibility and are also aiming for machine readability, I think
using JSON from the start would make things more flexible. Do correct
me if I'm wrong.

Once people build tooling around the human-readable text, they
probably won't switch to JSON later, and JSON might prevent some
parsing errors. So I think I need some feedback on this.

> >
> > 4. **Comprehensive Documentation:**
> > * A clear man page (`git-repo-info.adoc`) explaining the new command's
> > purpose and detailing the usage and output of each implemented option.
> > * Updates to `git-rev-parse.adoc` to clearly **deprecate** the
> > relocated options (or mark them as aliases for compatibility) and
> > point users to the new `git repo-info` command.
> >
> > 5. **Robust Test Suite:**
> > * A new test script (`t/tXXXX-repo-info.sh`) using Git's test
> > framework (`test-lib.sh`).
> > * Tests specifically validating the output of `git repo-info --option`
> > against the output of `git rev-parse --option` across various
> > repository states (standard repo, bare repo, inside `.git`, inside
> > worktree, submodules, shallow clone etc.) to ensure functional parity.
> >
> > 6. **(Stretch Goal / Potential Future Work): Structured Output**
> > * If time permits after successfully implementing, documenting, and
> > testing the core text-based functionality, investigate adding a
> > `--format=json` option to provide a structured JSON output containing
> > the results of the requested queries. This is explicitly a secondary
> > goal, contingent on completing the primary refactoring task.
>
> Many of the plumbing commands in Git provide NUL-terminated output, I'm
> curious whether we should consider that over JSON.
>

I think we can use a flag to select the output format in the long
term.
Both of these have their own pros and cons. As I said, I think I need
some feedback on the format.

> > **Out of Scope for GSoC (Based on Refined Goal):**
> > * Querying high-level metadata like current branch name, HEAD commit
> > details (beyond `--is-shallow-repository`), remote URLs, tags, or
> > arbitrary configuration values.
>
> We should make sure that we don't overload this new command too.
>

Agreed. For this GSoC I will largely focus on porting the existing
flags in rev-parse to a new command.

> > * Complex status reporting (worktree dirtiness).
> > * Real-time monitoring or comparing metadata between revisions.
> > * Implementing JSON output as the *primary* feature.
> >

I think the same concern applies to the revision-related item above. I
will remove it from the proposal too, since it would also make the
command cluttered.

> > ## Technical Details
> >
> > This section outlines the proposed technical approach for implementing
> > the `git repo-info` command and relocating the specified options:
> >
> > 1. **Core `git repo-info` Command Implementation:**
> > * **Entry Point:** Create `builtin/repo-info.c` with
> > `cmd_repo_info(...)` function. Parse options using Git's
> > `parse-options` API.
> > * **Repository Context:** Utilize the standard `repo` structure and
> > `startup_info` provided by Git's infrastructure. Setup the repository
> > context similar to how `cmd_rev_parse` does it if needed (e.g., using
> > `setup_git_directory_gently`).
> > * **Reusing Logic:** Analyze the implementation of the target options
> > within `builtin/rev-parse.c`. Extract and adapt the relevant C
> > functions and logic (related to path manipulation using `prefix_path`,
> > `real_pathcmp`; repository state checks using
> > `is_bare_repository_cfg`, `is_inside_git_dir`, `is_inside_work_tree`;
> > accessing `startup_info`, `git_path`, etc.) into `builtin/repo-info.c`
> > or potentially shared helper functions if appropriate.
> > * **Specific Option Implementation:**
> > * `--show-toplevel`, `--show-cdup`, `--show-prefix`: Rely on the
> > `prefix` calculated during setup and path manipulation functions.
> > * `--git-dir`, `--git-common-dir`: Access `repo->gitdir`,
> > `repo->commondir` or use functions like `get_git_dir()`,
> > `get_common_dir()`. `--resolve-git-dir` will involve path resolution
> > relative to the provided argument.
> > * `--is-*` flags: Call existing helper functions like
> > `is_bare_repository_cfg()`, `is_inside_git_dir()`,
> > `is_inside_work_tree()`. `--is-shallow-repository` involves checking
> > `repo->is_shallow`.
> > * `--shared-index-path`: Access path information related to split
> > indexes if enabled.
> > * **Output Generation:** Use standard C `printf("%s\n", ...)` or
> > `puts(...)` to print the resulting string (path, "true"/"false", etc.)
> > to standard output, matching `rev-parse`'s current behavior. Boolean
> > flags typically exit `0` for true and `1` for false without output,
> > this behavior should be preserved.
> >
> > 2. **Documentation:**
> > * Create `Documentation/git-repo-info.adoc` using AsciiDoc format,
> > modeling it after existing man pages. Detail each option, its purpose,
> > and expected output.
> > * Modify `Documentation/git-rev-parse.adoc`, adding notes to the
> > relevant options indicating they are better handled by `git repo-info`
> > and potentially marking them for deprecation in a future Git version.
> >
> > 3. **Testing:**
> > * Create `t/tXXXX-repo-info.sh` using `test-lib.sh`.
> > * Structure tests using `test_expect_success` blocks.
> > * Utilize helper functions like `test_create_repo`, `cd repo`,
> > `test_cmp` to compare the output of `git repo-info --option` directly
> > against `git rev-parse --option` (for options producing output) or
> > against expected exit codes (for boolean flags).
> > * Cover edge cases like running outside a repository, in a bare
> > repository, deep within a worktree, within the `.git` directory, and
> > in repositories with submodules or worktrees.
> >
> > 4. **(Stretch Goal) JSON Output Implementation:**
> > * If attempted, add a `--format=json` option using `parse-options`.
> > * Collect results from the requested options internally.
> > * Use either an approved embedded C JSON library or Git's `strbuf` API
> > (with helpers like `strbuf_add_json_string`) to construct a JSON
> > object mapping option names (or descriptive keys) to their
> > corresponding values. Print the final JSON string to standard output.
> > Add specific tests for JSON output validation.
> >
> > ## Detailed Project Timeline
> >
> > **Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
> >
> > * **Focus:** Demonstrate continued interest and deepen understanding
> > *specifically of `rev-parse`'s internals* while awaiting results.
> > * **Activities:**
> > * **(April 9 - April 21):** Deep dive into `builtin/rev-parse.c`,
> > identifying the exact code blocks implementing the "OPTIONS FOR
> > FILES". Trace how they use `startup_info`, `prefix`, path functions,
> > and repository flags.
> > * **(April 22 - May 7):** Continue monitoring the mailing list. Refine
> > understanding of Git's testing framework, specifically focusing on
> > tests for `rev-parse` options (e.g., `t1006-cat-file.sh`,
> > `t5601-clone.sh` might use some flags). Review contribution
> > guidelines.
> >
> > **Phase 1: Final Planning (May 8 - May 26, 2025 Approx.)**
> >
> > * **Focus:** Formal introductions, confirm final scope & plan, setup.
> > * **Activities:**
> > * **(Week 1: May 8 - May 12):** Introduction with mentor(s). Confirm
> > the exact list of `rev-parse` options to be migrated. Discuss the
> > preferred approach for handling deprecation in `rev-parse` docs/code.
> > Discuss potential for shared helper functions vs. direct code
> > migration.
> > * **(Week 2: May 13 - May 19):** Set up dev environment. Deep dive
> > into the agreed-upon functions/code blocks within `rev-parse.c`.
> > Outline the basic structure for `builtin/repo-info.c` and the test
> > script `t/tXXXX-repo-info.sh`.
> > * **(Week 3: May 20 - May 26):** Implement the basic `cmd_repo_info`
> > skeleton, option parsing setup, and repository setup boilerplate.
> > Write initial "no-op" tests. Post first blog update.
> >
> > **Phase 2: Implementation in Batches (Coding Weeks 1-8: May 27 - July
> > 21, 2025 Approx.)**
> >
> > * **Focus:** Implement options in logical groups, test thoroughly,
> > submit patches early and often.
> > * **GSoC Milestone:** Midterm Evaluations occur around Week 8.
> > * **Activities:**
> > * **(Batch 1 / Weeks 1-2: May 27 - June 9):** Implement basic path
> > queries: `--show-toplevel`, `--show-prefix`, `--show-cdup`. Add tests
> > comparing output with `rev-parse`. **Submit Patch Series 1**.
> > * **(Batch 2 / Weeks 3-4: June 10 - June 23):** Implement directory
> > queries: `--git-dir`, `--git-common-dir`, `--resolve-git-dir <path>`.
> > Add tests. **Submit Patch Series 2**. Write blog post update.
> > * **(Batch 3 / Weeks 5-6: June 24 - July 7):** Implement boolean state
> > queries: `--is-bare-repository`, `--is-inside-git-dir`,
> > `--is-inside-work-tree`. Add tests checking exit codes and behavior in
> > various locations. **Submit Patch Series 3**.
> > * **(Batch 4 / Weeks 7-8: July 8 - July 21):** Implement remaining
> > queries: `--is-shallow-repository`, `--shared-index-path`,
> > `--show-superproject-working-tree`. Add comprehensive tests covering
> > interactions (e.g., in submodules, shallow clones). **Submit Patch
> > Series 4**. Prepare for Midterm evaluation; ensure submitted batches
> > demonstrate core progress. Write blog post update.
> >
>
> It is nice to see breakdown in batches.
>

Thank you.

> > **Phase 3: Documentation & Final Polish (Coding Weeks 9-12: July 22 -
> > Aug 18, 2025 Approx.)**
> >
> > * **Focus:** Create documentation, address feedback on all patches,
> > refine implementation, potentially attempt stretch goal.
>
> I would say documentation should go hand-in-hand with each patch series
> that you send. Ideally every patch series should leave the code base in
> a usable state. Pushing documentation to the end, would mean that if the
> project is stopped midway, we'd have a running command in Git with no
> documentation about what it does.
>

Ok, I will add documentation in parallel with the batches above. Nice point.

> > * **Activities:**
> > * **(Week 9: July 22 - July 28):** Write the first complete draft of
> > the man page for `git-repo-info`. Draft the necessary updates for
> > `git-rev-parse.adoc` (deprecation notices). **Submit Patch Series 5
> > (Documentation)**.
> > * **(Week 10: July 29 - Aug 4):** Focus on addressing review comments
> > on **all** previous patch series. Refactor code based on feedback.
> > Ensure test suite is robust and covers feedback points.
> > * **(Week 11: Aug 5 - Aug 11):** *Stretch Goal (Conditional):* If core
> > functionality and docs are stable and reviewed positively, begin
> > investigating/implementing `--format=json`. Add specific JSON tests if
> > implemented. Otherwise, focus on further code cleanup and test
> > hardening.
> > * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final versions of
> > all patch series, incorporating all feedback. Final testing pass.
> > Write blog post update summarizing progress and final state. Code
> > freeze for final evaluation.
> >
> > **Phase 4: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**
> >
> > * **Focus:** Final submissions, respond to late feedback, ensure
> > project completion.
> > * **Official GSoC Milestone:** November 19, 2025 - Program End Date.
> > * **Activities:**
> > * **(Late Aug - Sept):** Submit final GSoC evaluations. Actively
> > respond to any further comments on submitted patches from the
> > community/maintainers, aiming for merge readiness.
> > * **(Oct - Nov 19):** Monitor mailing list for patch status. Write
> > final GSoC project summary blog post. Continue engaging with the
> > community if interested in further contributions beyond GSoC.
> >
> >
> >
> > Thank You,
> > Jayatheerth
>
> Thanks for your proposal!
>
> - Karthik

Thank you again, Karthik. I will send the updated proposal in this thread soon.

-Jayatheerth


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-06  5:40         ` JAYATHEERTH K
@ 2025-04-06  6:09           ` JAYATHEERTH K
  2025-04-06 18:08             ` Kaartic Sivaraam
  0 siblings, 1 reply; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-06  6:09 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, git, Ghanshyam Thakkar

# Proposal for GSOC 2025 to Git
**Machine-Readable Repository Information Query Tool**

## Contact Details
* **Name**: K Jayatheerth
* **Email**: jayatheerthkulkarni2005@gmail.com
* **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
* **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)

## Prerequisites & Experience


As part of the GSoC application prerequisites, I have engaged with the
Git community and initiated a microproject. This involved **updating
`MyFirstContribution.adoc` to match the modern codebase**, providing
valuable experience with Git's codebase structure (documentation
files), the contribution workflow (patch submission using
`git send-email`, addressing feedback across versions), and
communication via the mailing list.

* **Microproject Status:** v4 submitted, incorporating feedback,
awaiting further review.
* **Microproject Patch Series:** [Main mail
thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
(Link to the most relevant thread demonstrating interaction and
successful patch refinement)
* **Initial Patch:** [First
Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@gmail.com/t/#u)
* **Mailing List Introduction:** [First
Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@mail.gmail.com/t/#u)
* **Blog:** My GSoC blog details these interactions:
[Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)

## **Synopsis**

This project focuses on **refactoring Git by creating a dedicated
command (tentatively named `git info`, subject to further discussion)
to house the low-level repository, path, and format-related query
options currently misplaced under the "OPTIONS FOR FILES" section of
`git-rev-parse(1)`**. This new command, potentially using a subcommand
structure (e.g., `git info path`, `git info repo`), will provide a
more logical and maintainable location for this functionality. This
allows `git rev-parse` to better focus on its core purpose of parsing
revisions, ultimately improving Git's internal organization and
command structure clarity by offering a **cleaner interface** for
these specific queries.

## **Benefits to the Community**

### **1. Improves `git rev-parse` Clarity and Maintainability**
- `git rev-parse` has accumulated various options unrelated to its
primary purpose of parsing revisions, particularly those for querying
low-level repository state and paths.
- This project **directly addresses this issue** by migrating these
options to a dedicated, purpose-built command, making `git rev-parse`
cleaner and easier to understand and maintain.
- Provides a **clearer separation of concerns** within Git's command suite.

### **2. Provides Reliable Access for Automation and Scripting**
- Scripts often need fundamental repository information like the
top-level directory path, the `.git` directory location, or repository
state.
- Currently, scripts rely on `git rev-parse` for this, invoking it for
tasks outside its core revision-parsing role.
- The new `git info` command will offer a **stable, dedicated, and
cleaner interface** for retrieving this specific low-level
information, making scripts **more robust and readable** by calling
the command designed explicitly for these tasks.



## Deliverables

This project will introduce a new Git command, **tentatively named
`git info`**, serving as the designated home for specific low-level
query options migrated from `git rev-parse`. The implementation will
likely adopt a **subcommand structure**.

The key deliverables for this GSoC project include:

1. **New Core Command: `git info` with Subcommands**
* A new `builtin/info.c` command integrated into the Git source code.
* Implementation primarily in C, using `parse-options` to handle
**subcommands** (e.g., `path`, `repo`, `misc`) and their specific
options.
* Leverages existing internal Git APIs and logic currently within `rev-parse.c`.

2. **Relocated `rev-parse` Options under Subcommands:**
* Implementation of the core functionality behind selected options
from `git-rev-parse(1)`'s "OPTIONS FOR FILES" section, organized under
appropriate subcommands within `git info`. *(Specific options and
subcommand grouping subject to final confirmation with mentor)*:
* **`git info path ...` (Example Grouping):**
* `--show-cdup` -> `git info path --cdup` (or similar)
* `--show-prefix` -> `git info path --prefix`
* `--show-toplevel` -> `git info path --toplevel`
* `--show-superproject-working-tree` -> `git info path --superproject-worktree`
* **`git info repo ...` (Example Grouping):**
* `--git-dir` -> `git info repo --git-dir`
* `--git-common-dir` -> `git info repo --common-dir`
* `--resolve-git-dir <path>` -> `git info repo --resolve-dir <path>`
* `--is-bare-repository` -> `git info repo --is-bare`
* `--is-shallow-repository` -> `git info repo --is-shallow`
* **`git info misc ...` (Example Grouping for others):**
* `--is-inside-git-dir` -> `git info misc --inside-gitdir`
* `--is-inside-work-tree` -> `git info misc --inside-worktree`
* `--shared-index-path` -> `git info misc --shared-index-path`
* *(Design Consideration):* Option names within subcommands might be
slightly adjusted for clarity/consistency (e.g., dropping "show-").
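
As a rough illustration of the example grouping above, the mapping
could be kept in a single lookup table. This is only a sketch: the
struct and function names (`info_mapping`, `find_info_mapping`) are
hypothetical, it does not use any of Git's internal APIs, and the
subcommand and option names are the tentative ones listed above, still
subject to confirmation with the mentor.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: names are taken from the tentative grouping in
 * this proposal, not from any existing Git code.
 */
struct info_mapping {
	const char *rev_parse_opt; /* existing `git rev-parse` option */
	const char *subcommand;    /* proposed `git info` subcommand */
	const char *info_opt;      /* proposed option under that subcommand */
};

static const struct info_mapping info_mappings[] = {
	{ "--show-cdup", "path", "--cdup" },
	{ "--show-prefix", "path", "--prefix" },
	{ "--show-toplevel", "path", "--toplevel" },
	{ "--show-superproject-working-tree", "path", "--superproject-worktree" },
	{ "--git-dir", "repo", "--git-dir" },
	{ "--git-common-dir", "repo", "--common-dir" },
	{ "--resolve-git-dir", "repo", "--resolve-dir" },
	{ "--is-bare-repository", "repo", "--is-bare" },
	{ "--is-shallow-repository", "repo", "--is-shallow" },
	{ "--is-inside-git-dir", "misc", "--inside-gitdir" },
	{ "--is-inside-work-tree", "misc", "--inside-worktree" },
	{ "--shared-index-path", "misc", "--shared-index-path" },
};

/* Return the proposed mapping for a rev-parse option, or NULL if none. */
const struct info_mapping *find_info_mapping(const char *rev_parse_opt)
{
	size_t i;

	for (i = 0; i < sizeof(info_mappings) / sizeof(info_mappings[0]); i++)
		if (!strcmp(info_mappings[i].rev_parse_opt, rev_parse_opt))
			return &info_mappings[i];
	return NULL;
}
```

A table like this could also help generate the cross-references in the
`git-rev-parse` documentation, though that is just one possible use.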

3. **Multiple Output Formats:**
* **Default Text Output:** The default output for each implemented
option will be simple, human-readable text, **matching the semantics
and format** produced by the corresponding `git rev-parse` option
(e.g., printing a path string, or "true"/"false" for boolean checks).
* **NUL Termination (`-z`):** Implement a `-z` option (standard across
many Git plumbing commands) for unambiguous, newline-safe output
suitable for scripting, particularly for path-related options.
* **JSON Output (`--json`):** Implement a `--json` option to provide
structured output, mapping query keys (derived from options) to their
values. This offers maximum flexibility for tools consuming the
information. *(The relative priority and implementation details of
`-z` vs `--json` to be discussed with mentor, but both are considered
core deliverables)*.
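
To make the difference between the default text output and `-z` concrete, here is a minimal standalone sketch of a record formatter. The function name `format_record` and its buffer-based interface are assumptions made for illustration; the actual command would most likely write to stdout directly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy value into dst followed by a single terminator byte: '\0' when
 * nul_terminated is non-zero (the proposed -z mode), '\n' otherwise.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
size_t format_record(char *dst, size_t dstlen, const char *value,
		     int nul_terminated)
{
	size_t len = strlen(value);

	if (len + 1 > dstlen)
		return 0;
	memcpy(dst, value, len);
	dst[len] = nul_terminated ? '\0' : '\n';
	return len + 1;
}
```

With a path such as `"a\nb"`, the text form is ambiguous because the embedded newline looks like a record boundary, while the NUL-terminated form keeps the newline as data and ends the record at the NUL byte.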

4. **Comprehensive Documentation (Incremental):**
* A clear man page (`git-info.adoc`) explaining the new command's
purpose, the subcommand structure, and detailing the usage, options
(including `-z`, `--json`), and output formats for each implemented
feature. **Relevant sections of the man page will be added or updated
within each patch series submitted.**
* Updates to `git-rev-parse.adoc` to clearly **document the
relationship** with `git info` for the migrated options (e.g., noting
that `git info` is the preferred command) and potentially marking them
for deprecation. **These updates will also be included incrementally
with relevant patch series.**

5. **Robust Test Suite (Incremental):**
* A new test script (`t/tXXXX-info.sh`) using Git's test framework
(`test-lib.sh`).
* Tests covering the subcommand structure, each implemented option,
and **all output formats** (`text`, `-z`, `--json`).
* Tests validating behavior across various repository states
(standard, bare, inside `.git`, inside worktree, submodules, shallow
clone etc.). **New tests will be added within each patch series for
the features implemented.**


## Technical Details


1. **Core `git info` Command Implementation:**
* **Entry Point:** Create `builtin/info.c` with `cmd_info(...)`. Use
`parse-options` to parse the **subcommand** first. Based on the
subcommand, invoke a specific helper function (e.g.,
`cmd_info_path()`, `cmd_info_repo()`) which then uses `parse-options`
again to handle the options specific to that subcommand.
* **Repository Context:** Standard setup using `repo` structure,
`startup_info`, and potentially `setup_git_directory_gently`.
* **Reusing Logic:** Adapt logic from `builtin/rev-parse.c` for the
core functionality of each option. This might involve direct code
migration or creating shared helper functions where appropriate.
* **Subcommand Implementation:** Implement helper functions for each
subcommand (`path`, `repo`, `misc`) containing the `parse_options`
calls and logic for the options within that group.
* **Output Generation:**
  * **Text (Default):** Use `printf("%s\n", ...)` / `puts(...)` for
    string output; print "true"/"false" or use `exit(0)` / `exit(1)` for
    boolean checks, mimicking `rev-parse`.
  * **NUL (`-z`):** Use `putchar('\0')` or `fwrite(..., 1, 1, stdout)`
    instead of newline for string output when `-z` is active. Boolean
    checks likely remain exit-code based.
  * **JSON (`--json`):** Collect results internally. Use Git's `strbuf`
    API (with `strbuf_add_json_string` etc.) or potentially an approved C
    JSON library to construct and print a JSON object mapping keys to
    values. All requested info within a single invocation should ideally
    be combined into one JSON object.
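
As a rough illustration of the "collect results, then emit one JSON object" idea, the following standalone sketch builds an object from key/value pairs with basic string escaping. It deliberately avoids Git's internal APIs so that it can be compiled on its own; `struct info_entry`, `json_append_string` and `json_build_object` are hypothetical names, and every value is emitted as a JSON string (real boolean results would more naturally be `true`/`false`). Bytes at or above 0x80 are passed through unchanged, which assumes the path is valid UTF-8.

```c
#include <stdio.h>
#include <string.h>

struct info_entry {
	const char *key;
	const char *value;
};

/* Append s to the NUL-terminated string in dst; -1 if it does not fit. */
static int append(char *dst, size_t dstlen, const char *s)
{
	size_t used = strlen(dst), add = strlen(s);

	if (used + add + 1 > dstlen)
		return -1;
	memcpy(dst + used, s, add + 1);
	return 0;
}

/* Append src to dst as a quoted, escaped JSON string. */
int json_append_string(char *dst, size_t dstlen, const char *src)
{
	const unsigned char *p;
	char esc[8];

	if (append(dst, dstlen, "\""))
		return -1;
	for (p = (const unsigned char *)src; *p; p++) {
		if (*p == '"' || *p == '\\')
			snprintf(esc, sizeof(esc), "\\%c", *p);
		else if (*p == '\n')
			snprintf(esc, sizeof(esc), "\\n");
		else if (*p == '\t')
			snprintf(esc, sizeof(esc), "\\t");
		else if (*p < 0x20)
			snprintf(esc, sizeof(esc), "\\u%04x", (unsigned)*p);
		else
			snprintf(esc, sizeof(esc), "%c", *p);
		if (append(dst, dstlen, esc))
			return -1;
	}
	return append(dst, dstlen, "\"");
}

/* Write {"k1":"v1","k2":"v2",...} into dst; 0 on success, -1 on overflow. */
int json_build_object(char *dst, size_t dstlen,
		      const struct info_entry *entries, size_t nr)
{
	size_t i;

	if (!dstlen)
		return -1;
	dst[0] = '\0';
	if (append(dst, dstlen, "{"))
		return -1;
	for (i = 0; i < nr; i++) {
		if (i && append(dst, dstlen, ","))
			return -1;
		if (json_append_string(dst, dstlen, entries[i].key) ||
		    append(dst, dstlen, ":") ||
		    json_append_string(dst, dstlen, entries[i].value))
			return -1;
	}
	return append(dst, dstlen, "}");
}
```

Combining all requested keys into one object, rather than printing one object per option, lets a consumer parse the output of a single invocation in one step.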

2. **Documentation:**
* Create `Documentation/git-info.adoc`. Structure based on
subcommands. Detail each subcommand and its options, including `-z`
and `--json` behavior.
* Modify `Documentation/git-rev-parse.adoc` to add cross-references
for relevant options.
* **Documentation updates will accompany the code changes in each
patch series.**

3. **Testing:**
* Create `t/tXXXX-info.sh`.
* Use `test_expect_success` with helpers like `test_create_repo`,
`test_cmp`, `test_must_fail`.
* Add tests for:
* Correct subcommand parsing and error handling.
* Each option under its subcommand, comparing **text output** against
`rev-parse` (where applicable) or expected values/exit codes.
* **`-z` output** using appropriate comparison methods (e.g., piping
to `tr '\0' '\n'`).
* **`--json` output** using tools like `jq` (if available in test env)
or careful `grep`/`sed` checks for structure and values.
* **Tests will be added incrementally with the features in each patch series.**
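
For the `-z` checks, it may help to picture what a consumer of NUL-terminated output has to do. The sketch below splits a buffer into records on NUL bytes; `split_nul_records` is a hypothetical helper written for illustration and is not part of Git's test framework, where the comparison would be done in shell as described above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scan buf[0..len) for NUL-terminated records and store a pointer to
 * the start of each one in out[], up to max entries. Bytes after the
 * last NUL are ignored because they do not form a complete record.
 * Returns the number of records found.
 */
size_t split_nul_records(const char *buf, size_t len,
			 const char **out, size_t max)
{
	size_t nr = 0, start = 0, i;

	for (i = 0; i < len && nr < max; i++) {
		if (buf[i] == '\0') {
			out[nr++] = buf + start;
			start = i + 1;
		}
	}
	return nr;
}
```

Unlike a newline-based split, a record such as `"sub\ndir/"` comes back intact here, which is the property the `-z` tests are meant to protect.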

## Detailed Project Timeline


**Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
*

**Phase 1: Community Bonding & Final Planning (May 8 - May 26, 2025 Approx.)**
* **Focus:** Formal introductions, finalize scope, agree on command
structure, setup.
* **Activities:**
* **(Week 1: May 8 - May 12):** Discuss proposal with mentor(s). Finalize:
  * Command name (`git info` or alternative).
  * Subcommand structure and grouping of options.
  * Exact list of options to port, and any necessary renaming within subcommands.
  * Approach for handling relationship with `rev-parse` (deprecation vs.
    aliasing vs. simple documentation cross-link).
  * Prioritization/approach for implementing `-z` and `--json`.
* **(Week 2: May 13 - May 19):** Set up dev environment. Deep dive
into agreed-upon code blocks in `rev-parse.c`. Outline
`builtin/info.c` structure including subcommand handlers. Outline
initial test script `t/tXXXX-info.sh`.
* **(Week 3: May 20 - May 26):** Implement basic `cmd_info` skeleton,
top-level subcommand parsing, repository setup. Implement one simple
subcommand handler (e.g., `cmd_info_path`) with basic option parsing
structure. Write initial "no-op" / basic structure tests. Post first
blog update.

**Phase 2: Implementation in Batches (Coding Weeks 1-8: May 27 - July
21, 2025 Approx.)**
* **Focus:** Implement options within subcommands, including
documentation and tests for text output first, then potentially add
machine-readable formats. Submit patches early and often.
* **GSoC Milestone:** Midterm Evaluations occur around Week 8.
* **Activities:** *(Structure assumes implementing text output first,
then `-z`/`--json` later in the phase)*
* **(Batch 1 / Weeks 1-2: May 27 - June 9):** Implement `path`
subcommand options (`--toplevel`, `--prefix`, `--cdup`). Implement
**text output**. Add corresponding **tests** and **documentation**
snippets (for `git-info.adoc` and `git-rev-parse.adoc`). **Submit
Patch Series 1**.
* **(Batch 2 / Weeks 3-4: June 10 - June 23):** Implement `repo`
subcommand options (`--git-dir`, `--common-dir`, `--resolve-dir`,
`--is-bare`). Implement **text output**. Add **tests** and
**documentation** snippets. **Submit Patch Series 2**. Write blog post
update.
* **(Batch 3 / Weeks 5-6: June 24 - July 7):** Implement remaining
`repo` (`--is-shallow`) and `misc` subcommand options
(`--inside-gitdir`, `--inside-worktree`, `--shared-index-path`,
`--superproject-worktree` - *adjust subcommand grouping based on final
plan*). Implement **text output**. Add **tests** and
**documentation**. **Submit Patch Series 3**.
* **(Batch 4 / Weeks 7-8: July 8 - July 21):** Implement **`-z` and
`--json` output formats** for all options added in Batches 1-3. Add
comprehensive **tests** for these formats. Update **documentation** to
fully describe `-z` and `--json` behavior. **Submit Patch Series 4**.
Prepare for Midterm evaluation; ensure submitted batches show
substantial progress on core functionality and formats. Write blog
post update.

**Phase 3: Refinement & Final Polish (Coding Weeks 9-12: July 22 - Aug
18, 2025 Approx.)**
* **Focus:** Address feedback on all patches, ensure robustness,
finalize documentation consistency.
* **Activities:**
* **(Week 9: July 22 - July 28):** Focus on addressing review comments
on **all** previous patch series (Code, Tests, Docs). Refactor based
on feedback.
* **(Week 10: July 29 - Aug 4):** Continue addressing feedback. Ensure
the test suite is robust, covers edge cases identified in reviews.
Perform thorough documentation review for consistency and clarity
across the entire man page.
* **(Week 11: Aug 5 - Aug 11):** Final code cleanup. Final pass on
test coverage. *(Stretch Goal Idea):* If all core work is stable and
time permits, potentially explore adding one or two *new*, simple,
agreed-upon repo info queries (not from `rev-parse`) that fit the
command's purpose.
* **(Week 12: Aug 12 - Aug 18):** Prepare and submit final versions of
all patch series, incorporating all feedback. Final self-testing.
Write blog post update summarizing progress and final state. Code
freeze for final evaluation.

**Phase 4: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**
* Write final GSoC project summary blog post. Continue engaging with
the community through further contributions beyond GSoC.

Thank you,
Jayatheerth


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-06  6:09           ` JAYATHEERTH K
@ 2025-04-06 18:08             ` Kaartic Sivaraam
  2025-04-07  2:32               ` JAYATHEERTH K
  0 siblings, 1 reply; 12+ messages in thread
From: Kaartic Sivaraam @ 2025-04-06 18:08 UTC (permalink / raw)
  To: JAYATHEERTH K, Karthik Nayak; +Cc: Patrick Steinhardt, git, Ghanshyam Thakkar

Hi Jayatheerth,

On 06/04/25 11:39, JAYATHEERTH K wrote:
> # Proposal for GSOC 2025 to Git
> **Machine-Readable Repository Information Query Tool**
> 

Thank you for your proposal! Just wanted to send in a gentle reminder
that the proposal submission deadline is April 8 18:00 UTC, so there is
very little time left. You may want to consider submitting your
current proposal and using the option provided on the GSoC website to
update it before the due date (if necessary). This would help
avoid a last-minute rush.

Feel free to let me know in case you face any difficulty with submitting 
your proposal.

--
Sivaraam


* Re: [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool
  2025-04-06 18:08             ` Kaartic Sivaraam
@ 2025-04-07  2:32               ` JAYATHEERTH K
  0 siblings, 0 replies; 12+ messages in thread
From: JAYATHEERTH K @ 2025-04-07  2:32 UTC (permalink / raw)
  To: Kaartic Sivaraam
  Cc: Karthik Nayak, Patrick Steinhardt, git, Ghanshyam Thakkar

Hey Sivaraam,

On Sun, Apr 6, 2025 at 11:38 PM Kaartic Sivaraam
<kaartic.sivaraam@gmail.com> wrote:
>
> Hi Jayatheerth,
>
> On 06/04/25 11:39, JAYATHEERTH K wrote:
> > # Proposal for GSOC 2025 to Git
> > **Machine-Readable Repository Information Query Tool**
> >
>
> Thank you for your proposal! Just wanted to send in a gentle reminder
> that the proposal submission deadline is April 8 18:00 UTC, so there is
> very little time left. You may want to consider submitting your
> current proposal and using the option provided on the GSoC website to
> update it before the due date (if necessary). This would help
> avoid a last-minute rush.

Got it. I have uploaded the current version and will update it based on the feedback.

>
> Feel free to let me know in case you face any difficulty with submitting
> your proposal.
>

Sure, thank you!

> --
> Sivaraam

-Jayatheerth



Thread overview: 12+ messages
2025-03-31 14:51 [GSOC] [Proposal v1] Machine-Readable Repository Information Query Tool JAYATHEERTH K
2025-03-31 14:59 ` JAYATHEERTH K
2025-04-03 10:23 ` Patrick Steinhardt
2025-04-03 14:10   ` JAYATHEERTH K
2025-04-03 14:35     ` JAYATHEERTH K
2025-04-05 19:42       ` Karthik Nayak
2025-04-06  5:40         ` JAYATHEERTH K
2025-04-06  6:09           ` JAYATHEERTH K
2025-04-06 18:08             ` Kaartic Sivaraam
2025-04-07  2:32               ` JAYATHEERTH K
2025-04-04  9:13     ` Patrick Steinhardt
2025-04-04 13:22       ` JAYATHEERTH K
