* [GSoC] Proposal draft: Improve the new git repo command
From: Jialong Wang @ 2026-03-16 11:47 UTC
To: git; +Cc: Jialong Wang
Hi,
I plan to apply to Git for GSoC 2026, and I would like to share a draft
proposal for feedback.
The project I am currently most interested in is improving the new
`git repo` command, with a primary focus on extending `git repo info`
with path-related repository metadata.
My draft is below. I would appreciate feedback on whether this scope
looks reasonable, and which parts of the current `git repo` work would
make the best starting point.
Thanks,
Jialong
---
# Improve `git repo info` by adding repository path metadata
## Name
Jialong Wang
## Email
jerrywang183@yahoo.com
## Preferred project size
175 hours
## About me
My name is Jialong Wang, and I plan to apply to Git for GSoC 2026.
I have been getting familiar with Git’s development workflow by building
Git from source, reading the contribution documents, and working on a
microproject. As part of that process, I prepared and sent a patch to
the Git mailing list.
I am interested in the new `git repo` command because it is user-facing,
but also closely tied to Git’s internal repository model. That makes it
a good fit for the kind of work I want to do: understanding existing
code, discussing design details on the mailing list, and implementing
improvements in small, reviewable patches.
Relevant links:
- Microproject discussion thread:
https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
- Microproject patch thread:
https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
- SoC 2026 idea page:
https://git.github.io/SoC-2026-Ideas/
## Project summary
I would like to work on improving the new `git repo` command, with a
primary focus on `git repo info`.
The `git repo` command was introduced to provide a cleaner interface for
querying repository metadata. However, several useful path-related
values are still mainly accessed through `git rev-parse` and
`git rev-parse --git-path`. My proposal is to extend `git repo info`
so that it can expose a selected set of those values in a more
structured form.
The goal is not to replace `git rev-parse`, but to make `git repo info`
more useful as a structured interface for repository path metadata.
## Motivation
Today, scripts and tools still often rely on commands such as:
git rev-parse --git-dir
git rev-parse --show-toplevel
git rev-parse --git-path <path>
These commands are useful, but they were not primarily designed as a
structured repository metadata interface.
Since `git repo info` already exists for this purpose, extending it with
path-related values would make repository layout information easier to
query in a cleaner and more consistent way.
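As a rough illustration of the status quo, a script today might gather these values with one `git rev-parse` call per value, as in the sketch below. It uses only existing `git rev-parse` options; the function name `repo_paths` and the labels printed on the left are placeholders chosen for this example and are not meant to be actual `git repo info` key names.

```shell
#!/bin/sh
# Sketch: collect repository path values the way scripts commonly do
# today, with one `git rev-parse` invocation per value.
# The labels (git-dir, common-dir, ...) are illustrative only and are
# NOT real `git repo info` keys.
repo_paths () {
	# $1: a directory inside a repository
	(
		cd "$1" || exit 1
		printf 'git-dir=%s\n' "$(git rev-parse --git-dir)"
		printf 'common-dir=%s\n' "$(git rev-parse --git-common-dir)"
		printf 'toplevel=%s\n' "$(git rev-parse --show-toplevel)"
		printf 'hooks=%s\n' "$(git rev-parse --git-path hooks)"
	)
}
```

Each value needs its own option and its own process, and the caller has to know which options exist. That is the part a structured `git repo info` query could simplify.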
I think this is a good GSoC project because it has clear user value, can
be implemented incrementally, and naturally fits Git’s patch-and-review
workflow.
## Current context
I am aware that work on path-related `git repo info` fields has already
started. There have already been patch series for path keys, category
requests, and path formatting. Because of that, I do not want to assume
that the work described on the ideas page is still untouched.
One of my first goals during the bonding period would be to review the
current state of these discussions carefully, identify what remains
open, and refine the project scope based on maintainer feedback. I would
rather build on the current direction than duplicate work that is
already in progress.
I also think this project should be scoped carefully. The ideas page
mentions improvements to both `git repo info` and `git repo structure`,
but for a GSoC project I believe it is more realistic to focus first on
`git repo info` and only expand beyond that if the main work is in good
shape.
## Proposed work
The main objective of this project is to extend `git repo info` with
selected repository path values that are currently obtained through
`git rev-parse` and `git rev-parse --git-path`.
The work will involve:
1. Studying the current implementation of `git repo info`.
2. Comparing its current output with commonly used `git rev-parse`
path queries.
3. Identifying a first small set of missing path-related values to add.
4. Discussing output design on the mailing list, especially where there
are open questions about relative versus absolute paths.
5. Implementing the agreed functionality through small patch series.
6. Adding tests covering the new behavior.
7. Updating documentation if needed.
## Initial scope
The first stage of the project would focus on a small set of commonly
used repository path values, for example:
- `git-dir`
- `common-dir`
- `toplevel`
- `superproject-working-tree`
I think these are a good first target because they are already familiar
to users through `git rev-parse`, and they provide immediate practical
value without requiring a large interface expansion.
Depending on project progress and mailing list feedback, I would then
like to extend support to selected values currently accessed through
`git rev-parse --git-path`, such as:
- index file
- objects directory
- hooks directory
I do not want to promise every possible path-related key up front. I
would rather start with the most straightforward and useful values, get
feedback early, and continue from there.
## My approach to scope and quality
One thing I would like to be careful about is not treating this project
as a simple checklist of fields to add.
I think the quality of the project will depend on three things:
1. choosing a small set of fields that make sense together,
2. agreeing on a consistent path representation,
3. and making sure the result fits naturally into the existing `git repo`
design rather than becoming a thin wrapper over `git rev-parse`.
Because of that, I would prefer to make progress in a few coherent
batches instead of adding many unrelated keys at once.
I also think it is important to keep room for scope reduction. If some
part of the design turns out to be more controversial than expected,
I would prefer to complete a smaller, cleaner set of path fields rather
than stretching the project too broadly.
## Technical approach
The implementation of `git repo` is primarily in `builtin/repo.c`. The
first step would be to understand how `git repo info` currently collects
and prints repository metadata, and how that existing structure can be
extended without making the interface inconsistent.
Many relevant repository paths are already available internally through
helpers such as:
- `repo_get_git_dir()`
- `repo_get_common_dir()`
- `repo_get_work_tree()`
Similarly, `git rev-parse --git-path` already relies on existing path
resolution logic. So the work is not about inventing these values from
scratch, but about exposing a selected subset of them through
`git repo info` in a way that fits its current design.
The first implementation step would be to map existing helpers and path
resolution logic to a small set of `repo info` fields. After that, I
would extend the output code in `builtin/repo.c` to report those fields
in a consistent way.
One of the main design questions is path formatting. The ideas page
explicitly mentions the need to decide between relative and absolute
paths. I do not want to assume the answer in advance. Instead, I would
review the current discussion, compare the behavior of existing
commands, and propose a small, consistent approach on the mailing list.
I also expect that some preparatory cleanup or refactoring may be useful
before adding new fields. If so, I would keep that work minimal and send
it as small separate patches.
## Patch strategy
I expect the implementation to be divided into small patches so that
each change can be reviewed independently.
A likely patch strategy would be:
1. small preparatory cleanup if needed
2. add support for a first path-related key or a very small set of keys
3. extend support with additional related keys
4. add or refine tests for the new behavior
5. update documentation if necessary
If existing in-progress series already cover some of these parts, I
would adjust the breakdown accordingly and focus on what remains useful
and open.
## Tests
Tests would be added to cover the new behavior in common repository
setups.
Depending on the exact scope agreed on, test cases may include:
- ordinary repositories
- linked worktrees
- superproject/submodule cases
- cases where path values differ from simple defaults
I would keep the tests focused on observable behavior instead of
overfitting them to a particular implementation detail.
## What I will not try to do
To keep the project realistic, I do not plan to:
- redesign all of `git repo`
- fully replace `git rev-parse`
- implement every possible repository path query
- work on both `git repo info` and `git repo structure` at full scope in
the same project
The project should stay focused on a well-defined subset of path-related
metadata for `git repo info`.
## Expected deliverables
By the end of the project, I expect to deliver:
- support for a useful set of path-related values in `git repo info`
- tests covering the new functionality
- documentation updates if needed
- one or more patch series discussed and refined on the Git mailing list
## Timeline
### Community bonding period
- Study `builtin/repo.c` and the current `git repo info` implementation
- Review recent and ongoing mailing list discussions related to `git repo`
- Compare current `git repo info` behavior with `git rev-parse`
- Refine the exact scope with mentors and mailing list feedback
### Phase 1
- Implement a first small batch of path-related values
- Send the first patch series
- Address review comments
- Add tests for the first batch
### Phase 2
- Implement additional agreed path values
- Continue design discussion if needed
- Refine implementation and tests based on review feedback
### Phase 3
- Complete remaining agreed work
- Update documentation if necessary
- Rework earlier patches if needed for consistency
- Prepare a final summary of the work
### Buffer time
- Handle review delays
- Fix regressions or edge cases
- Narrow scope if some planned work turns out to be too large
## Risks and mitigation
One risk is that design discussion may take longer than expected,
especially around path representation and output structure.
To reduce that risk, I would keep the patch series small and prioritize
the least controversial values first.
Another risk is overlap with work already in progress. If that happens,
I would adjust the project scope to avoid duplication and focus on what
is still useful and open.
## Why I think I am a good fit
I have already started learning Git’s normal contribution workflow
through a microproject, including building Git from source, running
tests, preparing a patch, and sending it to the mailing list.
This project fits the kind of work I want to do in Git: understanding
existing code, discussing interface details on the mailing list, and
implementing improvements incrementally in small patches.
## References
- SoC 2026 idea page:
https://git.github.io/SoC-2026-Ideas/
- General application information:
https://git.github.io/General-Application-Information/
- `git repo` documentation:
https://git-scm.com/docs/git-repo
- `git rev-parse` documentation:
https://git-scm.com/docs/git-rev-parse
- `git-sizer` project:
https://github.com/github/git-sizer
- Recent patch series adding path-related support to `git repo`:
https://public-inbox.org/git/20260228224252.72788-1-lucasseikioshiro@gmail.com/
- More recent work-in-progress series for category/path keys and
`--path-format`:
https://public-inbox.org/git/pull.2208.v6.git.git.1772428548.gitgitgadget@gmail.com/
- Recent GSoC proposal thread on improving the new `git repo` command:
https://public-inbox.org/git/20260303140732.16886-1-pushkarkumarsingh1970@gmail.com/
- Another recent GSoC proposal thread on improving/extending `git repo`:
https://public-inbox.org/git/CA+rGoLd-1Mb5JG1H1PvE-kyjdznrLVFjwQiMLHtd2ETQ-igmXg@mail.gmail.com/
- Recent proposal thread focused on the same SoC idea:
https://public-inbox.org/git/CAO_P5U3g_+RpnDUmEv_qX-3GVhpxLV97eMxP1apERc0KU_95tQ@mail.gmail.com/
- Recent discussion around `git repo structure` enhancements:
https://public-inbox.org/git/CAO_P5U2f4MD-URre+4ocC=YQ570hr03pZHDk1jvuSOKx4aLOCA@mail.gmail.com/
- Microproject discussion thread:
https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
- Microproject patch thread:
https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
- Review on the microproject patch thread:
https://public-inbox.org/git/CAOLa=ZTpfHUySnMgCFMnvo2JcRSv8zqFP-cLFSs+Ab5Cy2zsvg@mail.gmail.com/
* Re: [GSoC] Proposal draft: Improve the new git repo command
From: Karthik Nayak @ 2026-03-16 20:59 UTC
To: Jialong Wang, git
Jialong Wang <jerrywang183@yahoo.com> writes:
> Hi,
>
> I plan to apply to Git for GSoC 2026, and I would like to share a draft
> proposal for feedback.
>
> The project I am currently most interested in is improving the new
> `git repo` command, with a primary focus on extending `git repo info`
> with path-related repository metadata.
>
> My draft is below. I would appreciate feedback on whether this scope
> looks reasonable, and which parts of the current `git repo` work would
> make the best starting point.
>
> Thanks,
> Jialong
>
> ---
>
> # Improve `git repo info` by adding repository path metadata
>
> ## Name
>
> Jialong Wang
>
> ## Email
>
> jerrywang183@yahoo.com
>
> ## Preferred project size
>
> 175 hours
>
> ## About me
>
> My name is Jialong Wang, and I plan to apply to Git for GSoC 2026.
>
> I have been getting familiar with Git’s development workflow by building
> Git from source, reading the contribution documents, and working on a
> microproject. As part of that process, I prepared and sent a patch to
> the Git mailing list.
>
> I am interested in the new `git repo` command because it is user-facing,
> but also closely tied to Git’s internal repository model. That makes it
> a good fit for the kind of work I want to do: understanding existing
> code, discussing design details on the mailing list, and implementing
> improvements in small, reviewable patches.
>
> Relevant links:
>
> - Microproject discussion thread:
> https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
> - Microproject patch thread:
> https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
> - SoC 2026 idea page:
> https://git.github.io/SoC-2026-Ideas/
>
Perhaps it would be nice to give a few lines about what the microproject
discussion thread and patch thread are about.
It also helps if you can state the current status. Other proposals may
serve as examples of how to do this.
> ## Project summary
>
> I would like to work on improving the new `git repo` command, with a
> primary focus on `git repo info`.
>
> The `git repo` command was introduced to provide a cleaner interface for
> querying repository metadata. However, several useful path-related
> values are still mainly accessed through `git rev-parse` and
> `git rev-parse --git-path`. My proposal is to extend `git repo info`
> so that it can expose a selected set of those values in a more
> structured form.
>
> The goal is not to replace `git rev-parse`, but to make `git repo info`
> more useful as a structured interface for repository path metadata.
>
> ## Motivation
>
> Today, scripts and tools still often rely on commands such as:
>
> git rev-parse --git-dir
> git rev-parse --show-toplevel
> git rev-parse --git-path <path>
>
> These commands are useful, but they were not primarily designed as a
> structured repository metadata interface.
>
> Since `git repo info` already exists for this purpose, extending it with
> path-related values would make repository layout information easier to
> query in a cleaner and more consistent way.
>
> I think this is a good GSoC project because it has clear user value, can
> be implemented incrementally, and naturally fits Git’s patch-and-review
> workflow.
>
> ## Current context
>
> I am aware that work on path-related `git repo info` fields has already
> started. There have already been patch series for path keys, category
> requests, and path formatting. Because of that, I do not want to assume
> that the work described on the ideas page is still untouched.
>
> One of my first goals during the bonding period would be to review the
> current state of these discussions carefully, identify what remains
> open, and refine the project scope based on maintainer feedback. I would
> rather build on the current direction than duplicate work that is
> already in progress.
>
It would also help to have some idea of what that direction may look
like, and to add that to the proposal.
> I also think this project should be scoped carefully. The ideas page
> mentions improvements to both `git repo info` and `git repo structure`,
> but for a GSoC project I believe it is more realistic to focus first on
> `git repo info` and only expand beyond that if the main work is in good
> shape.
>
That's a fair assessment. What we would like to see is how you plan to
structure the goals, and possibly future work, into the timeline. Reading
on.
> ## Proposed work
>
> The main objective of this project is to extend `git repo info` with
> selected repository path values that are currently obtained through
> `git rev-parse` and `git rev-parse --git-path`.
>
> The work will involve:
>
> 1. Studying the current implementation of `git repo info`.
> 2. Comparing its current output with commonly used `git rev-parse`
> path queries.
> 3. Identifying a first small set of missing path-related values to add.
> 4. Discussing output design on the mailing list, especially where there
> are open questions about relative versus absolute paths.
> 5. Implementing the agreed functionality through small patch series.
> 6. Adding tests covering the new behavior.
> 7. Updating documentation if needed.
Generally each commit should be self-contained, with its own tests and
documentation, so perhaps 5, 6, and 7 are a single point with subheadings?
>
> ## Initial scope
>
> The first stage of the project would focus on a small set of commonly
> used repository path values, for example:
>
> - `git-dir`
> - `common-dir`
> - `toplevel`
> - `superproject-working-tree`
>
> I think these are a good first target because they are already familiar
> to users through `git rev-parse`, and they provide immediate practical
> value without requiring a large interface expansion.
>
> Depending on project progress and mailing list feedback, I would then
> like to extend support to selected values currently accessed through
> `git rev-parse --git-path`, such as:
>
> - index file
> - objects directory
> - hooks directory
>
> I do not want to promise every possible path-related key up front. I
> would rather start with the most straightforward and useful values, get
> feedback early, and continue from there.
>
> ## My approach to scope and quality
>
> One thing I would like to be careful about is not treating this project
> as a simple checklist of fields to add.
>
> I think the quality of the project will depend on three things:
>
> 1. choosing a small set of fields that make sense together,
> 2. agreeing on a consistent path representation,
> 3. and making sure the result fits naturally into the existing `git repo`
> design rather than becoming a thin wrapper over `git rev-parse`.
>
> Because of that, I would prefer to make progress in a few coherent
> batches instead of adding many unrelated keys at once.
>
> I also think it is important to keep room for scope reduction. If some
> part of the design turns out to be more controversial than expected,
> I would prefer to complete a smaller, cleaner set of path fields rather
> than stretching the project too broadly.
>
> ## Technical approach
>
> The implementation of `git repo` is primarily in `builtin/repo.c`. The
> first step would be to understand how `git repo info` currently collects
> and prints repository metadata, and how that existing structure can be
> extended without making the interface inconsistent.
>
> Many relevant repository paths are already available internally through
> helpers such as:
>
> - `repo_get_git_dir()`
> - `repo_get_common_dir()`
> - `repo_get_work_tree()`
>
> Similarly, `git rev-parse --git-path` already relies on existing path
> resolution logic. So the work is not about inventing these values from
> scratch, but about exposing a selected subset of them through
> `git repo info` in a way that fits its current design.
>
> The first implementation step would be to map existing helpers and path
> resolution logic to a small set of `repo info` fields. After that, I
> would extend the output code in `builtin/repo.c` to report those fields
> in a consistent way.
>
> One of the main design questions is path formatting. The ideas page
> explicitly mentions the need to decide between relative and absolute
> paths. I do not want to assume the answer in advance. Instead, I would
> review the current discussion, compare the behavior of existing
> commands, and propose a small, consistent approach on the mailing list.
>
> I also expect that some preparatory cleanup or refactoring may be useful
> before adding new fields. If so, I would keep that work minimal and send
> it as small separate patches.
>
Something I would like to see is how we can leverage the existing tests
for `git-rev-parse(1)` and use them.
> ## Patch strategy
>
> I expect the implementation to be divided into small patches so that
> each change can be reviewed independently.
>
> A likely patch strategy would be:
>
> 1. small preparatory cleanup if needed
> 2. add support for a first path-related key or a very small set of keys
> 3. extend support with additional related keys
> 4. add or refine tests for the new behavior
> 5. update documentation if necessary
>
> If existing in-progress series already cover some of these parts, I
> would adjust the breakdown accordingly and focus on what remains useful
> and open.
>
> ## Tests
>
> Tests would be added to cover the new behavior in common repository
> setups.
>
> Depending on the exact scope agreed on, test cases may include:
>
> - ordinary repositories
> - linked worktrees
> - superproject/submodule cases
> - cases where path values differ from simple defaults
>
> I would keep the tests focused on observable behavior instead of
> overfitting them to a particular implementation detail.
>
> ## What I will not try to do
>
> To keep the project realistic, I do not plan to:
>
> - redesign all of `git repo`
> - fully replace `git rev-parse`
> - implement every possible repository path query
> - work on both `git repo info` and `git repo structure` at full scope in
> the same project
>
> The project should stay focused on a well-defined subset of path-related
> metadata for `git repo info`.
>
> ## Expected deliverables
>
> By the end of the project, I expect to deliver:
>
> - support for a useful set of path-related values in `git repo info`
> - tests covering the new functionality
> - documentation updates if needed
Wouldn't documentation definitely be needed? ;)
> - one or more patch series discussed and refined on the Git mailing list
>
> ## Timeline
>
> ### Community bonding period
>
> - Study `builtin/repo.c` and the current `git repo info` implementation
> - Review recent and ongoing mailing list discussions related to `git repo`
> - Compare current `git repo info` behavior with `git rev-parse`
> - Refine the exact scope with mentors and mailing list feedback
>
> ### Phase 1
>
> - Implement a first small batch of path-related values
> - Send the first patch series
> - Address review comments
> - Add tests for the first batch
>
> ### Phase 2
>
> - Implement additional agreed path values
> - Continue design discussion if needed
> - Refine implementation and tests based on review feedback
>
> ### Phase 3
>
> - Complete remaining agreed work
> - Update documentation if necessary
> - Rework earlier patches if needed for consistency
> - Prepare a final summary of the work
>
> ### Buffer time
>
> - Handle review delays
> - Fix regressions or edge cases
> - Narrow scope if some planned work turns out to be too large
>
We generally do timelines in terms of weeks of GSoC. So it would be nice
to see that mapping over the phases mentioned here.
> ## Risks and mitigation
>
> One risk is that design discussion may take longer than expected,
> especially around path representation and output structure.
>
> To reduce that risk, I would keep the patch series small and prioritize
> the least controversial values first.
>
> Another risk is overlap with work already in progress. If that happens,
> I would adjust the project scope to avoid duplication and focus on what
> is still useful and open.
>
> ## Why I think I am a good fit
>
> I have already started learning Git’s normal contribution workflow
> through a microproject, including building Git from source, running
> tests, preparing a patch, and sending it to the mailing list.
>
> This project fits the kind of work I want to do in Git: understanding
> existing code, discussing interface details on the mailing list, and
> implementing improvements incrementally in small patches.
>
> ## References
>
> - SoC 2026 idea page:
> https://git.github.io/SoC-2026-Ideas/
>
> - General application information:
> https://git.github.io/General-Application-Information/
>
> - `git repo` documentation:
> https://git-scm.com/docs/git-repo
>
> - `git rev-parse` documentation:
> https://git-scm.com/docs/git-rev-parse
>
> - `git-sizer` project:
> https://github.com/github/git-sizer
>
> - Recent patch series adding path-related support to `git repo`:
> https://public-inbox.org/git/20260228224252.72788-1-lucasseikioshiro@gmail.com/
>
> - More recent work-in-progress series for category/path keys and
> `--path-format`:
> https://public-inbox.org/git/pull.2208.v6.git.git.1772428548.gitgitgadget@gmail.com/
>
> - Recent GSoC proposal thread on improving the new `git repo` command:
> https://public-inbox.org/git/20260303140732.16886-1-pushkarkumarsingh1970@gmail.com/
>
> - Another recent GSoC proposal thread on improving/extending `git repo`:
> https://public-inbox.org/git/CA+rGoLd-1Mb5JG1H1PvE-kyjdznrLVFjwQiMLHtd2ETQ-igmXg@mail.gmail.com/
>
> - Recent proposal thread focused on the same SoC idea:
> https://public-inbox.org/git/CAO_P5U3g_+RpnDUmEv_qX-3GVhpxLV97eMxP1apERc0KU_95tQ@mail.gmail.com/
>
> - Recent discussion around `git repo structure` enhancements:
> https://public-inbox.org/git/CAO_P5U2f4MD-URre+4ocC=YQ570hr03pZHDk1jvuSOKx4aLOCA@mail.gmail.com/
>
> - Microproject discussion thread:
> https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
>
> - Microproject patch thread:
> https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
>
> - Review on the microproject patch thread:
> https://public-inbox.org/git/CAOLa=ZTpfHUySnMgCFMnvo2JcRSv8zqFP-cLFSs+Ab5Cy2zsvg@mail.gmail.com/
Regards,
Karthik
* Re: [GSoC] Proposal draft: Improve the new git repo command
From: Jialong Wang @ 2026-03-16 21:05 UTC
To: git; +Cc: karthik.188
Thanks for the feedback.
This is very helpful. I will revise the proposal to make the current
direction, the relationship to my microproject work, and the timeline
more concrete.
Thanks,
Jialong
* Re: [GSoC] Proposal draft: Improve the new git repo command
From: Jialong Wang @ 2026-03-17 0:28 UTC
To: git; +Cc: karthik.188, Jialong Wang
Hi Karthik,
Thanks for the detailed feedback. I revised the proposal draft to make the current status, intended scope, patch breakdown, and use of existing tests clearer. The updated draft is below.
# Improve `git repo info` by adding repository path metadata
## Name
Jialong Wang
## Email
jerrywang183@yahoo.com
## Preferred project size
175 hours
## About me
My name is Jialong Wang, and I plan to apply to Git for GSoC 2026.
I have been getting familiar with Git's development workflow by building
Git from source, reading the contribution documents, and working on a
microproject. My microproject focused on improving corrupt patch
location reporting in git apply and git am. That work has already gone
through mailing list review, including comments from Karthik Nayak and
Junio C Hamano, and it gave me direct experience with rerolling
patches, updating tests, and using CI to catch gaps that I had missed
locally.
My broader programming experience has mainly involved systems-oriented
software, where I have had to read existing code, trace behavior
through unfamiliar paths, and make targeted changes without disrupting
surrounding logic.
At this point, my recent Git contributions around this microproject are:
1. an initial microproject patch series to report the location of
corrupt patches more clearly
2. a follow-up patch to report input locations in header parsing errors
in apply.c
3. a follow-up patch to report input locations in binary and garbage
patch error paths in apply.c
This has also helped me get comfortable with the normal Git workflow of
starting with a small change, responding to review, and then continuing
with a logically related follow-up.
I am interested in the new git repo command because it is user-facing,
but also closely tied to Git's internal repository model. That makes it
a good fit for the kind of work I want to do: understanding existing
code, discussing design details on the mailing list, and implementing
improvements in small, reviewable patches.
Relevant links:
- Microproject discussion thread:
This thread asked whether improving corrupt patch location reporting was
a suitable microproject and helped me choose the work.
https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
- Microproject patch thread:
This thread contains the patch itself, review, and rerolls for the
corrupt patch location reporting work.
https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
- Follow-up patch:
This follow-up patch extends the same idea to header parsing errors in
`apply.c`.
https://public-inbox.org/git/20260316195847.92386-1-jerrywang183@yahoo.com/
- Second follow-up patch:
This follow-up patch extends the same idea to binary and garbage patch
error paths in `apply.c` and has also been sent to the mailing list. I
will add the public archive link once it is indexed.
Subject: [GSoC PATCH] apply: report input location in binary and garbage patch errors
- SoC 2026 idea page:
https://git.github.io/SoC-2026-Ideas/
## Project summary
I would like to work on improving the new git repo command, with a
primary focus on git repo info.
The git repo command was introduced to provide a cleaner interface for
querying repository metadata. However, several useful path-related
values are still mainly accessed through git rev-parse and
git rev-parse --git-path. My proposal is to extend git repo info so
that it can expose a selected set of those values in a more structured
form.
The goal is not to replace git rev-parse, but to make git repo info
more useful as a structured interface for repository path metadata.
Motivation
Today, scripts and tools still often rely on commands such as:
- git rev-parse --git-dir
- git rev-parse --show-toplevel
- git rev-parse --git-path <path>
These commands are useful, but they were not primarily designed as a
structured repository metadata interface.
Since git repo info already exists for this purpose, extending it with
path-related values would make repository layout information easier to
query in a cleaner and more consistent way.
I think this is a good GSoC project because it has clear user value, can
be implemented incrementally, and naturally fits Git's patch-and-review
workflow.
Current context
I am aware that work on path-related git repo info fields has already
started. There have already been patch series for path keys, category
requests, and path formatting. Because of that, I do not want to assume
that the work described on the ideas page is still untouched.
One of my first goals during the bonding period would be to review the
current state of these discussions carefully, identify what remains
open, and refine the project scope based on maintainer feedback. I would
rather build on the current direction than duplicate work that is
already in progress.
My recent apply.c follow-up patches are separate from git repo itself,
but they have already helped me get comfortable with Git's mailing list
process, with responding to maintainer comments, and with organizing
small changes into follow-up patches instead of overloading a single
series. I expect to approach git repo work in the same way.
At the moment, the direction that seems most realistic to me is to
start with a small set of layout-related fields that already have clear
equivalents in git rev-parse, such as git-dir, common-dir, toplevel,
and superproject-working-tree. I would prefer to begin there before
taking on broader questions such as category-wide output or possible
git repo structure extensions.
More concretely, if the current discussions do not point in a different
direction, I would expect my first implementation work to focus on a
small initial series that adds one or a few of these layout-related
fields to git repo info, together with tests and documentation for the
same behavior. I would treat that first series as the point where the
community can judge whether the field naming, path representation, and
overall shape of the interface look right before I continue to a second
batch.
In other words, my current preference is:
1. first settle a small batch of repo info path fields with clear
rev-parse equivalents
2. then extend to a second batch of agreed path-related values from
rev-parse --git-path
3. only after that consider whether category keys or other nearby repo
info improvements are worth taking on as stretch work
I also think this project should be scoped carefully. The ideas page
mentions improvements to both git repo info and git repo structure, but
for a GSoC project I believe it is more realistic to focus first on
git repo info and only expand beyond that if the main work is in good
shape.
Proposed work
The main objective of this project is to extend git repo info with
selected repository path values that are currently obtained through
git rev-parse and git rev-parse --git-path.
I expect the work to proceed in four connected parts:
1. Review the current implementation and ongoing mailing list
discussions, then narrow the initial scope to a first small batch of
path-related fields.
2. Discuss output design on the mailing list, especially where there are
open questions about relative versus absolute paths and how the new
fields should fit the existing interface.
3. Implement the agreed functionality through small patch series, with
each patch or small patch group carrying its own tests and any
documentation updates for the user-visible behavior.
4. If the first batch is in good shape, extend support to a second
agreed batch of path-related fields.
Initial scope
The first stage of the project would focus on a small set of commonly
used repository path values, for example:
- git-dir
- common-dir
- toplevel
- superproject-working-tree
I think these are a good first target because they are already familiar
to users through git rev-parse, and they provide immediate practical
value without requiring a large interface expansion.
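To make the correspondence concrete, here is a minimal sketch that maps the
candidate names listed above to the git rev-parse options that report the
same values today. The helper name layout_path is hypothetical and exists
only for illustration; the field names are used purely as labels and do not
imply any agreed key naming for git repo info.

```shell
#!/bin/sh
# Hypothetical helper: print the value a candidate field would report,
# using the rev-parse option that exposes it today.
layout_path () {
	case "$1" in
	git-dir)
		git rev-parse --git-dir ;;
	common-dir)
		git rev-parse --git-common-dir ;;
	toplevel)
		git rev-parse --show-toplevel ;;
	superproject-working-tree)
		git rev-parse --show-superproject-working-tree ;;
	*)
		echo "unknown field: $1" >&2
		return 1 ;;
	esac
}

# Try it in a throwaway repository.
tmp=$(mktemp -d) &&
git init -q "$tmp/repo" &&
cd "$tmp/repo" &&
for field in git-dir common-dir toplevel superproject-working-tree
do
	printf '%s=%s\n' "$field" "$(layout_path "$field")"
done
```

Outside a submodule, the superproject-working-tree line is expected to have
an empty value, which is one of the cases a new field would need to define
explicitly.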
If I had to choose an initial implementation order today, I would most
likely start with git-dir, common-dir, and toplevel first, because they
seem like the most direct and broadly useful candidates. I would then
look at superproject-working-tree and selected git-path style values
after the first review round, rather than trying to push all of them in
the same initial series.
Depending on project progress and mailing list feedback, I would then
like to extend support to selected values currently accessed through
git rev-parse --git-path, such as:
- index file
- objects directory
- hooks directory
I do not want to promise every possible path-related key up front. I
would rather start with the most straightforward and useful values, get
feedback early, and continue from there.
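For reference, the three values named above can be queried today roughly as
follows. This is only a sketch run in a throwaway repository, and the
variable names are illustrative.

```shell
#!/bin/sh
# Sketch: query the paths that rev-parse --git-path resolves today.
tmp=$(mktemp -d) &&
git init -q "$tmp/repo" &&
cd "$tmp/repo" &&

# --git-path takes relocation settings such as GIT_INDEX_FILE,
# GIT_OBJECT_DIRECTORY and core.hooksPath into account, so the
# answers are not always located under .git/.
index_path=$(git rev-parse --git-path index) &&
objects_path=$(git rev-parse --git-path objects) &&
hooks_path=$(git rev-parse --git-path hooks) &&

printf 'index=%s\nobjects=%s\nhooks=%s\n' \
	"$index_path" "$objects_path" "$hooks_path"
```

Because these values can be relocated, a structured field for them would
need to report the resolved location rather than assume a fixed layout.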
If the core path-related work is in good shape, possible later work
could include small extensions around category keys or closely related
repo info behavior. I do not want to commit to that up front, but I do
want the timeline to make room for it as stretch work rather than as a
core deliverable.
My approach to scope and quality
One thing I would like to be careful about is not treating this project
as a simple checklist of fields to add.
I think the quality of the project will depend on three things:
1. choosing a small set of fields that make sense together
2. agreeing on a consistent path representation
3. making sure the result fits naturally into the existing git repo
design rather than becoming a thin wrapper over git rev-parse
Because of that, I would prefer to make progress in a few coherent
batches instead of adding many unrelated keys at once.
I also think it is important to keep room for scope reduction. If some
part of the design turns out to be more controversial than expected,
I would prefer to complete a smaller, cleaner set of path fields rather
than stretching the project too broadly.
Technical approach
The implementation of git repo is primarily in builtin/repo.c. The
first step would be to understand how git repo info currently collects
and prints repository metadata, and how that existing structure can be
extended without making the interface inconsistent.
Many relevant repository paths are already available internally through
helpers such as:
- repo_get_git_dir()
- repo_get_common_dir()
- repo_get_work_tree()
Similarly, git rev-parse --git-path already relies on existing path
resolution logic. So the work is not about inventing these values from
scratch, but about exposing a selected subset of them through
git repo info in a way that fits its current design.
The first implementation step would be to map existing helpers and path
resolution logic to a small set of repo info fields. After that, I
would extend the output code in builtin/repo.c to report those fields
in a consistent way.
In the current implementation, git repo info is handled by
cmd_repo_info() in builtin/repo.c. The currently supported keys are
defined in repo_info_field[], and the command prints values through
print_fields() and print_all_fields().
A likely first implementation step would be to add new entries to
repo_info_field[] for the first batch of path-related keys, backed by
new getter functions that fit alongside existing ones such as
get_layout_bare(), get_layout_shallow(), get_object_format(), and
get_references_format(). The main user-facing path through the command
would still remain cmd_repo_info() together with the existing
print_fields() and print_all_fields() flow, so my goal would be to
extend that structure rather than introduce a separate special case for
path values.
For the initial batch, my expectation is that the implementation will
mostly look like:
1. identify which existing repository or path helper corresponds to the
field to be exposed
2. add a getter that matches the shape expected by repo_info_field[]
3. register the new field in repo_info_field[]
4. update the output and tests to cover the new field
5. update the documentation for the new field and its path format
In practical terms, I expect the first series to stay close to the
existing structure in builtin/repo.c rather than try to redesign it. If
the early fields are backed cleanly by helpers such as
repo_get_git_dir(), repo_get_common_dir(), or repo_get_work_tree(), I
would prefer to start there and let review on those patches shape the
approach for later fields.
For path-related values, the main work would not be inventing new data,
but deciding which existing repository and path helpers should back
each key and how those paths should be formatted consistently in
git repo info.
One of the main design questions is path formatting. The ideas page
explicitly mentions the need to decide between relative and absolute
paths. I do not want to assume the answer in advance. Instead, I would
review the current discussion, compare the behavior of existing
commands, and propose a small, consistent approach on the mailing list.
I also expect that some preparatory cleanup or refactoring may be useful
before adding new fields. If so, I would keep that work minimal and send
it as small separate patches.
Patch strategy
I expect the implementation to be divided into small patches so that
each change can be reviewed independently.
A likely patch strategy would be:
1. small preparatory cleanup if needed
2. add a first small batch of layout-related fields, together with the
tests and documentation updates needed for those fields
3. extend support with additional agreed path-related fields, again with
matching tests and documentation updates
For the first batch, I currently expect roughly 4 to 6 patches. The
exact number depends on how much preparatory cleanup is useful and on
whether tests and documentation read more clearly when combined with the
field patches or when sent as separate patches in the same series. I do
not want to promise an exact count in advance, but I do want the first
series to stay small enough that each patch still has a clear purpose.
I do not want to treat tests and documentation as a final cleanup
stage. Since these are user-visible additions to git repo info, I think
the tests and documentation should evolve with each field batch, so
that the mailing list can review the interface and its description at
the same time as the implementation.
If existing in-progress series already cover some of these parts, I
would adjust the breakdown accordingly and focus on what remains useful
and open.
Tests
Tests would be added alongside the new behavior rather than at the very
end.
Depending on the exact scope agreed on, test cases may include:
- ordinary repositories
- linked worktrees
- superproject and submodule cases
- cases where path values differ from simple defaults
Where possible, I would compare the new git repo info output against
existing git rev-parse behavior, since many of the proposed fields are
already exposed there. I would also look for opportunities to reuse or
mirror the same repository layouts and edge cases that are already
important to rev-parse, instead of inventing unrelated test-only cases.
For fields whose behavior is intentionally meant to correspond closely to
git rev-parse, I would like the tests to make that relationship clear.
For example, if a new git repo info field is meant to expose the same
information as a particular rev-parse query, I would try to test both in
the same repository setup so that any differences are explicit and
intentional rather than accidental.
I would keep the tests focused on observable behavior instead of
overfitting them to a particular implementation detail.
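As an illustration of such a shared setup, the sketch below builds a linked
worktree, a layout in which the git directory and the common directory
differ, and records what git rev-parse reports for both. It is a standalone
script rather than a real test; in Git's test suite the same steps would
live inside test_expect_success and compare files with test_cmp. The
variable names are illustrative.

```shell
#!/bin/sh
# Sketch: a linked worktree is a layout where git-dir and common-dir differ.
tmp=$(mktemp -d) &&
git init -q "$tmp/main" &&
git -C "$tmp/main" -c user.name=Example -c user.email=example@example.com \
	commit -q --allow-empty -m initial &&
git -C "$tmp/main" worktree add -q "$tmp/linked" &&

wt_git_dir=$(git -C "$tmp/linked" rev-parse --path-format=absolute --git-dir) &&
wt_common_dir=$(git -C "$tmp/linked" rev-parse --path-format=absolute --git-common-dir) &&

# A test for a new field would compare its output against these values.
printf 'git-dir=%s\ncommon-dir=%s\n' "$wt_git_dir" "$wt_common_dir"
```

Capturing the rev-parse answers in the same repository setup keeps any
difference between the two commands visible in the test itself.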
What I will not try to do
To keep the project realistic, I do not plan to:
- redesign all of git repo
- fully replace git rev-parse
- implement every possible repository path query
- work on both git repo info and git repo structure at full scope in the
same project
The project should stay focused on a well-defined subset of path-related
metadata for git repo info.
Expected deliverables
By the end of the project, I expect to deliver:
- support for a useful set of path-related values in git repo info
- tests covering the new functionality for those fields
- documentation updates for the new fields and their behavior
- one or more patch series discussed and refined on the Git mailing list
Success criteria
I would consider the project successful if, by the end of the GSoC
period, the following are true:
1. a first useful batch of path-related git repo info fields has been
implemented and is in good shape on the mailing list, ideally merged
or close to merge-ready
2. the new fields are covered by tests that clearly exercise the agreed
repository layouts and path behavior
3. the documentation for those fields has been updated together with the
implementation
4. if review and scope permit, at least one further agreed batch of
fields has also been implemented or is well advanced
Timeline
Community bonding period
- Study builtin/repo.c and the current git repo info implementation
- Review recent and ongoing mailing list discussions related to git repo
- Compare current git repo info behavior with git rev-parse
- Refine the exact scope with mentors and mailing list feedback
- Identify the first small batch of fields that looks realistic for an
initial patch series
- Review whether any in-progress series already cover part of that first
batch, so that I can avoid duplicating ongoing work
Week 1
- Confirm the exact first batch of fields to target
- Prepare and send an initial patch series for that batch
- Include tests and documentation updates in that first series instead of
leaving them to the end
Week 2
- Address the first round of review comments on the initial series
- Revise the first batch if there is feedback about key naming, output
shape, or relative versus absolute paths
- Decide whether the current direction is stable enough to continue with a
second batch or whether the first batch needs another reroll first
Weeks 3 to 4
- Continue addressing review comments on the initial series and reroll
as needed
- Refine the first batch further if field naming or path formatting is
still under discussion
- Settle the first round of tests and documentation
- If the first batch is accepted or close to settled, identify the exact
second batch to work on next
Weeks 5 to 6
- Implement a second agreed batch of path-related fields, likely selected
git rev-parse --git-path equivalents
- Send the next patch series with tests and documentation updates for
that batch
- Keep the second batch narrower than the first draft of the overall idea
if review shows that path formatting or naming still needs discussion
Weeks 7 to 8
- Address review comments on the second batch
- Refine edge cases involving worktrees, submodules, or path formatting
- Make sure the user-visible behavior is documented clearly for whatever
subset of fields is actually agreed on
Weeks 9 to 10
- Finish remaining agreed work
- Use the remaining time as buffer for rerolls, regressions, or scope
reduction if needed
- If the core path-related work is in good shape, investigate a small
stretch item closely related to git repo info rather than branching out
into a separate large feature
Weeks 11 to 12
- Handle any remaining review rounds on the core patch series
- Polish tests and documentation for the agreed set of fields
- Use the remaining time for final cleanup, rerolls, and project summary
rather than for opening a new large piece of work
Core goals for the project
1. land or get close to landing a first useful batch of path-related
git repo info fields
2. implement at least one further agreed batch if the first one is in
good shape
3. keep tests and documentation updated as part of the patch series
Stretch work
1. add a small extra batch beyond the core set if review cycles go well
2. investigate a nearby repo info improvement such as a small category
or interface refinement, but only if the main path-related work is
already in good shape
Risks and mitigation
One risk is that design discussion may take longer than expected,
especially around path representation and output structure.
To reduce that risk, I would keep the patch series small and prioritize
the least controversial values first.
Another risk is overlap with work already in progress. If that happens,
I would adjust the project scope to avoid duplication and focus on what
is still useful and open.
Why I think I am a good fit
I have already started learning Git's normal contribution workflow
through a microproject, including building Git from source, running
tests, preparing patches, responding to review, and rerolling them on
the mailing list.
This project is a good fit for me because it requires exactly the kind
of work I have already started doing in Git: reading existing code
paths carefully, making small user-visible improvements, and refining
the result through review rather than trying to force a large one-shot
design.
References
Official and technical references
- SoC 2026 idea page:
https://git.github.io/SoC-2026-Ideas/
- General application information:
https://git.github.io/General-Application-Information/
- git repo documentation:
https://git-scm.com/docs/git-repo
- git rev-parse documentation:
https://git-scm.com/docs/git-rev-parse
- git-sizer project:
https://github.com/github/git-sizer
Relevant mailing list threads
- Recent patch series adding path-related support to git repo:
https://public-inbox.org/git/20260228224252.72788-1-lucasseikioshiro@gmail.com/
- More recent work-in-progress series for category and path keys and
path-format:
https://public-inbox.org/git/pull.2208.v6.git.git.1772428548.gitgitgadget@gmail.com/
- Recent GSoC proposal thread on improving the new git repo command:
https://public-inbox.org/git/20260303140732.16886-1-pushkarkumarsingh1970@gmail.com/
- Another recent GSoC proposal thread on improving and extending git repo:
https://public-inbox.org/git/CA+rGoLd-1Mb5JG1H1PvE-kyjdznrLVFjwQiMLHtd2ETQ-igmXg@mail.gmail.com/
- Recent proposal thread focused on the same SoC idea:
https://public-inbox.org/git/CAO_P5U3g_+RpnDUmEv_qX-3GVhpxLV97eMxP1apERc0KU_95tQ@mail.gmail.com/
- Recent discussion around git repo structure enhancements:
https://public-inbox.org/git/CAO_P5U2f4MD-URre+4ocC=YQ570hr03pZHDk1jvuSOKx4aLOCA@mail.gmail.com/
- Microproject discussion thread:
https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
- Microproject patch thread:
https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
- Review on the microproject patch thread:
https://public-inbox.org/git/CAOLa=ZTpfHUySnMgCFMnvo2JcRSv8zqFP-cLFSs+Ab5Cy2zsvg@mail.gmail.com/
- Follow-up patch for header parsing errors:
https://public-inbox.org/git/20260316195847.92386-1-jerrywang183@yahoo.com/
- Follow-up patch for binary and garbage patch errors:
Sent to the mailing list; public archive link to be added once indexed.
Subject: [GSoC PATCH] apply: report input location in binary and garbage patch errors
Thanks,
Jialong
* Re: [GSoC] Proposal draft: Improve the new git repo command
[not found] <20260318125303.88730-1-jerrywang183.ref@yahoo.com>
@ 2026-03-18 12:53 ` Jialong Wang
0 siblings, 0 replies; 6+ messages in thread
From: Jialong Wang @ 2026-03-18 12:53 UTC (permalink / raw)
To: karthik.188; +Cc: Jialong Wang, git
Hi Karthik,
I wanted to send a brief follow-up on my proposal draft for the
"Improve the new git repo command" project.
Since sending the revised draft, I have continued working on small Git
patches to get more comfortable with the codebase and with the mailing
list workflow. In particular, I sent/rerolled:
- an apply.c series on input-location reporting, starting from
"apply: report the location of corrupt patches" and then extending it
to header parsing errors and binary/garbage patch errors, which I
rerolled as a single three-patch v4 series after review feedback
- "t2203: avoid suppressing git status exit code"
- "object-name: turn INTERPRET_BRANCH_* constants into enum values"
Working through these patches helped sharpen how I think about the repo
project's scope.
My current understanding is that the project should probably not try to
turn "git repo info" into a large catch-all for every possible
repository path/value right away. Instead, the core scope should be to
define and land a coherent initial set of path-oriented values that are
already grounded in existing repository-setup and rev-parse-style
plumbing, and to do that as a sequence of small self-contained patches,
each with its own tests and documentation updates.
For testing, I now expect the main command-level coverage to live in
t/t1900-repo-info.sh, while reusing patterns from existing rev-parse and
repository setup tests where the semantics overlap.
This also changed how I think about the timeline: I would prefer to keep
the initial milestones focused on a small useful subset with a clear
interface, and only treat broader expansion and cleanups as follow-up
work once the main direction is in good shape.
If you have time, I would really appreciate any feedback on whether this
updated framing is closer to the direction you had in mind.
Thanks,
Jialong
* [GSoC proposal v3][RFC] Improve the new git repo command
2026-03-16 20:59 ` Karthik Nayak
2026-03-17 0:28 ` Jialong Wang
@ 2026-03-18 20:08 ` Jialong Wang
1 sibling, 0 replies; 6+ messages in thread
From: Jialong Wang @ 2026-03-18 20:08 UTC (permalink / raw)
To: git; +Cc: karthik.188, jltobler, ayu.chandekar, siddharthasthana31,
jerrywang183
Hi all,
This is v3 of my proposal draft for the "Improve the new git repo
command" project. I am including the full draft inline below for
convenience.
In this revision, I tried to make the scope more realistic and better
aligned with the current public discussion around `git repo info`. In
particular, I:
- revised the proposal so it does not assume that path-related `git repo
info` work is starting from scratch
- reframed the project around integration, testing, repository-aware
cleanup, and any still-open metadata gaps
- added my more recent Git contributions
- added a short "immediate next steps" section describing the kind of
`git repo` patch I want to work on next before the coding period
I would appreciate any feedback from mentors and reviewers on whether
this revised framing is closer to the right direction.
Thanks for any feedback,
Jialong
Improve the git repo command
Name
Jialong Wang
Email
jerrywang183@yahoo.com
Preferred project size
175 hours
About me
My name is Jialong Wang, and I plan to apply to Git for GSoC 2026.
I have been getting familiar with Git's development workflow by building
Git from source, reading the contribution documents, and working on
patches through the mailing list. My initial microproject focused on
improving corrupt patch location reporting in `git apply` and `git am`.
That work went through mailing-list review, including comments from
Karthik Nayak and Junio C Hamano, and gave me direct experience with
rerolling patches, updating tests, and using CI to catch gaps I had
missed locally.
Since then, I have continued contributing small Git patches and
follow-up work instead of stopping after the microproject. My recent
contributions include:
1. an initial patch series to report the location of corrupt patches
more clearly
2. a follow-up patch to report input locations in header parsing errors
in `apply.c`
3. a follow-up patch to report input locations in binary and garbage
patch error paths in `apply.c`
4. `t2203: avoid suppressing git status exit code`
5. `object-name: turn INTERPRET_BRANCH_* constants into enum values`
This has helped me get comfortable with Git's normal workflow of
starting with a small change, responding to review, rerolling
appropriately, and then continuing with logically related follow-up
work.
Project summary
I would like to work on improving the new `git repo` command, with a
primary focus on `git repo info`.
The `git repo` command was introduced to provide a cleaner interface for
querying repository metadata. Path-related values are a natural part of
that goal, but the public discussion this year has already shown that
this topic is not starting from zero: there is ongoing work around
path-related fields, category-aware key naming, and path-format
behavior.
Because of that, I do not want to frame this proposal as "I will newly
add repository path metadata" in isolation. Instead, my proposal is to
improve `git repo info` by building on the direction already taking
shape upstream, focusing on integration, testing, repository-aware
cleanup, and any remaining path-related or adjacent metadata work that
is still useful and unimplemented by the time GSoC begins.
The goal is not to replace `git rev-parse`, but to make `git repo info`
a more coherent and better-tested structured interface for repository
metadata.
Motivation
Today, scripts and tools still often rely on commands such as:
- `git rev-parse --git-dir`
- `git rev-parse --show-toplevel`
- `git rev-parse --git-path <path>`
These commands are useful, but they were not primarily designed as a
structured repository metadata interface.
Since `git repo info` already exists for this purpose, extending and
refining it would make repository layout information easier to query in
a cleaner and more consistent way. However, given the current public
work in progress, I think the most useful contribution is not to
duplicate existing series, but to help move this area toward a better
integrated and upstream-ready state.
Current context
I am aware that work on path-related `git repo info` fields has already
started. There have already been patch series and proposal discussions
for path keys, category requests, path formatting, and nearby
`git repo structure` ideas.
Because of that, one of my first goals during the bonding period would
be to review the current state of those discussions carefully, identify
what remains open, and refine the exact project scope based on mentor
feedback. I would rather build on the current direction than duplicate
work that is already in progress.
At this point, the direction that seems most realistic to me is:
1. first align with the upstream direction that is already emerging for
`git repo info`
2. improve the command's internal consistency and test coverage
3. implement remaining path-related or adjacent metadata work only where
it is still clearly useful and not already being covered elsewhere
Immediate next steps
Before the coding period, I want to keep contributing in this area
through small reviewable patches instead of waiting until GSoC starts.
My immediate plan is:
1. review the latest upstream state of the path-related and
category-related `git repo info` work
2. identify one small `git repo` patch that does not duplicate an
in-flight series
3. start either with a repository-aware cleanup in `builtin/repo.c` or
with stronger tests in `t/t1900-repo-info.sh`, depending on which
direction is still open and useful
4. use that first patch series to validate the project direction with
the mailing list before committing to a larger implementation batch
Proposed work
The main objective of this project is to improve `git repo info` as a
structured repository metadata interface while avoiding duplication of
public in-flight work.
I expect the work to proceed in four connected parts:
1. review the current implementation and ongoing mailing-list
discussions, then narrow the initial scope to a first small batch of
cleanup, tests, or still-open metadata work
2. discuss design details on the mailing list, especially where there
are open questions about path naming, path formatting, or the
relationship with existing `git rev-parse` behavior
3. implement the agreed functionality through small patch series, with
each patch or small patch group carrying its own tests and
documentation updates for the user-visible behavior
4. if the first batch is in good shape, extend support to a second
agreed batch of improvements, whether that means remaining
path-related fields, repository-aware cleanup, or nearby metadata
work that still appears useful
Initial scope
At the beginning of the project, I would prefer to keep the first
practical batch conservative.
Rather than assuming that the first implementation work should directly
add a large number of path keys, I would prefer to start from one of
these two realistic entry points, depending on the state of upstream
work:
1. a small batch of still-unimplemented layout-related fields with clear
`rev-parse` equivalents, if those remain open
2. repository-aware cleanups and stronger tests around `git repo info`,
if the path-field direction is already substantially covered by
existing series
If path-related values are still a good first target by the beginning of
the coding period, the most likely initial candidates would be a small
set of high-value layout paths such as:
- `git-dir`
- `common-dir`
- `toplevel`
- `superproject-working-tree`
If those are already substantially addressed, I would instead prioritize
cleanups and tests that help the command mature, for example:
- reducing unnecessary reliance on global repository state inside
`builtin/repo.c`
- strengthening coverage in `t/t1900-repo-info.sh`
- covering edge cases such as linked worktrees and `--separate-git-dir`
Technical approach
The implementation of `git repo` is primarily in `builtin/repo.c`. The
first step would be to understand how `git repo info` currently collects
and prints repository metadata, and how that existing structure can be
extended or cleaned up without making the interface inconsistent.
Many relevant repository values are already available internally through
helpers such as:
- `repo_get_git_dir()`
- `repo_get_common_dir()`
- `repo_get_work_tree()`
Similarly, `git rev-parse` and `git rev-parse --git-path` already rely
on existing path resolution logic. So the work is not about inventing
these values from scratch, but about exposing or integrating a selected
subset of them through `git repo info` in a way that fits its current
design.
Patch strategy
I expect the implementation to be divided into small patches so that
each change can be reviewed independently.
A likely patch strategy would be:
1. a small preparatory cleanup if needed
2. a first small batch of `git repo` improvements, together with the
tests and documentation updates needed for those changes
3. a second batch that extends the same direction once the first one is
reviewed
I do not want to treat tests and documentation as a final cleanup
stage. Since these are user-visible changes to `git repo info`, I think
they should evolve with each patch batch so that the mailing list can
review the interface and its description at the same time as the
implementation.
Tests
Tests would be added alongside the new behavior rather than at the very
end.
Depending on the exact scope agreed on, test cases may include:
- ordinary repositories
- linked worktrees
- superproject and submodule cases
- repositories created with `--separate-git-dir`
- cases where path values differ from simple defaults
Where possible, I would compare new `git repo info` behavior against
existing `git rev-parse` behavior when the semantics are intentionally
close. I would also look for opportunities to reuse or mirror repository
layouts and edge cases that are already important elsewhere.
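As a sketch of one such layout, the script below creates a repository with
`--separate-git-dir` and records what `git rev-parse` reports for it. It is
a standalone illustration, not a test from `t/t1900-repo-info.sh`, and the
directory and variable names are made up for the example.

```shell
#!/bin/sh
# Sketch: with --separate-git-dir the working tree holds a ".git" file
# that points at the real repository directory.
tmp=$(mktemp -d) &&
git init -q --separate-git-dir "$tmp/store.git" "$tmp/work" &&

gitfile=$(cat "$tmp/work/.git") &&
sep_git_dir=$(git -C "$tmp/work" rev-parse --path-format=absolute --git-dir) &&
sep_toplevel=$(git -C "$tmp/work" rev-parse --show-toplevel) &&

printf 'gitfile:  %s\ngit-dir:  %s\ntoplevel: %s\n' \
	"$gitfile" "$sep_git_dir" "$sep_toplevel"
```

Here the git directory is not a child of the working tree, so a test that
only checks for `.git` under the top level would miss the case.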
What I will not try to do
To keep the project realistic, I do not plan to:
- redesign all of `git repo`
- fully replace `git rev-parse`
- reimplement path-related work that is already being actively reviewed
- work on both `git repo info` and `git repo structure` at full scope in
the same project
Expected deliverables
By the end of the project, I expect to deliver:
- a useful set of `git repo info` improvements that are still clearly
open and relevant upstream
- tests covering the new functionality and relevant repository layouts
- documentation updates for the new fields or behavior
- one or more patch series discussed and refined on the Git mailing list
Success criteria
I would consider the project successful if, by the end of the GSoC
period, the following are true:
1. a first useful batch of `git repo` improvements has been implemented
and is in good shape on the mailing list, ideally merged or close to
merge-ready
2. the new or refined behavior is covered by tests that clearly
exercise the agreed repository layouts and semantics
3. the documentation has been updated together with the implementation
4. if review and scope permit, at least one further agreed batch of
improvements has also been implemented or is well advanced
Timeline
Community bonding period
- study `builtin/repo.c` and the current `git repo info` implementation
- review recent and ongoing mailing-list discussions related to
`git repo`
- compare current `git repo info` behavior with related
`git rev-parse` behavior
- refine the exact scope with mentors and mailing-list feedback
- identify the first small batch of work that looks realistic for an
initial patch series
Weeks 1-3
- confirm the exact first batch of work to target
- prepare and send an initial patch series for that batch
- include tests and documentation updates in that first series
- address review comments and reroll as needed
Weeks 4-6
- continue tightening the semantics of the new fields and extending test
coverage
- add tests for edge cases such as linked worktrees and
`--separate-git-dir`
- resolve small interface inconsistencies discovered during the early
cleanup work
Weeks 7-9
- finish or polish any remaining work on path keys, category keys, or
`--path-format` that still needs implementation or integration
- coordinate patch scope with the latest upstream discussion
- update documentation to match settled behavior
Weeks 10-12
- implement one or more remaining metadata or interface improvements
that are still clearly useful and unclaimed
- focus on review-driven cleanup, additional tests, and documentation
polish
- prepare final report and project summary
Risks and mitigation
The main risk is overlap with parallel upstream work. I plan to mitigate
that by treating the project as integration-oriented from the beginning,
keeping patch series small, and adjusting scope based on the latest
public discussion and mentor guidance.
A second risk is that some of the path-related work may be largely
settled before the coding period starts. If that happens, I would shift
effort toward repository-aware cleanups, stronger test coverage,
documentation alignment, and other still-open `git repo` improvements
rather than forcing redundant feature work.
Why I think I am a good fit
I have already invested time in learning Git's contribution process
through actual submissions rather than only private experimentation.
That includes building the project, reading tests, sending patches,
rerolling in response to feedback, and adjusting patch structure when
maintainers asked for it.
I believe that experience is directly relevant here. The main challenge
of this project is not only writing code, but also moving an evolving
command forward in an upstream-friendly way without duplicating parallel
work. My recent contributions have helped me understand that process
much better, and I believe they put me in a stronger position to carry
out this project successfully.
Relevant links
SoC 2026 idea page
https://git.github.io/SoC-2026-Ideas/
General application information
https://git.github.io/General-Application-Information/
git repo documentation
https://git-scm.com/docs/git-repo
git rev-parse documentation
https://git-scm.com/docs/git-rev-parse
Recent patch series adding path-related support to git repo
https://public-inbox.org/git/20260228224252.72788-1-lucasseikioshiro@gmail.com/
Recent work-in-progress series for category and path keys and
`--path-format`
https://public-inbox.org/git/pull.2208.v6.git.git.1772428548.gitgitgadget@gmail.com/
Recent GSoC proposal thread on improving the new git repo command
https://public-inbox.org/git/20260303140732.16886-1-pushkarkumarsingh1970@gmail.com/
Another recent GSoC proposal thread on improving and extending git repo
https://public-inbox.org/git/CA+rGoLd-1Mb5JG1H1PvE-kyjdznrLVFjwQiMLHtd2ETQ-igmXg@mail.gmail.com/
Recent proposal thread focused on the same SoC idea
https://public-inbox.org/git/CAO_P5U3g_+RpnDUmEv_qX-3GVhpxLV97eMxP1apERc0KU_95tQ@mail.gmail.com/
Microproject discussion thread
https://public-inbox.org/git/CAKWWG_nGhD6vqhAS1mkEwBQPrg_YX0+C3-xW=Q3ifFDw4dDviw@mail.gmail.com/
Microproject patch thread
https://public-inbox.org/git/20260315231538.68586-1-jerrywang183@yahoo.com/
Follow-up patch for header parsing errors
https://public-inbox.org/git/20260316195847.92386-1-jerrywang183@yahoo.com/