From: SoutrikDas <valusoutrik@gmail.com>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, karthik.188@gmail.com,
jltobler@gmail.com, ayu.chandekar@gmail.com,
siddharthasthana31@gmail.com, chandrapratap3519@gmail.com
Subject: [GSOC Proposal] Complete and extend the remote-object-info command for git cat-file
Date: Fri, 6 Mar 2026 02:18:09 +0530
Message-ID: <20260305204809.54927-1-valusoutrik@gmail.com>
Hi!
This is my project proposal for GSoC 2026.
I am interested in the project idea "Complete and extend the
remote-object-info command for git cat-file".
# Complete and extend the remote-object-info command for git cat-file
## Contact
- Name: Soutrik Das
- E-mail: valusoutrik@gmail.com
- Github: https://github.com/SoutrikDas
- LinkedIn: https://www.linkedin.com/in/soutrik-das/
## About Me
My name is Soutrik Das. I am a developer with a bachelor's degree in CS
from the Indian Institute of Technology, Dhanbad. I am currently pursuing
a master's degree in AI at the Indian Institute of Technology, Bhubaneswar.
I don't have much experience contributing to a project as
large as Git, but I would love to learn everything I can
from this experience. I have experience in C/C++ from my
B.Tech coursework and from participating in Codeforces contests.
## Pre GSOC
I started exploring Git's codebase around February 2026. My first patch
was a doc fix, followed by a microproject modernizing tests:
- [PATCH] doc: fix repo_config documentation reference [1]
status: merged to master
Merge Commit: 94336d77bcbf4360b67a9454d8bf2e84b3d88ae7
Description: Replace the path for the repo_config() documentation
from 'Documentation/technical/api-config.h' to 'config.h'.
- [GSOC PATCH] t7003: modernize path existence checks using test helpers [2]
status: merged to master
Merge Commit: 11294bb0fa540d214d071b32cf74b1ed37b3bbbd
Description: Replace direct uses of 'test -f' and 'test -d' with
Git's helper functions 'test_path_is_file', 'test_path_is_missing'
and 'test_path_is_dir'.
I have read through most of Eric Ju's [4] work and some of Calvin Wan's [5]
work. I am still finding more to understand in each thread, but
I feel I have grasped the basics.
My work in this project would focus on implementing the changes
suggested at the end of Eric Ju's [Patch v11].
I wouldn't say I understand every bit of the discussion in that thread,
but in general my understanding is:
Calvin Wan and Eric Ju have already implemented a client-side command
called get_remote_info, but it is designed to be used in batches, so that
data for many objects can be fetched without a separate network round
trip for each object.
I have applied Eric Ju's patch series on top of an old master commit
(2d2a71ce85), since I could not find a base commit for the series. The
patches applied properly, and I also experimented and added a very rough
but working "%(objecttype)" implementation, i.e. it now prints like this:
29658341f39210201ff7f72a4be83937cf2288c5 14 blob
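As a rough illustration of how a format such as "%(objectname) %(objectsize) %(objecttype)" could produce the line above, here is a minimal Python sketch. It is not Git's actual implementation (which is written in C); the function name `expand_format` and the dictionary layout are assumptions made up for this example.

```python
import re


def expand_format(fmt, info):
    """Replace every %(atom) placeholder in fmt with the value info[atom].

    Hypothetical helper for illustration only; an unknown atom raises KeyError.
    """
    return re.sub(r"%\(([^)]+)\)", lambda m: str(info[m.group(1)]), fmt)


# Values taken from the example output line in this proposal.
info = {
    "objectname": "29658341f39210201ff7f72a4be83937cf2288c5",
    "objectsize": 14,
    "objecttype": "blob",
}

line = expand_format("%(objectname) %(objectsize) %(objecttype)", info)
print(line)  # 29658341f39210201ff7f72a4be83937cf2288c5 14 blob
```

The point of the sketch is only that each new atom (such as objecttype) needs a value supplied by the remote before the placeholder can be expanded.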
## Project : Complete and extend the remote-object-info command for git cat-file
Currently, in a partial clone, the user cannot retrieve information about
an object, such as its size, without fetching the object beforehand. To
solve this problem, Calvin Wan and Eric Ju designed a patch series that
uses the capabilities of protocol v2 servers.
This was done in the form of "remote-object-info".
However, only %(objectsize) was implemented, and the series was not merged.
This project has three goals:
1: To rebase and finalize Calvin Wan and Eric Ju's work by addressing
the feedback on Eric Ju's Patch v11
2: To add support for objecttype in remote-object-info
3: To discuss other information types like objectsize:disk and deltabase.
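To make the batching idea concrete, here is a minimal Python sketch of what a single protocol v2 object-info request for several objects might look like on the wire, using pkt-line framing (a four-digit hex length that counts itself, followed by the payload). The helper names are invented for this example, and a real request may carry additional capability lines before the delimiter packet. Passing "type" as a requested attribute is hypothetical: it is what goal 2 would add, while "size" is what the existing series handles.

```python
DELIM_PKT = b"0001"  # separates the command/capabilities from its arguments
FLUSH_PKT = b"0000"  # terminates the request


def pkt_line(payload: str) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    data = payload.encode("ascii")
    return b"%04x" % (len(data) + 4) + data


def object_info_request(oids, attrs=("size",)):
    """Build one object-info request that covers every oid in a single trip."""
    out = pkt_line("command=object-info\n")
    out += DELIM_PKT
    for attr in attrs:
        out += pkt_line(attr + "\n")
    for oid in oids:
        out += pkt_line("oid " + oid + "\n")
    return out + FLUSH_PKT


print(pkt_line("size\n"))  # b'0009size\n'

# Two objects, one request (the same oid is reused here purely as sample data).
oid = "29658341f39210201ff7f72a4be83937cf2288c5"
print(object_info_request([oid, oid]))
```

Because every "oid" line travels in the same request, the number of round trips stays at one no matter how many objects are queried.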
Project Duration: approximately 12 weeks
## Timeline
Mar 6-31 : Refine Proposal
If possible, I would like to submit small patches, but first I will
have to rebase Eric Ju's patches, and I am not sure I can do this
before GSoC starts.
If not, I plan to contribute to Git in other areas.
May 1-24 : Community Bonding
1-7 : Understand relevant underlying/ helper functions
8-24 : Ask about any design related problems/decisions
May 25 - Jun 14 : Start a Patch Series to rebase Calvin Wan and Eric Ju's work
and keep refining
Jun 15 - Aug 15 : Start and keep refining Patch Series to add support for
object type information
Aug 16 - Aug 24 : Discuss and Implement other object information if possible
Concurrently I shall make a report for all the work done.
## Availability
My current semester ends in the first week of April, so I will be
able to contribute 7-8 hours per day, totalling around 35-40 hours a week
on the project.
Total weeks = 12, total hours = 35 * 12 = 420 (using the lower estimate)
This leaves plenty of room to accommodate any unforeseen circumstances
that may arise during the project.
## RFC
I have a few ideas but do not know if they are worth pursuing, so I will
leave them here in the first draft:
- Addition of remote-object-info outside of batch mode:
Yes, it is best used in batch mode, but if a user wants
only one object's size or type, should they be able to just run
`git cat-file -r origin <oid>`
and get the size and type, or something similar? I am not sure whether
the syntax I have sketched conforms to Git's design.
- Addition of commands for common user behaviour:
I don't know if it is going to be common user behaviour, but what about
`git cat-file -r --all-absent`
or, inside "git cat-file --batch-command=<format>", a command such as
"remote-object-info --all-absent --type=tree <remote>",
which would basically fill in remote-object-info with all the blobs
that are currently absent from the worktree?
That way the user would not need to list them manually, if this is a
common enough use case.
- Sort according to size:
Maybe a user would want to check which is the largest file they don't
have yet.
- Get total missing blob size :
Use case would be when someone wants to know how much exactly there
is to download, before starting the download.
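The last two ideas (sorting by size and totalling the missing blob size) could also be prototyped outside Git by post-processing output lines of the form "<oid> <size> <type>". The sketch below assumes that output format; the helper names and the second and third sample lines are made up for illustration.

```python
from typing import NamedTuple


class ObjectInfo(NamedTuple):
    oid: str
    size: int
    type: str


def parse_info_lines(text):
    """Parse '<oid> <size> <type>' lines (assumed format) into ObjectInfo."""
    infos = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        oid, size, objtype = line.split()
        infos.append(ObjectInfo(oid, int(size), objtype))
    return infos


def largest_first(infos):
    """Return the objects sorted by size, biggest first."""
    return sorted(infos, key=lambda i: i.size, reverse=True)


def total_size(infos, objtype="blob"):
    """Sum the sizes of all objects of the given type."""
    return sum(i.size for i in infos if i.type == objtype)


# First line is from this proposal; the other two are invented sample data.
sample = """\
29658341f39210201ff7f72a4be83937cf2288c5 14 blob
1111111111111111111111111111111111111111 2048 blob
2222222222222222222222222222222222222222 300 tree
"""

infos = parse_info_lines(sample)
print(largest_first(infos)[0].oid)  # 1111111111111111111111111111111111111111
print(total_size(infos))            # 2062
```

Whether this belongs inside cat-file or in a small wrapper script is exactly the kind of design question the RFC items above are meant to raise.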
Thank you for your time in reviewing my proposal and for considering
my application. I am excited to learn everything I can from Git.
Thanks and Regards,
Soutrik
[1] : pull.2187.git.git.1770293021383.gitgitgadget@gmail.com
[2] : 20260209172445.39536-1-valusoutrik@gmail.com
[3] : 20260225190306.39358-1-valusoutrik@gmail.com
[4] : 20240628190503.67389-1-eric.peijian@gmail.com
[5] : 20220728230210.2952731-1-calvinwan@google.com