* [GSOC] Discuss and Introduction: Improve disk space recovery for partial clones
From: Amisha Chhajed @ 2026-02-25 6:17 UTC (permalink / raw)
To: git, karthik nayak, jltobler@gmail.com, Siddharth Asthana,
Ayush Chandekar, christian.couder
Hello everyone!
I am Amisha. I have made some contributions to Git, highlighted below:
https://lore.kernel.org/git/20260121130005.72375-1-amishhhaaaa@gmail.com/
sparse-checkout: optimize string_list construction and add tests to
verify deduplication.
Reduce the cost of building a sorted 'string_list' from O(n^2) to
O(n log n) by constructing it unsorted, then sorting it and
removing duplicates.
https://lore.kernel.org/git/20260129121220.69267-1-amishhhaaaa@gmail.com/
u-string-list: add unit tests for string-list methods
string-list: add string_list_sort_u() that mimics "sort -u"
[WIP] https://lore.kernel.org/git/20260221162359.43336-2-amishhhaaaa@gmail.com/
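For readers unfamiliar with the first patch, the sort-then-deduplicate idea it describes can be sketched as follows. This is a minimal illustration in Python, chosen only for brevity, and is not Git's C 'string_list' code; the function names build_sorted_incremental() and build_sorted_batch() are hypothetical and exist only for this example.

```python
import bisect


def build_sorted_incremental(items):
    """Keep the list sorted after every insert.

    Each insert may shift all later elements, so building a list of
    n items costs O(n^2) in the worst case.
    """
    result = []
    for item in items:
        pos = bisect.bisect_left(result, item)
        # Skip the item if an equal one already sits at the insertion point.
        if pos < len(result) and result[pos] == item:
            continue
        result.insert(pos, item)  # shifts every later element
    return result


def build_sorted_batch(items):
    """Collect everything, sort once, then drop adjacent duplicates.

    The single sort dominates the cost, giving O(n log n) overall.
    """
    deduped = []
    for item in sorted(items):
        # After sorting, duplicates are always neighbours.
        if not deduped or deduped[-1] != item:
            deduped.append(item)
    return deduped
```

Both functions return the same sorted, duplicate-free list; for example, build_sorted_batch(["b", "a", "b", "c", "a"]) returns ["a", "b", "c"]. Only the amount of work done along the way differs.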
My time while contributing to this project has been very rewarding and amazing!
I intend to apply for the project 'Improve disk space recovery for
partial clones'.
I became familiar with sparse-checkout and the surrounding code while
working on my first patch. I believe that in cone mode we can easily
free up the space a partial clone uses for files outside of the cone
whenever the user runs a cleanup command. However, figuring out what to
free in non-cone mode is a fairly new topic for me, and I would love to
discuss it. I believe a lot of inspiration about what we can clean can
be derived from git gc and git maintenance.
I would love to hear opinions and ideas on this!
--
Thanks,
Amisha
* Re: [GSOC] Discuss and Introduction: Improve disk space recovery for partial clones
From: Derrick Stolee @ 2026-02-25 11:49 UTC (permalink / raw)
To: Amisha Chhajed, git, karthik nayak, jltobler@gmail.com,
Siddharth Asthana, Ayush Chandekar, christian.couder
On 2/25/26 1:17 AM, Amisha Chhajed wrote:
> I intend to apply for the project 'Improve disk space recovery for
> partial clones'.
I think this is a noble goal. Removing blobs that you don't expect to
need again would be valuable.
> I became familiar with sparse-checkout and the surrounding code while
> working on my first patch. I believe that in cone mode we can easily
> free up the space a partial clone uses for files outside of the cone
> whenever the user runs a cleanup command. However, figuring out what to
> free in non-cone mode is a fairly new topic for me, and I would love to
> discuss it. I believe a lot of inspiration about what we can clean can
> be derived from git gc and git maintenance.
I think you will have a larger impact if you focus on _old_ blobs that
were maybe necessary for a previous checkout of an old commit but the
paths have been updated in more recent checkouts so those blobs are
unlikely to be needed again other than for history queries.
You should keep in mind that some tools automatically populate stale
data (such as VS Code running 'git blame' in the background of every
open file) and so you want to consider how any decision you make here
may lead to _increased_ resource usage by redownloading data you
removed.
These are just things to think about. It's an interesting space to
help users save disk.
Thanks,
-Stolee
* Re: [GSOC] Discuss and Introduction: Improve disk space recovery for partial clones
From: Amisha Chhajed @ 2026-03-01 15:34 UTC (permalink / raw)
To: Derrick Stolee
Cc: git, karthik nayak, jltobler@gmail.com, Siddharth Asthana,
Ayush Chandekar, christian.couder
> I think this is a noble goal. Removing blobs that you don't expect to
> need again would be valuable.
Thank you!
> I think you will have a larger impact if you focus on _old_ blobs that
> were maybe necessary for a previous checkout of an old commit but the
> paths have been updated in more recent checkouts so those blobs are
> unlikely to be needed again other than for history queries.
Thank you for the insight!
> You should keep in mind that some tools automatically populate stale
> data (such as VS Code running 'git blame' in the background of every
> open file) and so you want to consider how any decision you make here
> may lead to _increased_ resource usage by redownloading data you
> removed.
>
> These are just things to think about. It's an interesting space to
> help users save disk.
I thought about making the command user-driven: when the user wants to
free up space, they can run something like
'git evict --older-than=30.days' or 'git evict --outside-cone', or
similar commands. That way they remove exactly what they intend to
remove, whether commits or blobs.
I would love to hear opinions on whether the command would work better
like this or as an automatic background task, like git maintenance.
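To make the two proposed policies concrete, here is a rough sketch of how candidates might be selected. Everything in it is an assumption made for illustration: 'git evict' and its options are only a proposal from this thread, the BlobCandidate record and its last_used timestamp are hypothetical (how such a timestamp would be obtained is an open question), and in_cone() only approximates cone-mode matching. Python is used for brevity; it is not how Git would implement this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BlobCandidate:
    oid: str             # object ID (placeholder values in this sketch)
    path: str            # path at which the blob was last checked out
    last_used: datetime  # hypothetical: when the blob was last needed


def in_cone(path, cone_dirs):
    """Approximate cone-mode matching for a file path.

    A file counts as inside the cone if it lives under one of the cone
    directories, or directly inside an ancestor of one (including the
    repository root).
    """
    parent = path.rsplit("/", 1)[0] if "/" in path else ""
    if parent == "":
        return True  # files at the root are always kept
    for d in cone_dirs:
        d = d.strip("/")
        if path.startswith(d + "/"):
            return True  # recursively inside a cone directory
        if d.startswith(parent + "/"):
            return True  # directly inside an ancestor of a cone directory
    return False


def select_for_eviction(candidates, now, older_than=None, cone_dirs=None):
    """Return candidates matching either requested policy.

    older_than mimics the proposed --older-than option (a timedelta);
    passing cone_dirs mimics the proposed --outside-cone option.
    """
    selected = []
    for c in candidates:
        too_old = older_than is not None and now - c.last_used > older_than
        outside = cone_dirs is not None and not in_cone(c.path, cone_dirs)
        if too_old or outside:
            selected.append(c)
    return selected
```

A call such as select_for_eviction(blobs, datetime.now(), older_than=timedelta(days=30)) would model the age-based variant, while passing cone_dirs=["src"] would model the cone-based one. Keeping selection separate from deletion would also make it easy to offer a dry-run mode, which may help with the redownload concern raised earlier in the thread.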
Thank you again, I really appreciate it.
--
Thanks,
Amisha