* Running out of inodes on an NFS which stores repos
From: Kousik Sanagavarapu @ 2025-09-06 14:16 UTC
To: git; +Cc: Kousik Sanagavarapu

Hello everyone,

At my $(DAYJOB), we have an NFS share which stores different git repos.
Due to how git stores objects, we have started to run out of inodes on
the NFS as the number of repos coming into it increased.

These git repos come from another service and there are typically
thousands of them each day. It is important to note that we only store
the .git dir and expose a URL which is configured as the default remote
to read from and write to the repo.

All of these are small repos; usually not many files and not many
commits either - I'd say ~5 commits on average.

Historically, when we ran out of inodes, we implemented a few
strategies: we would repack the objects, or archive the older repos,
move them to another store, and later bring them back onto this NFS
and unarchive them.

However, none of these totally mitigated the issue, and we still run
into it as the traffic increases. As a last resort, we increased the
disk size even though there was a ton of free space left - just to
increase the number of inodes.

We can't delete any of these repos, no matter how old, because they are
valuable data.

I was wondering if there is some other strategy that we could implement
here, as this seems like a problem that people might often run into. It
would really help to hear your thoughts, or if you could point me
somewhere else.

Thanks

* RE: Running out of inodes on an NFS which stores repos
From: rsbecker @ 2025-09-06 15:28 UTC
To: 'Kousik Sanagavarapu', git

On September 6, 2025 10:16 AM, Kousik Sanagavarapu wrote:
>Hello everyone,
>
>At my $(DAYJOB), we have an NFS share which stores different git repos.
>Due to how git stores objects, we have started to run out of inodes on
>the NFS as the number of repos coming into it increased.
>
>These git repos come from another service and there are typically
>thousands of them each day. It is important to note that we only store
>the .git dir and expose a URL which is configured as the default remote
>to read from and write to the repo.
>
>All of these are small repos; usually not many files and not many
>commits either - I'd say ~5 commits on average.
>
>Historically, when we ran out of inodes, we implemented a few
>strategies: we would repack the objects, or archive the older repos,
>move them to another store, and later bring them back onto this NFS
>and unarchive them.
>
>However, none of these totally mitigated the issue, and we still run
>into it as the traffic increases. As a last resort, we increased the
>disk size even though there was a ton of free space left - just to
>increase the number of inodes.
>
>We can't delete any of these repos, no matter how old, because they are
>valuable data.
>
>I was wondering if there is some other strategy that we could implement
>here, as this seems like a problem that people might often run into. It
>would really help to hear your thoughts, or if you could point me
>somewhere else.

I would suggest running git gc --aggressive on your repos. This might
help compress your pack files. I have seen customers with thousands of
pack files who have never run a garbage collection.

Another thing you might want to try is to use sparse-checkout to only
keep the directories you absolutely need, if that is an option. Also,
check your /tmp and lost+found directories.

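As an illustration only, such a sweep could look roughly like this - it
assumes the repos are bare *.git directories under /srv/repos, which is
a guess about the layout, and --aggressive can be dropped if it proves
too slow across thousands of repos:

    # Repack every bare repo under the share. --aggressive spends more
    # CPU for tighter packs; plain "git gc" is the cheaper variant.
    find /srv/repos -maxdepth 2 -type d -name '*.git' |
    while read -r repo; do
        git -C "$repo" gc --aggressive --quiet
    done
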
* Re: Running out of inodes on an NFS which stores repos
From: brian m. carlson @ 2025-09-06 15:34 UTC
To: Kousik Sanagavarapu; +Cc: git

On 2025-09-06 at 14:16:12, Kousik Sanagavarapu wrote:
> Hello everyone,

Hi,

> These git repos come from another service and there are typically
> thousands of them each day. It is important to note that we only store
> the .git dir and expose a URL which is configured as the default remote
> to read from and write to the repo.
>
> All of these are small repos; usually not many files and not many
> commits either - I'd say ~5 commits on average.
>
> Historically, when we ran out of inodes, we implemented a few
> strategies: we would repack the objects, or archive the older repos,
> move them to another store, and later bring them back onto this NFS
> and unarchive them.
>
> However, none of these totally mitigated the issue, and we still run
> into it as the traffic increases. As a last resort, we increased the
> disk size even though there was a ton of free space left - just to
> increase the number of inodes.
>
> We can't delete any of these repos, no matter how old, because they are
> valuable data.
>
> I was wondering if there is some other strategy that we could implement
> here, as this seems like a problem that people might often run into. It
> would really help to hear your thoughts, or if you could point me
> somewhere else.

There are a couple of things that come to mind here. You can try to set
`fetch.unpackLimit` to 1, which will cause all of the objects pushed
into the repository to end up in a pack. That means you'll usually have
only two files, the pack and its index, rather than the loose objects.

If you have a large number of references, you may wish to convert the
repositories to use the reftable backend instead of the files backend
(via `git refs migrate --ref-format=reftable`), which will also tend to
use fewer files on disk. Note that this requires a relatively new Git,
so if you need to access these repositories with an older Git version,
don't do this.

You can also repack more frequently if you set `gc.autoPackLimit` to a
smaller number (in conjunction with `fetch.unpackLimit` above). If you
have repositories that are not packed at all, running `git gc` (or, if
you don't want to remove any objects, `git repack -d --cruft`) will
likely reduce the number of loose objects and result in more objects
being packed.

Finally, it may be useful to you to reformat the underlying file system
in a way that has more inodes. I know ext4 supports a larger inode
ratio for file systems with many small files. Alternatively, apparently
btrfs does not have a fixed inode ratio, so that may be helpful to
avoid running out of inodes. I can't speak to non-Linux file systems,
though.
--
brian m. carlson (they/them)
Toronto, Ontario, CA

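Taken together, and purely as a sketch (the $repo placeholder and the
ordering of the steps are assumptions, not something from the thread),
the per-repository setup described above could look roughly like this:

    # Store pushed objects as packs rather than loose objects.
    git -C "$repo" config fetch.unpackLimit 1

    # Consolidate packs sooner than the default of 50 (value illustrative).
    git -C "$repo" config gc.autoPackLimit 10

    # Optional: fewer per-ref files; needs a recent Git everywhere the
    # repository is accessed.
    git -C "$repo" refs migrate --ref-format=reftable

    # One-off cleanup for repos that were never packed; --cruft keeps
    # unreachable objects in a cruft pack instead of deleting them.
    git -C "$repo" repack -d --cruft
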
* Re: Running out of inodes on an NFS which stores repos
From: Kousik Sanagavarapu @ 2025-09-08 7:05 UTC
To: brian m. carlson, Kousik Sanagavarapu, git

On Sat Sep 6, 2025 at 9:04 PM IST, brian m. carlson wrote:
> On 2025-09-06 at 14:16:12, Kousik Sanagavarapu wrote:
>> Hello everyone,
>
> Hi,
>
>> These git repos come from another service and there are typically
>> thousands of them each day. It is important to note that we only store
>> the .git dir and expose a URL which is configured as the default remote
>> to read from and write to the repo.
>>
>> All of these are small repos; usually not many files and not many
>> commits either - I'd say ~5 commits on average.
>>
>> Historically, when we ran out of inodes, we implemented a few
>> strategies: we would repack the objects, or archive the older repos,
>> move them to another store, and later bring them back onto this NFS
>> and unarchive them.
>>
>> However, none of these totally mitigated the issue, and we still run
>> into it as the traffic increases. As a last resort, we increased the
>> disk size even though there was a ton of free space left - just to
>> increase the number of inodes.
>>
>> We can't delete any of these repos, no matter how old, because they are
>> valuable data.
>>
>> I was wondering if there is some other strategy that we could implement
>> here, as this seems like a problem that people might often run into. It
>> would really help to hear your thoughts, or if you could point me
>> somewhere else.
>
> There are a couple of things that come to mind here. You can try to set
> `fetch.unpackLimit` to 1, which will cause all of the objects pushed
> into the repository to end up in a pack. That means you'll usually have
> only two files, the pack and its index, rather than the loose objects.

Thanks for this, I have tried this out and, while going through the
surrounding documentation, found `transfer.unpackLimit`. This was
exactly what I was looking for.

> If you have a large number of references, you may wish to convert the
> repositories to use the reftable backend instead of the files backend
> (via `git refs migrate --ref-format=reftable`), which will also tend to
> use fewer files on disk. Note that this requires a relatively new Git,
> so if you need to access these repositories with an older Git version,
> don't do this.
>
> You can also repack more frequently if you set `gc.autoPackLimit` to a
> smaller number (in conjunction with `fetch.unpackLimit` above). If you
> have repositories that are not packed at all, running `git gc` (or, if
> you don't want to remove any objects, `git repack -d --cruft`) will
> likely reduce the number of loose objects and result in more objects
> being packed.

Yes, I have now set the following config around gc:

    [receive]
        autogc = true
    [gc]
        auto = 1
        autopacklimit = 1

Curious to know if this will have any noticeable performance impact
though. As I mentioned in my previous msg, these are small repos, but
the number of repos being created and the operations performed on them
are large - mostly pushes.

> Finally, it may be useful to you to reformat the underlying file system
> in a way that has more inodes. I know ext4 supports a larger inode
> ratio for file systems with many small files. Alternatively, apparently
> btrfs does not have a fixed inode ratio, so that may be helpful to
> avoid running out of inodes. I can't speak to non-Linux file systems,
> though.

Unfortunately, I can't reformat the NFS. It is currently on ext4, and
even though there are quite a few filesystems which don't impose a
threshold on inodes, I can't migrate to them.

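A quick way to check whether such settings are actually keeping the
per-repository file count down is to watch the object counts; this is
just an illustrative check, not something from the thread:

    # "count" is the number of loose objects and "packs" the number of
    # packfiles; both should stay small if auto-gc is doing its job.
    git -C "$repo" count-objects -v
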
* Re: Running out of inodes on an NFS which stores repos
From: brian m. carlson @ 2025-09-09 0:29 UTC
To: Kousik Sanagavarapu; +Cc: git

On 2025-09-08 at 07:05:08, Kousik Sanagavarapu wrote:
> Yes, I have now set the following config around gc:
>
>     [receive]
>         autogc = true
>     [gc]
>         auto = 1
>         autopacklimit = 1
>
> Curious to know if this will have any noticeable performance impact
> though. As I mentioned in my previous msg, these are small repos, but
> the number of repos being created and the operations performed on them
> are large - mostly pushes.

The `transfer.unpackLimit` setting will not have any performance
impact; it is used by at least some major forges. Packed objects can
use things like bitmaps and other functionality, which forges like for
performance.

The gc settings you have will cause everything to be repacked after
every push, and repacking data can be quite expensive. At work, we
repack after about every 40 pushes or so. You may wish to use a
different value.
--
brian m. carlson (they/them)
Toronto, Ontario, CA

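A gentler configuration along those lines might look roughly like the
following - the numbers are illustrative, not a recommendation, and the
mapping from "about every 40 pushes" onto gc.autoPackLimit assumes
transfer.unpackLimit = 1, so that each push lands as a single pack:

    [transfer]
        # every push is stored as one pack instead of loose objects
        unpackLimit = 1
    [receive]
        autogc = true
    [gc]
        # consolidate once roughly this many packs have accumulated,
        # rather than after every push
        autoPackLimit = 40
        # gc.auto can stay at its default, since pushes no longer
        # create loose objects
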
* Re: Running out of inodes on an NFS which stores repos
From: Kousik Sanagavarapu @ 2025-09-09 7:39 UTC
To: brian m. carlson, Kousik Sanagavarapu, git

On Tue Sep 9, 2025 at 5:59 AM IST, brian m. carlson wrote:
> On 2025-09-08 at 07:05:08, Kousik Sanagavarapu wrote:
>> Yes, I have now set the following config around gc:
>>
>>     [receive]
>>         autogc = true
>>     [gc]
>>         auto = 1
>>         autopacklimit = 1
>>
>> Curious to know if this will have any noticeable performance impact
>> though. As I mentioned in my previous msg, these are small repos, but
>> the number of repos being created and the operations performed on them
>> are large - mostly pushes.
>
> The `transfer.unpackLimit` setting will not have any performance
> impact; it is used by at least some major forges. Packed objects can
> use things like bitmaps and other functionality, which forges like for
> performance.

Oh, got it.

> The gc settings you have will cause everything to be repacked after
> every push, and repacking data can be quite expensive. At work, we
> repack after about every 40 pushes or so. You may wish to use a
> different value.

Got it, thanks for the info. I will try a higher value and see how it
goes.