git.vger.kernel.org archive mirror
* Issue : Writing commits into the git repository takes longer than expected
@ 2024-06-10 10:17 Arpit Gupta
  2024-06-10 12:58 ` rsbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Arpit Gupta @ 2024-06-10 10:17 UTC (permalink / raw)
  To: git@vger.kernel.org

Hi,

We maintain different versions of data in a Git repository using the JGit Maven library. Each commit on the repository carries properties such as author name, date and time, action, and file path.
The file path refers to the XML file that records the action performed; the file is stored inside the repository.

We have a job running every 5 minutes that commits this information to the repository, overwriting the XML file content each time. Usually, committing and writing the XML file takes around 4-5 seconds, but sometimes both take longer, which also increases the overall CPU utilization of the machine. The behavior is inconsistent and occurs randomly, but when it happens, CPU utilization eventually becomes so high that all other running processes hang and the server has to be restarted.

Could you please suggest which areas we should look at to identify the cause of this issue? Also, can frequent commits to the repository trigger it?
In your view, what might be the trigger, and how can we proceed to resolve it?

Thanks & Regards,
Arpit Gupta


* RE: Issue : Writing commits into the git repository takes longer than expected
  2024-06-10 10:17 Arpit Gupta
@ 2024-06-10 12:58 ` rsbecker
  2024-06-12 11:45   ` Arpit Gupta
  0 siblings, 1 reply; 6+ messages in thread
From: rsbecker @ 2024-06-10 12:58 UTC (permalink / raw)
  To: 'Arpit Gupta', git

On Monday, June 10, 2024 6:17 AM, Arpit Gupta wrote:
>We are maintaining the different versions of data in git repository using jgit maven
>library. So, a commit is done on the repository containing properties such as author
>name, date and time, action, and the file path.
>The file path refers the xml file which contains the action performed and is stored
>inside the repository.
>
>We have a job running every 5 minutes that commits the information onto the
>repository and the XML file content is over-written every time. Usually, the commits
>and writing of XML file takes around 4-5 seconds but sometimes the time while
>committing as well as writing the data increases which also increase the overall CPU
>utilization of the machine. This behavior is inconsistent with respect to the process
>and occurs randomly but during this behavior, there is a time when the CPU
>utilization becomes high that all other running processes hangs up which demands
>the restart of the server.
>
>Could you please suggest which areas should we look for while identifying the cause
>of this issue? Also, does frequent commit of the content onto repository can trigger
>this issue?
>In your view, what might be the trigger of this issue and how we can proceed to
>resolve it?

Are your XML files single-line files, or is each tag on its own line? Changes
to single-line XML files can cause complete rewrites. If the file is large
enough, this can cause performance issues.

Do you have virus scans running on your repository? These can also cause
issues. Some scanners are more friendly to developers than others. Also, is
this an NFS drive? Is Git LFS involved?

If you have two commits to the same repo happening at once, this can also
cause one commit to be delayed waiting on the lock file. More info is needed
to comment further.
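
One way to rule out overlapping runs on the Java side is a guard that skips a scheduled run while the previous one is still active. The sketch below is illustrative only: the class name NonOverlappingJob and the method runIfIdle are hypothetical, the Runnable stands in for whatever code performs the commit, and it assumes all runs happen inside the same JVM.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical guard: lets at most one thread at a time run the wrapped job. */
public final class NonOverlappingJob {
    private final ReentrantLock lock = new ReentrantLock();
    private final Runnable job; // placeholder for the code that writes the XML and commits

    public NonOverlappingJob(Runnable job) {
        this.job = job;
    }

    /**
     * Runs the job unless another thread is still running it.
     * Returns true if the job ran, false if this call was skipped.
     */
    public boolean runIfIdle() {
        if (!lock.tryLock()) {
            return false; // previous run still in progress on another thread
        }
        try {
            job.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

Logging every skipped run would show whether slow commits are already piling up behind one another before the CPU spike.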

--Randall



* RE: Issue : Writing commits into the git repository takes longer than expected
  2024-06-10 12:58 ` rsbecker
@ 2024-06-12 11:45   ` Arpit Gupta
  2024-06-24  5:46     ` Arpit Gupta
  0 siblings, 1 reply; 6+ messages in thread
From: Arpit Gupta @ 2024-06-12 11:45 UTC (permalink / raw)
  To: rsbecker@nexbridge.com, git@vger.kernel.org
  Cc: Anuradha Patial, Madhurima Pandey

The XML files being written are multi-line. Each file contains 2 tags, each on its own line (one tag is the child of the other). The files are not large, hardly 2-3 KB. Below is the sample structure of the XML file being committed:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ServiceName type=ServiceType>
<property>PropertyValue</property>
</ServiceName>

No virus scans run on the repository, and Git LFS is not involved.
It is possible that, once the commit time starts increasing (from 4-5s to 30s to 1min to 6-7min), another commit call starts in the meantime, because a scheduler triggers this action every 5 minutes. But that should only cause some delay; it should not be what increases the CPU utilization.
Also, the machine has 32GB of memory.

The commit time starts increasing from 4-5s and goes up to 6-7min. What could cause the commit time to grow from 4-5s to 1min and beyond, given that no parallel commits can be running on the repository before that point? Also, as I mentioned before, this issue is totally inconsistent.
Let me know if any other information is required.

Thanks & Regards,
Arpit


* RE: Issue : Writing commits into the git repository takes longer than expected
  2024-06-12 11:45   ` Arpit Gupta
@ 2024-06-24  5:46     ` Arpit Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Arpit Gupta @ 2024-06-24  5:46 UTC (permalink / raw)
  To: rsbecker@nexbridge.com, git@vger.kernel.org

Hi there! 

Is there any update on this?

Thanks


* Issue : Writing commits into the git repository takes longer than expected
@ 2024-07-12  4:10 Arpit Gupta
  2024-07-12 11:35 ` rsbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Arpit Gupta @ 2024-07-12  4:10 UTC (permalink / raw)
  To: git@vger.kernel.org

Hi,

We maintain different versions of data in a Git repository using the JGit Maven library. Each commit on the repository carries properties such as author name, date and time, action, and file path.
The file path refers to the XML file that records the action performed; the file is stored inside the repository.

We have a job running every 5 minutes that commits this information to the repository, overwriting the XML file content each time. Usually, committing and writing the XML file takes around 4-5 seconds, but sometimes both take longer, which also increases the overall CPU utilization of the machine. The behavior is inconsistent and occurs randomly, but when it happens, CPU utilization eventually becomes so high that all other running processes hang and the server has to be restarted.

Furthermore, the XML files being written are multi-line. Each file contains 2 tags, each on its own line (one tag is the child of the other). The files are not large, hardly 2-3 KB. Below is the sample structure of the XML file being committed:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ServiceName type=ServiceType>
<property>PropertyValue</property>
</ServiceName>

No virus scans run on the repository, and Git LFS is not involved.
It is possible that, once the commit time starts increasing (from 4-5s to 30s to 1min to 6-7min), another commit call starts in the meantime, because a scheduler triggers this action every 5 minutes. But that should only cause some delay; it should not be what increases the CPU utilization.
Also, the machine has 32GB of memory and the disk device is /dev/nvme2n1.

The commit time starts increasing from 4-5s and goes up to 6-7min. What could cause the commit time to grow from 4-5s to 1min and beyond, given that no parallel commits can be running on the repository before that point? Also, as I mentioned before, this issue is totally inconsistent.
Let me know if any other information is required.

Could you please suggest which areas we should look at to identify the cause of this issue? Also, can frequent commits to the repository trigger it?
In your view, what might be the trigger, and how can we proceed to resolve it?

Thanks & Regards,
Arpit Gupta


* RE: Issue : Writing commits into the git repository takes longer than expected
  2024-07-12  4:10 Issue : Writing commits into the git repository takes longer than expected Arpit Gupta
@ 2024-07-12 11:35 ` rsbecker
  0 siblings, 0 replies; 6+ messages in thread
From: rsbecker @ 2024-07-12 11:35 UTC (permalink / raw)
  To: 'Arpit Gupta', git

On Friday, July 12, 2024 12:11 AM, Arpit Gupta wrote:
>We are maintaining the different versions of data in git repository using jgit maven
>library. So, a commit is done on the repository containing properties such as author
>name, date and time, action, and the file path.
>The file path refers the xml file which contains the action performed and is stored
>inside the repository.

Can you confirm you are using JGit rather than Git core commands? If this is
a JGit situation, you may be hitting Java garbage collection or there may be
a git garbage collection happening. If this is JGit, please raise this with
the JGit team rather than here.
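
To check the Java garbage collection theory, one option is to record how much JVM GC time accrues while a commit runs. The following is only a sketch under stated assumptions: GcTimeProbe and its methods are hypothetical names, the Runnable is a placeholder for the existing commit code, and the figures come from the standard java.lang.management beans, whose cumulative times are approximate and cover the whole JVM rather than one thread.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: compares wall-clock time of a task with JVM GC time accrued meanwhile. */
public final class GcTimeProbe {
    private GcTimeProbe() {}

    /** Approximate total milliseconds spent in GC so far; collectors reporting -1 are skipped. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the task and returns { elapsedMillis, gcMillisDuringTask }. */
    public static long[] measure(Runnable task) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new long[] { elapsedMillis, totalGcMillis() - gcBefore };
    }
}
```

If the GC figure stays near zero while the elapsed time climbs into minutes, JVM garbage collection is an unlikely cause and repository-side work becomes the more plausible suspect.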

>We have a job running every 5 minutes that commits the information onto the
>repository and the XML file content is over-written every time. Usually, the commits
>and writing of XML file takes around 4-5 seconds but sometimes the time while
>committing as well as writing the data increases which also increase the overall CPU
>utilization of the machine. This behavior is inconsistent with respect to the process
>and occurs randomly but during this behavior, there is a time when the CPU
>utilization becomes high that all other running processes hangs up which demands
>the restart of the server.
>
>Furthermore,
>>> The XML files that are being written as content are multi-line. There are 2 tags
>present in the file and each tag are on their own line (one tag being the child of the
>other). The file size isn't large. It is hardly 2-3kb. Below is the sample structure of
>the XML file being added as a part of content:
>
><?xml version="1.0" encoding="UTF-8" standalone="yes"?> <ServiceName
>type=ServiceType> <property>PropertyValue</property>
></ServiceName>
>
>>> There are no virus scans running in the repository. Also, the git LFS isn't involved
>in this scenario.
>>> There might be a case as when the commit time starts increasing (initially from 4-5s
>to 30s to 1min to 6-7min) and during that time another commit call also starts as
>there is a scheduler of 5 minutes which triggers this action. But this will only cause a
>certain amount of delay and it shouldn't be the factor to increase the CPU
>Utilization.

There is a lock file (.git/index.lock) that acts as a semaphore to prevent
two processes from writing to the repo at once. If your scheduler starts a
long-running job that interacts with the repo, this may cause essentially a
lock wait. Check how long the lock file exists during your delay.
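
A small probe can report whether the lock file is present and roughly how old it is. This is a minimal sketch, assuming a non-bare repository with the default .git layout; IndexLockProbe and lockAge are hypothetical names, and the age is taken from the file's last-modified time, which only approximates when the lock was taken.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

/** Hypothetical probe for the age of .git/index.lock in a working tree. */
public final class IndexLockProbe {
    private IndexLockProbe() {}

    /**
     * Returns the time since .git/index.lock was last modified,
     * or an empty Optional when no lock file exists.
     */
    public static Optional<Duration> lockAge(Path repoRoot) throws IOException {
        Path lock = repoRoot.resolve(".git").resolve("index.lock");
        try {
            Instant modified = Files.getLastModifiedTime(lock).toInstant();
            return Optional.of(Duration.between(modified, Instant.now()));
        } catch (NoSuchFileException e) {
            return Optional.empty(); // no lock held right now
        }
    }

    public static void main(String[] args) throws IOException {
        // Repository path is taken from the first argument; defaults to the current directory.
        Path repo = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(lockAge(repo)
                .map(age -> "index.lock present, age " + age.toMillis() + " ms")
                .orElse("no index.lock"));
    }
}
```

Running it every few seconds while a commit is slow would show whether the lock is held for the whole delay or only briefly.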

>Also, the machine memory size is 32GB and the machine type is /dev/nvme2n1
>
>The commit time starts increasing from 4-5s and goes up to 6-7mins, what could be
>the trigger for the commit to increase from 4-5s to 1min and so on in this scenario
>since before that there can't be any parallel commits ongoing onto the repository?
>Also, as I mentioned before, this issue is totally inconsistent.
>Let me know in case any other information is required.
>
>Could you please suggest which areas should we look for while identifying the cause
>of this issue? Also, does frequent commit of the content onto repository can trigger
>this issue?
>In your view, what might be the trigger of this issue and how we can proceed to
>resolve it?



end of thread, other threads:[~2024-07-12 11:35 UTC | newest]
