* Bursty I/O in ext3
@ 2006-03-14  7:32 Tong Li
  2006-03-14 15:29 ` Theodore Ts'o
  2006-03-14 16:46 ` Avishay Traeger
  0 siblings, 2 replies; 5+ messages in thread
From: Tong Li @ 2006-03-14  7:32 UTC (permalink / raw)
  To: linux-kernel

I'm running kernbench (make -j 128 on a kernel source) back to back 
multiple times on an SMP. Among every 10 runs, there's always at least one 
run that has a run time around 40% longer than the other runs. (Before 
kernbench starts timing, it does a sync.) 'vmstat 1' indicates that the 
longer runs always have a couple of 1-sec intervals during which there are 
10 times more block-outs (bo field) than the average traffic in the rest 
of the run, and during these intervals, many cc1 processes are in the D 
state. My file system is ext3 and all the things like journal commit 
interval, pdflush interval, etc. have the default values.
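One way to pin down those intervals is to log the 'vmstat 1' output to a file and scan the bo column afterwards. The following is only a minimal sketch: the function name find_bo_bursts and the factor parameter are hypothetical, and it uses the median of all samples as a stand-in for "average traffic" so that the bursts themselves do not inflate the baseline.

```python
import statistics


def find_bo_bursts(vmstat_text, factor=10.0):
    """Return (sample_index, bo) pairs where bo is at least `factor` times
    the median bo of the whole capture.

    Assumes the usual vmstat layout: a column-header line containing "bo",
    followed by whitespace-separated numeric rows.
    """
    bo_col = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "bo" in fields:
            # Column-header line (vmstat repeats it periodically).
            bo_col = fields.index("bo")
            continue
        if bo_col is None or len(fields) <= bo_col:
            continue  # group-header line or anything before the first header
        if not fields[bo_col].isdigit():
            continue  # not a data row
        samples.append(int(fields[bo_col]))

    if not samples:
        return []
    baseline = statistics.median(samples)
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return [(i, bo) for i, bo in enumerate(samples) if bo >= factor * baseline]
```

Feeding it the captured output of a slow run and of a normal run should show whether the bursts line up with the slow runs only.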

I'm trying to understand why such variability occurs. I tested the same 
thing with ext2 and did not see any variability. So I'm thinking about two 
things: (1) for some reason, ext3/jbd occasionally issues a large volume 
of bursty writes to the disk (but why does it occur just sometimes, not 
always?), and (2) when there are bursty writes, the block device driver is 
not able to handle them, causing I/O waits. But I don't really have a 
clear understanding of the problem here...
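To make the variability itself measurable across batches, the slow runs can be flagged automatically. This is a rough sketch, not part of kernbench: slow_runs and its threshold are hypothetical, and the 1.3 cutoff is an assumed margin chosen to catch runs that are "around 40%" longer than the typical (median) run.

```python
import statistics


def slow_runs(run_times, threshold=1.3):
    """Return (run_index, time) pairs for runs taking at least `threshold`
    times the median run time. The threshold is an assumption; tune it."""
    if not run_times:
        return []
    typical = statistics.median(run_times)
    return [(i, t) for i, t in enumerate(run_times) if t >= threshold * typical]
```

For example, slow_runs([100, 101, 99, 140, 100]) flags only the fourth run, since the median is 100 and 140 exceeds 1.3 times that.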

Does anyone have any insight on this, or any suggestion on how to figure 
it out?

Thanks,

   tong

PS. I'm not subscribed to the list, so please cc me.

* Re: Bursty I/O in ext3
  2006-03-14  7:32 Bursty I/O in ext3 Tong Li
@ 2006-03-14 15:29 ` Theodore Ts'o
  2006-03-14 21:51   ` Tong Li
  2006-03-14 16:46 ` Avishay Traeger
  1 sibling, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2006-03-14 15:29 UTC (permalink / raw)
  To: Tong Li; +Cc: linux-kernel

On Tue, Mar 14, 2006 at 02:32:17AM -0500, Tong Li wrote:
> I'm running kernbench (make -j 128 on a kernel source) back to back 
> multiple times on an SMP. Among every 10 runs, there's always at least one 
> run that has a run time around 40% longer than the other runs. (Before 
> kernbench starts timing, it does a sync.) 'vmstat 1' indicates that the 
> longer runs always have a couple of 1-sec intervals during which there are 
> 10 times more block-outs (bo field) than the average traffic in the rest 
> of the run, and during these intervals, many cc1 processes are in the D 
> state. My file system is ext3 and all the things like journal commit 
> interval, pdflush interval, etc. have the default values.
> 
> I'm trying to understand why such variability occurs. I tested the same 
> thing with ext2 and did not see any variability. So I'm thinking about two 
> things: (1) for some reason, ext3/jbd occasionally issues a large volume 
> of bursty writes to the disk (but why does it occur just sometimes, not 
> always?), and (2) when there are bursty writes, the block device driver is 
> not able to handle them, causing I/O waits. But I don't really have a 
> clear understanding of the problem here...

If you are using an e2fsprogs older than version 1.38, you should try
expanding the journal size from the default of 32M to 128M; with the
filesystem unmounted do:

	tune2fs -O ^has_journal /dev/hdXX
	tune2fs -O has_journal -J journal_size=128 /dev/hdXX

If the journal gets full and the filesystem has to do a forced journal
truncate, that can cause I/O's to stall and writes can thus get bursty
with performance becoming nasty as a result.  Increasing the journal
size can avoid this, at the cost of potentially having more disk
buffers be pinned in memory, thus increasing the overhead of
unswappable kernel memory.

Regards,

						- Ted

* Re: Bursty I/O in ext3
  2006-03-14  7:32 Bursty I/O in ext3 Tong Li
  2006-03-14 15:29 ` Theodore Ts'o
@ 2006-03-14 16:46 ` Avishay Traeger
  2006-03-14 21:52   ` Tong Li
  1 sibling, 1 reply; 5+ messages in thread
From: Avishay Traeger @ 2006-03-14 16:46 UTC (permalink / raw)
  To: Tong Li; +Cc: linux-kernel

On Tue, 2006-03-14 at 02:32 -0500, Tong Li wrote:
> Does anyone have any insight on this, or any suggestion on how to figure 
> it out?

I tried to recreate the condition, but failed (10 runs, all about the
same amount of time).  Is it possible that you have some other process
accessing the partition?

Avishay Traeger
http://www.fsl.cs.sunysb.edu/~avishay/


* Re: Bursty I/O in ext3
  2006-03-14 15:29 ` Theodore Ts'o
@ 2006-03-14 21:51   ` Tong Li
  0 siblings, 0 replies; 5+ messages in thread
From: Tong Li @ 2006-03-14 21:51 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-kernel

> If you are using an e2fsprogs older than version 1.38, you should try
> expanding the journal size from the default of 32M to 128M; with the
> filesystem unmounted do:
>
> 	tune2fs -O ^has_journal /dev/hdXX
> 	tune2fs -O has_journal -J journal_size=128 /dev/hdXX
>

I did this and yes, it fixed the problem.

Thank you so much,

   tong

* Re: Bursty I/O in ext3
  2006-03-14 16:46 ` Avishay Traeger
@ 2006-03-14 21:52   ` Tong Li
  0 siblings, 0 replies; 5+ messages in thread
From: Tong Li @ 2006-03-14 21:52 UTC (permalink / raw)
  To: Avishay Traeger; +Cc: linux-kernel

> I tried to recreate the condition, but failed (10 runs, all about the
> same amount of time).  Is it possible that you have some other process
> accessing the partition?

I don't have other processes running on the system, so I don't know...

Thanks,

   tong
