* ext3 writeback mode slower than ordered mode?
From: Zlatko Calusic @ 2001-12-08 21:10 UTC
To: sct, linux-mm, linux-kernel
Hi!

My apologies if this is an FAQ, and I'm still catching up with
the linux-kernel list.

Today I decided to convert my /tmp partition to be mounted in
writeback mode, as I noticed that ext3 in ordered mode syncs every 5
seconds and that is something definitely not needed for /tmp, IMHO.

Then I did some tests in order to prove my theory. :)

But, alas, writeback is slower.

[ordered]
{atlas} [~]% writer 200 1
Wrote 200.00 MB in 2 seconds -> 70.92 MB/s (100.0 %CPU)

[writeback]
{atlas} [/tmp]% writer 200 1
Wrote 200.00 MB in 5 seconds -> 37.11 MB/s (96.8 %CPU)

"writer" is a simple application that just writes to a file and
deletes it afterwards. As I have 768MB RAM, 200MB doesn't trigger I/O
in either case, so the numbers are a measure of the speed of the FS
internals, and as you can see writeback is running at half
speed (extra copy? why?). Strange...
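
For readers who want to try something similar, here is a minimal sketch of a "writer"-style test: write a file, delete it, report throughput. It is not the original program. The function name `write_and_delete`, the 1 MB chunk size, and the choice to include the unlink in the timed region are all assumptions made for illustration.

```python
import os
import tempfile
import time


def write_and_delete(directory, megabytes, chunk_size=1024 * 1024):
    """Write `megabytes` MB to a scratch file in `directory`, then delete it.

    Returns (path, elapsed_seconds, mb_per_second). Hypothetical helper,
    not the "writer" program from the thread.
    """
    chunk = b"\0" * chunk_size
    chunks = megabytes * 1024 * 1024 // chunk_size
    fd, path = tempfile.mkstemp(dir=directory, prefix="writer-")
    start = time.perf_counter()
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            # Deliberately no fsync(): the test is meant to stay in the
            # page cache and exercise the filesystem internals only.
    finally:
        os.unlink(path)  # delete the file straight after writing it
    elapsed = time.perf_counter() - start
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return path, elapsed, rate
```

Calling `write_and_delete("/tmp", 200)` and then the same on an ordered-mode mount gives two rates to compare, provided the file size stays well below available RAM as in the test above.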

Just to be on the safe side, I decided to test a real application, sort,
which uses $TMPDIR for temporary files. Once again, if I point $TMPDIR
to an ext3/writeback partition, sort takes longer to do its work. And
it's repeatable.

[$TMPDIR=/tmp writeback]
{atlas} [~]% time sort bigfile -o outfile
sort bigfile -o outfile 40.14s user 19.84s system 95% cpu 1:02.60 total

[$TMPDIR=~ ordered]
{atlas} [~]% time sort bigfile -o outfile
sort bigfile -o outfile 40.74s user 14.78s system 97% cpu 57.196 total

Notice the +5 seconds in sys time for the writeback case, and the
corresponding increase in wall-clock time.
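
The differences quoted above can be checked mechanically. The sketch below parses the two `time` lines from this message and subtracts them; the regular expression and the helper names (`parse_time_line`, `to_seconds`) are assumptions written for this particular output format, not a general parser.

```python
import re

# Matches lines such as:
#   sort bigfile -o outfile 40.14s user 19.84s system 95% cpu 1:02.60 total
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user\s+(?P<sys>[\d.]+)s system\s+"
    r"(?P<cpu>\d+)% cpu\s+(?P<total>[\d:.]+) total"
)


def to_seconds(stamp):
    """Convert 'SS.ss' or 'M:SS.ss' (or 'H:MM:SS.ss') to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_line(line):
    """Return user, sys and wall-clock seconds from one `time` line."""
    m = TIME_LINE.search(line)
    if m is None:
        raise ValueError("unrecognised time line: %r" % line)
    return {
        "user": float(m["user"]),
        "sys": float(m["sys"]),
        "total": to_seconds(m["total"]),
    }


writeback = parse_time_line(
    "sort bigfile -o outfile 40.14s user 19.84s system 95% cpu 1:02.60 total")
ordered = parse_time_line(
    "sort bigfile -o outfile 40.74s user 14.78s system 97% cpu 57.196 total")

sys_delta = writeback["sys"] - ordered["sys"]        # about 5.06 s
wall_delta = writeback["total"] - ordered["total"]   # about 5.40 s
print("sys +%.2fs, wall +%.2fs" % (sys_delta, wall_delta))
```

User time is nearly identical in both runs, so almost all of the extra wall-clock time is accounted for by the extra system time.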

All tests were done on 2.4.16, but the 2.5.x series exhibits the same
behaviour. In the end, I decided to continue mounting /tmp in the
default, ordered mode.

I'm confused, TIA for anybody clarifying this to me!
--
Zlatko

* Re: ext3 writeback mode slower than ordered mode?
From: Jan H. Schrewe @ 2001-12-08 21:57 UTC
To: zlatko.calusic; +Cc: linux-kernel

Zlatko Calusic wrote:
>
> I'm confused, TIA for anybody clarifying this to me!
> --
> Zlatko

Have a look at
http://www-106.ibm.com/developerworks/linux/library/l-fs8.html

cheers
Jan

* Re: ext3 writeback mode slower than ordered mode?
From: Andrew Morton @ 2001-12-09 1:59 UTC
To: zlatko.calusic; +Cc: sct, linux-mm, linux-kernel

Zlatko Calusic wrote:
>
> Hi!
>
> My apologies if this is an FAQ, and I'm still catching up with
> the linux-kernel list.
>
> Today I decided to convert my /tmp partition to be mounted in
> writeback mode, as I noticed that ext3 in ordered mode syncs every 5
> seconds and that is something definitely not needed for /tmp, IMHO.
>
> Then I did some tests in order to prove my theory. :)
>
> But, alas, writeback is slower.
>

I cannot reproduce this. Using http://www.zip.com.au/~akpm/writer.c

ext2:           0.03s user 1.43s system 97% cpu 1.501 total
ext3 writeback: 0.02s user 2.33s system 96% cpu 2.431 total
ext3 ordered:   0.02s user 2.52s system 98% cpu 2.574 total

ext3 is significantly more costly in either journalling mode,
probably because of the bitmap manipulation - each time we allocate
a block to the file, we have to muck around doing all sorts
of checks and list manipulations against the buffer which holds
the bitmap. Not only is this costly, but ext2 speculatively
sets a bunch of bits at the same time, which ext3 cannot do
for consistency reasons.

There are a few things we can do to pull this back, but given that
this is all pretty insignificant once you actually start doing disk
IO, we couldn't justify the risk of destabilising the filesystem
for small gains.
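
To illustrate why setting "a bunch of bits at the same time" saves work, here is a toy model, not kernel code. It only counts how often an allocator has to visit the bitmap when it claims one block per visit versus several. The class `BlockBitmap`, the function `allocate_file`, and the window size of 8 are all assumptions chosen for the illustration; the real ext2 and ext3 allocators are considerably more involved.

```python
class BlockBitmap:
    """Toy block bitmap that counts how often it is visited."""

    def __init__(self, nblocks):
        self.bits = [False] * nblocks
        self.next_free = 0
        self.visits = 0  # stand-in for the per-visit checks and list work

    def claim(self, count):
        """Mark up to `count` consecutive free blocks used in ONE visit."""
        self.visits += 1
        start = self.next_free
        end = min(start + count, len(self.bits))
        if start >= end:
            raise RuntimeError("bitmap full")
        for i in range(start, end):
            self.bits[i] = True
        self.next_free = end
        return list(range(start, end))


def allocate_file(bitmap, nblocks, window):
    """Allocate `nblocks` blocks, claiming `window` bits per bitmap visit."""
    reserved = []   # blocks already marked used but not yet handed out
    allocated = []
    while len(allocated) < nblocks:
        if not reserved:
            reserved = bitmap.claim(window)
        allocated.append(reserved.pop(0))
    return allocated


one_at_a_time = BlockBitmap(1024)
blocks_single = allocate_file(one_at_a_time, 256, window=1)

speculative = BlockBitmap(1024)
blocks_window = allocate_file(speculative, 256, window=8)

print(one_at_a_time.visits, speculative.visits)  # 256 32
```

Both runs hand out the same 256 blocks, but the windowed variant visits the bitmap an eighth as often. The model says nothing about why ext3 cannot do this; it only shows the size of the saving that is being given up.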

* Re: ext3 writeback mode slower than ordered mode?
From: Juan Piernas Canovas @ 2001-12-09 12:58 UTC
To: Andrew Morton; +Cc: zlatko.calusic, sct, linux-mm, linux-kernel

On Sat, 8 Dec 2001, Andrew Morton wrote:

> Zlatko Calusic wrote:
> >
> > Hi!
> >
> > My apologies if this is an FAQ, and I'm still catching up with
> > the linux-kernel list.
> >
> > Today I decided to convert my /tmp partition to be mounted in
> > writeback mode, as I noticed that ext3 in ordered mode syncs every 5
> > seconds and that is something definitely not needed for /tmp, IMHO.
> >
> > Then I did some tests in order to prove my theory. :)
> >
> > But, alas, writeback is slower.
> >
>
> I cannot reproduce this. Using http://www.zip.com.au/~akpm/writer.c
>
> ext2:           0.03s user 1.43s system 97% cpu 1.501 total
> ext3 writeback: 0.02s user 2.33s system 96% cpu 2.431 total
> ext3 ordered:   0.02s user 2.52s system 98% cpu 2.574 total
>
> ext3 is significantly more costly in either journalling mode,
> probably because of the bitmap manipulation - each time we allocate
> a block to the file, we have to muck around doing all sorts
> of checks and list manipulations against the buffer which holds
> the bitmap. Not only is this costly, but ext2 speculatively
> sets a bunch of bits at the same time, which ext3 cannot do
> for consistency reasons.
>
> There are a few things we can do to pull this back, but given that
> this is all pretty insignificant once you actually start doing disk
> IO, we couldn't justify the risk of destabilising the filesystem
> for small gains.

Hi!

Sorry, but I can confirm that Ext3 is slower with the "-o data=writeback"
option than with the "-o data=ordered" option when you create and delete
a lot of files. I use the 2.2.19 Linux kernel along with Ext3 version
0.0.7a.

Bye!

Juan.

* Re: ext3 writeback mode slower than ordered mode?
From: Zlatko Calusic @ 2001-12-09 19:46 UTC
To: Andrew Morton; +Cc: sct, linux-mm, linux-kernel

Andrew Morton <akpm@zip.com.au> writes:

> Zlatko Calusic wrote:
> >
> > Hi!
> >
> > My apologies if this is an FAQ, and I'm still catching up with
> > the linux-kernel list.
> >
> > Today I decided to convert my /tmp partition to be mounted in
> > writeback mode, as I noticed that ext3 in ordered mode syncs every 5
> > seconds and that is something definitely not needed for /tmp, IMHO.
> >
> > Then I did some tests in order to prove my theory. :)
> >
> > But, alas, writeback is slower.
> >
>
> I cannot reproduce this. Using http://www.zip.com.au/~akpm/writer.c
>
> ext2:           0.03s user 1.43s system 97% cpu 1.501 total
> ext3 writeback: 0.02s user 2.33s system 96% cpu 2.431 total
> ext3 ordered:   0.02s user 2.52s system 98% cpu 2.574 total
>

Hm, at first I got exactly the same results for the writeback/ordered
cases as you did above, so my theory fell to the ground. Later, the
bloody thing resurrected again. Something really fishy is goin' on
here...

{atlas} [/mnt]# time ~zcalusic/try/awriter
~zcalusic/try/awriter 0.07s user 3.50s system 99% cpu 3.594 total
{atlas} [/mnt]# cd /tmp
{atlas} [/tmp]# time ~zcalusic/try/awriter
~zcalusic/try/awriter 0.00s user 6.05s system 98% cpu 6.129 total
{atlas} [/tmp]# mount | egrep '/tmp|/mnt'
/dev/hde2 on /tmp type ext3 (rw,data=writeback)
/dev/hde3 on /mnt type ext3 (rw)

So /tmp is writeback and /mnt is ordered (doublechecked!). See for
yourself how ext3 is slower in writeback mode. awriter is your small
program, of course.

Just for the record, I mke2fs-ed /dev/hde3 again and made it pure
ext2.

{atlas} [~]# mount | grep '/mnt'
/dev/hde3 on /mnt type ext2 (rw)
{atlas} [~]# cd /mnt
{atlas} [/mnt]# time ~zcalusic/try/awriter
~zcalusic/try/awriter 0.01s user 1.86s system 98% cpu 1.893 total

To summarize:

ext2            0.01s user 1.86s system 98% cpu 1.893 total
ext3/ordered    0.07s user 3.50s system 99% cpu 3.594 total
ext3/writeback  0.00s user 6.05s system 98% cpu 6.129 total

What is strange is that I haven't always been able to get different
results for the writeback case (compared to ordered), but when I do,
it is repeatable.

This is an SMP machine, if that makes any difference.

Regards,
--
Zlatko
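
The "which mount is in which mode" check in this message can be scripted. The sketch below reads `mount`-style lines like the ones shown and reports the ext3 data mode, treating a missing `data=` option as ordered because the thread describes ordered as the default. The function name `data_mode` and the regular expression are assumptions for illustration, and the result is only as reliable as the options that `mount` reports.

```python
import re

# Matches lines such as: /dev/hde2 on /tmp type ext3 (rw,data=writeback)
MOUNT_LINE = re.compile(
    r"^(?P<dev>\S+) on (?P<mnt>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)$"
)


def data_mode(line):
    """Return (mount_point, mode) where mode is None for non-ext3 mounts."""
    m = MOUNT_LINE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised mount line: %r" % line)
    if m["fstype"] != "ext3":
        return m["mnt"], None
    for opt in m["opts"].split(","):
        if opt.startswith("data="):
            return m["mnt"], opt[len("data="):]
    # No explicit data= option: assume the default, ordered mode.
    return m["mnt"], "ordered"


for line in (
    "/dev/hde2 on /tmp type ext3 (rw,data=writeback)",
    "/dev/hde3 on /mnt type ext3 (rw)",
):
    print(data_mode(line))
```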

* Re: ext3 writeback mode slower than ordered mode?
From: Stephen C. Tweedie @ 2001-12-10 18:18 UTC
To: Zlatko Calusic; +Cc: Andrew Morton, sct, linux-mm, linux-kernel

Hi,

On Sun, Dec 09, 2001 at 08:46:02PM +0100, Zlatko Calusic wrote:

> To summarize:
>
> ext2            0.01s user 1.86s system 98% cpu 1.893 total
> ext3/ordered    0.07s user 3.50s system 99% cpu 3.594 total
> ext3/writeback  0.00s user 6.05s system 98% cpu 6.129 total
>
> What is strange is that not always I've been able to get different
> results for writeback case (comparing to ordered), but when I get it,
> it is repeatable.

So it could be something as basic as disk layout or allocation
pattern. Hmm.

Could you profile the kernel and see where writeback is spending all
the time, in that case?

Thanks,
 Stephen

* Re: ext3 writeback mode slower than ordered mode?
From: Zlatko Calusic @ 2001-12-11 22:31 UTC
To: Stephen C. Tweedie; +Cc: Andrew Morton, linux-mm, linux-kernel

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On Sun, Dec 09, 2001 at 08:46:02PM +0100, Zlatko Calusic wrote:
>
> > To summarize:
> >
> > ext2            0.01s user 1.86s system 98% cpu 1.893 total
> > ext3/ordered    0.07s user 3.50s system 99% cpu 3.594 total
> > ext3/writeback  0.00s user 6.05s system 98% cpu 6.129 total
> >
> > What is strange is that not always I've been able to get different
> > results for writeback case (comparing to ordered), but when I get it,
> > it is repeatable.
>
> So it could be something as basic as disk layout or allocation
> pattern. Hmm.

Hm, I'm not that sure about disk layout, as nothing actually hits the
disk platter in these tests, but the latter reason is possible.

>
> Could you profile the kernel and see where writeback is spending all
> the time, in that case?

I have made a simple test and collected kernel profiling data. The
test consists of repeatedly writing a 100MB file (on a 768MB
machine) and deleting it immediately after the write is finished. In
a loop, 100 times.

ordered:

 51611 total                          0.0392
 34550 default_idle                 664.4231
  4941 generic_file_write             3.0575
   741 journal_dirty_metadata         1.9500
   727 get_hash_table                 4.5438
   566 journal_add_journal_head       2.2109
   561 do_get_write_access            0.4510
   514 journal_get_write_access       5.5870
   371 journal_cancel_revoke          2.0163
   368 ext3_do_update_inode           0.4000
   323 journal_unlock_journal_head    2.9907
   311 ext3_new_block                 0.1747
   293 rmqueue                        0.6315
   272 ext3_get_inode_loc             0.7234
   192 handle_IRQ_event               1.5484
   182 __brelse                       5.6875
   175 ext3_get_block_handle          0.2701
   174 kmem_cache_alloc               0.6493
   161 ext3_commit_write              0.3073
   147 journal_flushpage              0.5176

writeback:

 53652 total                          0.0407
 23781 default_idle                 457.3269
  4700 generic_file_write             2.9084
  2429 get_hash_table                15.1813
  2026 journal_dirty_metadata         5.3316
  1423 do_get_write_access            1.1439
  1348 journal_get_write_access      14.6522
  1056 journal_cancel_revoke          5.7391
  1025 journal_add_journal_head       4.0039
   869 ext3_new_block                 0.4882
   807 journal_unlock_journal_head    7.4722
   755 ext3_do_update_inode           0.8207
   580 ext3_get_inode_loc             1.5426
   572 ext3_get_block_handle          0.8827
   454 __brelse                      14.1875
   347 journal_flushpage              1.2218
   329 rmqueue                        0.7091
   317 ext3_mark_iloc_dirty           4.4028
   315 do_generic_file_read           0.2853
   308 unlock_buffer                  4.8125

Notice how the numbers for the writeback case are much bigger. But
the strange thing is that the total time hasn't changed?! So my program
reports half the throughput and the profile numbers are much bigger for
the writeback case, but in both cases the tests finish in about the same
time. Tell me I'm not goin' nuts?!

Yes, I have reset the profile counter correctly between the runs.

Also, if I change to another writeback mounted partition (on the same
disk, nearby) it behaves normally (similar numbers as on the ordered
mounted one). Why is my /tmp so special? :)

***

And now, something completely different.

When mounted in ordered mode, and doing the test above (writing &
deleting), the kernel leaks memory. In fact, such memory can be easily
recovered, but still, such behaviour creates unwanted memory pressure,
forces stuff to disk too early and even produces some swapping.

Every time a file of 100MB was written and unlinked immediately
afterwards (before the FS had a chance to commit it to disk), ~100MB of
memory stayed allocated. It looks like buffer heads which are pinning
page cache pages, but as I deleted the file, shouldn't that memory be
freed? Another write and there goes another 100MB of RAM...

This is how things looked just before the test (most of the memory
free)

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  50124 516816  40220  53356   0   0     0    60 1696   949   0   3  96

and after the test (memory gone)

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0  58888  99988  11608  35008   0   0     0    60 1546   924   0   1  99

/proc/slabinfo (the only suspicious entry)

buffer_head       134513 159440     96 3543 3986    1 :  252  126

Note that this only happens when the partition is mounted in
ordered mode.

OK, I know this is all confusing, but I'm just trying to help weed
out bugs and maybe understand a thing or two about ext3. :)

Regards,
--
Zlatko
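
One way to read the two profile listings in this message is to separate idle ticks from everything else. The sketch below does that for the first few rows of each listing, assuming each line has the form "ticks function load" as shown; the helper names `parse_profile` and `busy_ticks` are made up for the example, and only a subset of the rows is included to keep it short.

```python
def parse_profile(text):
    """Return {function_name: ticks} from 'ticks name load' lines."""
    ticks = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a profile row
        ticks[fields[1]] = int(fields[0])
    return ticks


# First rows of each listing, copied from the message.
ORDERED = """
51611 total 0.0392
34550 default_idle 664.4231
4941 generic_file_write 3.0575
741 journal_dirty_metadata 1.9500
727 get_hash_table 4.5438
"""

WRITEBACK = """
53652 total 0.0407
23781 default_idle 457.3269
4700 generic_file_write 2.9084
2429 get_hash_table 15.1813
2026 journal_dirty_metadata 5.3316
"""


def busy_ticks(profile):
    """Ticks spent anywhere other than the idle loop."""
    return profile["total"] - profile["default_idle"]


ordered = parse_profile(ORDERED)
writeback = parse_profile(WRITEBACK)

print("busy ticks:", busy_ticks(ordered), busy_ticks(writeback))  # 17061 29871
for name in ("generic_file_write", "get_hash_table", "journal_dirty_metadata"):
    print("%-24s x%.2f" % (name, writeback[name] / ordered[name]))
```

Total ticks are within a few percent of each other (51611 versus 53652), which fits the observation that both runs finish in about the same time, while non-idle ticks rise from 17061 to 29871. That is at least consistent with the extra work in the writeback run being absorbed by otherwise idle CPU time on an SMP machine, though the profile alone does not prove it. `generic_file_write` is nearly unchanged, so the increase sits in the journalling and buffer lookup functions.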