* cfq misbehaving on 2.6.11-1.14_FC3
@ 2005-06-10 22:54 spaminos-ker
  2005-06-11  9:29 ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-10 22:54 UTC (permalink / raw)
To: linux-kernel

Hello, I am running into a very bad problem on one of my production servers.

* the config
Linux Fedora Core 3, latest everything, kernel 2.6.11-1.14_FC3
AMD Opteron 2 GHz, 1 GB RAM, 80 GB hard drive (IDE, Western Digital)

I have a log processor running in the background; it's using sqlite for storing
the information it finds in the logs. It takes a few hours to complete a run.
It's clearly I/O bound (SleepAVG = 98%, according to /proc/pid/status).
I have to use the cfq scheduler because it's the only scheduler that is fair
between processes (or should be, keep reading).

* the problem
Now, after an hour or so of processing, the machine becomes very unresponsive
when trying to do new disk operations. I say new because existing processes
that stream data to disk don't seem to suffer so much.

On the other hand, opening a blank new file in vi and saving it takes about 5
minutes or so. Logging in with ssh just times out (so I have to keep a
connection open to avoid being locked out). << that's where it's a really bad
problem for me :)

Now, if I switch the disk to anticipatory or deadline, by setting
/sys/block/hda/queue/scheduler, things go back to regular times very quickly.
Saving a file in vi takes about 12 seconds (slow, but not unbearable,
considering the machine is doing a lot of things). Logging in takes less than
a second.

I did an strace on the process that is causing havoc, and the pattern of
usage is:
* open files
* about 5000 combinations of llseek+read and llseek+write, in 1000-byte requests
* close files

The process is also niced to 8, but it doesn't seem to make any difference.
I found references to an "ionice" or "iorenice" syscall, but that doesn't seem
to exist anymore.
I thought that the I/O scheduler was taking the priority into account? Is this
a known problem? I also thought that timed cfq was supposed to take care of
such workloads?

Any idea on how I could improve the situation?

Thanks

Nicolas

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-10 22:54 cfq misbehaving on 2.6.11-1.14_FC3 spaminos-ker
@ 2005-06-11  9:29 ` Andrew Morton
  2005-06-14  2:19   ` spaminos-ker
  0 siblings, 1 reply; 16+ messages in thread

From: Andrew Morton @ 2005-06-11 9:29 UTC (permalink / raw)
To: spaminos-ker; +Cc: linux-kernel

<spaminos-ker@yahoo.com> wrote:
>
> Hello, I am running into a very bad problem on one of my production servers.
>
> * the config
> Linux Fedora Core 3, latest everything, kernel 2.6.11-1.14_FC3
> AMD Opteron 2 GHz, 1 GB RAM, 80 GB hard drive (IDE, Western Digital)
>
> I have a log processor running in the background; it's using sqlite for storing
> the information it finds in the logs. It takes a few hours to complete a run.
> It's clearly I/O bound (SleepAVG = 98%, according to /proc/pid/status).
> I have to use the cfq scheduler because it's the only scheduler that is fair
> between processes (or should be, keep reading).
>
> * the problem
> Now, after an hour or so of processing, the machine becomes very unresponsive
> when trying to do new disk operations. I say new because existing processes
> that stream data to disk don't seem to suffer so much.

It might be useful to test 2.6.12-rc6-mm1 - it has a substantially rewritten
CFQ implementation.

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-11  9:29 ` Andrew Morton
@ 2005-06-14  2:19   ` spaminos-ker
  2005-06-14  7:03     ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-14 2:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel

--- Andrew Morton <akpm@osdl.org> wrote:
> It might be useful to test 2.6.12-rc6-mm1 - it has a substantially
> rewritten CFQ implementation.

Just did, and while things seem to be a little better, cfq still gets
performance even worse than noop. For this type of load, I think that cfq
should get latencies much lower than noop.

I ran an automated vi "write to file", to get a more persistent test, on the
different I/O schedulers:

while true ; do time vi -c '%s/a/aa/g' -c '%s/aa/a/g' -c 'x' /root/somefile > /dev/null ; sleep 1m ; done

For some reason, doing a "cp" or appending to files is very fast. I suspect
that vi's mmap calls are the reason for the latency problem.

The times I got (to save a 200-byte file on ext3), in seconds:

cfq          13,19,23,19,23,15,14,16,14   = 17.3 avg
deadline     7,12,11,15,15,8,17,14,16,11  = 12.6 avg
noop         23,12,14,12,12,13,14,14,14   = 14.2 avg
anticipatory 9,13,13,15,19,15,23,15,12    = 14.8 avg

Here is the memory status:

top - 17:07:44 up 1:42, 1 user, load average: 3.74, 3.62, 3.29
Tasks: 55 total, 2 running, 53 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 0.0% id, 99.0% wa, 1.0% hi, 0.0% si
Mem: 1035156k total, 1019344k used, 15812k free, 30092k buffers
Swap: 4192956k total, 0k used, 4192956k free, 671724k cached

and the disk activity (as you can see, mostly writes at this point, as I
think most of the data is cached in memory).
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  1      0  20368  30320 670780    0    0    45  1189  498   201 26  2 20 52
 0  3      0  19376  30320 671916    0    0   128  1052  512   211 77  5  0 18
 0  3      0  19376  30320 671960    0    0     0  1220  543   231  3  0  0 97
 0  3      0  19128  30320 672136    0    0     0  2284  658   250 13  1  0 86
 0  3      0  19128  30320 672220    0    0     0  1160  535   222  7  0  0 93
 1  2      0  18880  30320 672376    0    0     0  1040  509   204 13  0  0 87
 0  3      0  18756  30320 672496    0    0     0  1076  514   210 11  1  0 88
 0  3      0  18260  30320 672680    0    0     0  1052  559   356 18  3  0 79
 1  1      0  19376  30328 671692    0    0     0   876  529   187 64  3  0 33
 1  3      0  18384  30340 672620    0    0   128  2856  515   197 64  5  0 31
 0  4      0  18136  30340 672856    0    0     0  1204  546   234 21  0  0 79
 0  4      0  18136  30340 672916    0    0     0  1124  530   231  5  2  0 93
 0  4      0  18136  30340 672976    0    0     0  2212  627   255  7  1  0 92
 0  4      0  18012  30340 673064    0    0     0  1092  523   235  7  1  0 92
 0  4      0  17888  30340 673228    0    0     0  1188  545   239 12  0  0 88
 1  3      0  17640  30340 673500    0    0     0  1092  515   229 26  0  0 74
 0  4      0  17392  30340 673684    0    0     0  1032  515   236 15  1  0 84
 1  1      0  17888  30348 672480    0    0     0  1560  568   249 41  4  0 55
 1  3      0  16896  30360 673524    0    0   128  1976  586   223 74  3  0 23
 0  4      0  16524  30360 673800    0    0     0  1112  522   233 25  1  0 74
 0  4      0  16524  30360 673844    0    0     0  1600  588   257  4  1  0 95

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-14  2:19 ` spaminos-ker
@ 2005-06-14  7:03   ` Andrew Morton
  2005-06-14 23:21     ` spaminos-ker
  0 siblings, 1 reply; 16+ messages in thread

From: Andrew Morton @ 2005-06-14 7:03 UTC (permalink / raw)
To: spaminos-ker; +Cc: linux-kernel

<spaminos-ker@yahoo.com> wrote:
>
> --- Andrew Morton <akpm@osdl.org> wrote:
> > It might be useful to test 2.6.12-rc6-mm1 - it has a substantially
> > rewritten CFQ implementation.
>
> Just did, and while things seem to be a little better, cfq still gets
> performance even worse than noop.
>
> For this type of load, I think that cfq should get latencies much lower than
> noop.
>
> I ran an automated vi "write to file", to get a more persistent test, on the
> different I/O schedulers.
>
> while true ; do time vi -c '%s/a/aa/g' -c '%s/aa/a/g' -c 'x' /root/somefile > /dev/null ; sleep 1m ; done

Bear in mind that after one minute, all of vi's text may have been reclaimed
from pagecache, so the above would have to do a lot of randomish reads to
reload vi into memory. Try reducing the sleep interval a lot.

> For some reason, doing a "cp" or appending to files is very fast. I suspect
> that vi's mmap calls are the reason for the latency problem.

Don't know. Try to work out (from vmstat or diskstats) how much reading is
going on.

Try stracing the check, see if your version of vi is doing a sync() or
something odd like that.

> the times I got (to save a 200-byte file on ext3) in seconds:
>
> cfq          13,19,23,19,23,15,14,16,14   = 17.3 avg
> deadline     7,12,11,15,15,8,17,14,16,11  = 12.6 avg
> noop         23,12,14,12,12,13,14,14,14   = 14.2 avg
> anticipatory 9,13,13,15,19,15,23,15,12    = 14.8 avg

OK, well if the latency is mainly due to reads then one would hope that the
anticipatory scheduler would do better than that.

But what happened to this, from your first report?

> On the other hand, opening a blank new file in vi and saving it takes about 5
> minutes or so.
Are you able to reproduce that 5-minute stall in the more recent testing? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-14  7:03 ` Andrew Morton
@ 2005-06-14 23:21   ` spaminos-ker
  2005-06-17 14:10     ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-14 23:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel

--- Andrew Morton <akpm@osdl.org> wrote:
> > For some reason, doing a "cp" or appending to files is very fast. I suspect
> > that vi's mmap calls are the reason for the latency problem.
>
> Don't know. Try to work out (from vmstat or diskstats) how much reading is
> going on.
>
> Try stracing the check, see if your version of vi is doing a sync() or
> something odd like that.

The read/write patterns of the background process is about 35% reads.

vi is indeed doing a sync on the open file, and that's where the time was
spent. So I just changed my test to simply opening a file, writing some data
in it, and calling fsync on the fd. I also reduced the sleep to 1s instead of
1m, and here are the results:

cfq:      20,20,21,21,20,22,20,20,18,21 - avg 20.3
noop:     12,12,12,13,5,10,10,12,12,13  - avg 11.1
deadline: 16,9,16,14,10,6,8,8,15,9      - avg 11.1
as:       6,11,14,11,9,15,16,9,8,9      - avg 10.8

As you can see, cfq stands out (and it should stand out the other way).

> OK, well if the latency is mainly due to reads then one would hope that the
> anticipatory scheduler would do better than that.

I suspect the latency is due to writes: it seems (and correct me if I am
wrong) that write requests are enqueued in one giant queue, thus the cfq
algorithm cannot be applied to the requests. Either that, or there is a
different queue that cancels out the benefits of cfq when writing (because
even though the writes are done the right way, this other queue to the device
keeps way too much data). But then, why would other i/o schedulers perform
better in that case?

> But what happened to this, from your first report?
>
> > On the other hand, opening a blank new file in vi and saving it takes about 5
> > minutes or so.
> Are you able to reproduce that 5-minute stall in the more recent testing?

The most I got with this kernel is a 1-minute stall, so there is improvement
there. Yet, a single process should not be able to cause this kind of stall
with cfq.

Nicolas

------------------------------------------------------------
video meliora proboque deteriora sequor
------------------------------------------------------------

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-14 23:21 ` spaminos-ker
@ 2005-06-17 14:10   ` Jens Axboe
  2005-06-17 15:51     ` Andrea Arcangeli
  2005-06-17 23:01     ` spaminos-ker
  0 siblings, 2 replies; 16+ messages in thread

From: Jens Axboe @ 2005-06-17 14:10 UTC (permalink / raw)
To: spaminos-ker; +Cc: Andrew Morton, linux-kernel

On Tue, Jun 14 2005, spaminos-ker@yahoo.com wrote:
> --- Andrew Morton <akpm@osdl.org> wrote:
> > > For some reason, doing a "cp" or appending to files is very fast. I suspect
> > > that vi's mmap calls are the reason for the latency problem.
> >
> > Don't know. Try to work out (from vmstat or diskstats) how much reading is
> > going on.
> >
> > Try stracing the check, see if your version of vi is doing a sync() or
> > something odd like that.
>
> The read/write patterns of the background process is about 35% reads.
>
> vi is indeed doing a sync on the open file, and that's where the time
> was spent. So I just changed my test to simply opening a file,
> writing some data in it and calling fsync on the fd.
>
> I also reduced the sleep to 1s instead of 1m, and here are the
> results:
>
> cfq:      20,20,21,21,20,22,20,20,18,21 - avg 20.3
> noop:     12,12,12,13,5,10,10,12,12,13  - avg 11.1
> deadline: 16,9,16,14,10,6,8,8,15,9      - avg 11.1
> as:       6,11,14,11,9,15,16,9,8,9      - avg 10.8
>
> As you can see, cfq stands out (and it should stand out the other
> way).

This doesn't look good (or expected) at all. In the initial posting you
mention this being an ide driver - I want to make sure whether it's hda or
sata driven (eg sda or similar)?

> > OK, well if the latency is mainly due to reads then one would hope that the
> > anticipatory scheduler would do better than that.
>
> I suspect the latency is due to writes: it seems (and correct me if I
> am wrong) that write requests are enqueued in one giant queue, thus
> the cfq algorithm can not be applied to the requests.

That is correct.
Each process has a sync queue associated with it; async requests like writes
go to a per-device async queue. The cost of tracking who dirtied a given page
was too large and not worth it. Perhaps rmap could be used to look up who has
a specific page mapped...

> But then, why would other i/o schedulers perform better in that case?

Yeah, the global write queue doesn't explain anything, the other schedulers
either share the read/write queue or have a separate single write queue as
well.

I'll try and reproduce (and fix) your problem.

--
Jens Axboe

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-17 14:10 ` Jens Axboe
@ 2005-06-17 15:51   ` Andrea Arcangeli
  2005-06-17 18:16     ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Andrea Arcangeli @ 2005-06-17 15:51 UTC (permalink / raw)
To: Jens Axboe; +Cc: spaminos-ker, Andrew Morton, linux-kernel

On Fri, Jun 17, 2005 at 04:10:40PM +0200, Jens Axboe wrote:
> Perhaps rmap could be used to lookup who has a specific page mapped...

I doubt it; the computing and locking cost for every single page write would
probably be too high. Doing it during swapping isn't a big deal since the cpu
is mostly idle during swapouts, but doing it all the time sounds a bit
overkill.

A mechanism to pass down a pid would be much better. However, I'm unsure
where you could put the info while dirtying the page. If it were a uid it
might be reasonable to have it in the address_space, but if you want a pid as
index, then it'd need to go in the page_t, which would waste tons of space.
Having a pid in the address space may not work well with a database or some
other app with multiple processes.

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-17 15:51 ` Andrea Arcangeli
@ 2005-06-17 18:16   ` Jens Axboe
  0 siblings, 0 replies; 16+ messages in thread

From: Jens Axboe @ 2005-06-17 18:16 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: spaminos-ker, Andrew Morton, linux-kernel

On Fri, Jun 17 2005, Andrea Arcangeli wrote:
> On Fri, Jun 17, 2005 at 04:10:40PM +0200, Jens Axboe wrote:
> > Perhaps rmap could be used to lookup who has a specific page mapped...
>
> I doubt, the computing and locking cost for every single page write
> would be probably too high. Doing it during swapping isn't a big deal
> since cpu is mostly idle during swapouts, but doing it all the time
> sounds a bit overkill.

We could cut the lookup down to per-request; it's not very likely that
separate threads would be competing for the exact same disk location. But
it's still not too nice...

> A mechanism to pass down a pid would be much better. However I'm unsure
> where you could put the info while dirtying the page. If it was an uid
> it might be reasonable to have it in the address_space, but if you want
> a pid as index, then it'd need to go in the page_t, which would waste
> tons of space. Having a pid in the address space, may not work well with
> a database or some other app with multiple processes.

The previous patch just added a pid_t to struct page, but I knew all along
that this was just for testing; I never intended to merge that part.

--
Jens Axboe

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-17 14:10 ` Jens Axboe
  2005-06-17 15:51 ` Andrea Arcangeli
@ 2005-06-17 23:01 ` spaminos-ker
  2005-06-22  9:24   ` Jens Axboe
  1 sibling, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-17 23:01 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, linux-kernel

--- Jens Axboe <axboe@suse.de> wrote:
> This doesn't look good (or expected) at all. In the initial posting you
> mention this being an ide driver - I want to make sure if it's hda or
> sata driven (eg sda or similar)?

This is a regular IDE drive (a WDC WD800JB), no SATA, using hda. I didn't
mention it before, but this is on an AMD8111 board.

> I'll try and reproduce (and fix) your problem.

I don't know how all this works, but would there be a way to slow down the
offending writer by not allowing too many pending write requests per process?
Is there a tunable for the size of the write queue for a given device?
Reducing it will reduce the throughput, but the latency as well. Of course,
there has to be a way to get this to work right.

To go back to high latencies, maybe a different problem (but at least closely
related):

If I start in the background the command

dd if=/dev/zero of=/tmp/somefile2 bs=1024

and then run my test program in a loop, with

while true ; do time ./io 1; sleep 1s ; done

I get:

cfq:      47,33,27,48,32,29,26,49,25,47 -> 36.3 avg
deadline: 32,28,52,33,35,29,49,39,40,33 -> 37 avg
noop:     62,47,57,39,59,44,56,49,57,47 -> 51.7 avg

Now, cfq doesn't behave worse than the others, as expected (now, why it
behaved worse with the real daemons, I don't know). Still, > 30 seconds has
to be improved for cfq.
the test program being:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd, bytes;

	fd = open("/tmp/somefile", O_WRONLY | O_CREAT, S_IRWXU);
	if (fd < 0) {
		perror("Could not open file");
		return 1;
	}
	bytes = write(fd, &fd, sizeof(fd));
	if (bytes < (int) sizeof(fd)) {
		perror("Could not write");
		return 2;
	}
	if (argc != 1) {
		fsync(fd);
	}
	close(fd);
	return 0;
}

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-17 23:01 ` spaminos-ker
@ 2005-06-22  9:24   ` Jens Axboe
  2005-06-22 17:54     ` spaminos-ker
  0 siblings, 1 reply; 16+ messages in thread

From: Jens Axboe @ 2005-06-22 9:24 UTC (permalink / raw)
To: spaminos-ker; +Cc: Andrew Morton, linux-kernel

On Fri, 2005-06-17 at 16:01 -0700, spaminos-ker@yahoo.com wrote:
> I don't know how all this works, but would there be a way to slow down the
> offending writer by not allowing too many pending write requests per process?
> Is there a tunable for the size of the write queue for a given device?
> Reducing it will reduce the throughput, but the latency as well.

The 2.4 SUSE kernel actually has something in place to limit in-flight write
requests against a single device. cfq will already limit the number of write
requests you can have in-flight against a single queue, but it's request
based and not size based.

> Of course, there has to be a way to get this to work right.
>
> To go back to high latencies, maybe a different problem (but at least closely
> related):
>
> If I start in the background the command
> dd if=/dev/zero of=/tmp/somefile2 bs=1024
>
> and then run my test program in a loop, with
> while true ; do time ./io 1; sleep 1s ; done
>
> I get:
>
> cfq:      47,33,27,48,32,29,26,49,25,47 -> 36.3 avg
> deadline: 32,28,52,33,35,29,49,39,40,33 -> 37 avg
> noop:     62,47,57,39,59,44,56,49,57,47 -> 51.7 avg
>
> Now, cfq doesn't behave worse than the others, as expected (now, why it
> behaved worse with the real daemons, I don't know).
> Still > 30 seconds has to be improved for cfq.

The problem here is that cfq (and the other io schedulers) still consider the
io async even if fsync() ends up waiting for it to complete. So there's no
real QOS being applied to these pending writes, and I don't immediately see
how we can improve that situation right now.

What file system are you using? I ran your test on ext2, and it didn't give
me more than ~2 seconds latency for the fsync.
Tried reiserfs now, and it's in the 23-24 range. -- Jens Axboe <axboe@suse.de> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-22  9:24 ` Jens Axboe
@ 2005-06-22 17:54   ` spaminos-ker
  2005-06-22 20:43     ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-22 17:54 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, linux-kernel

--- Jens Axboe <axboe@suse.de> wrote:
> The problem here is that cfq (and the other io schedulers) still
> consider the io async even if fsync() ends up waiting for it to
> complete. So there's no real QOS being applied to these pending writes,
> and I don't immediately see how we can improve that situation right now.

<I might sound stupid>
I still don't understand why async requests are in a different queue than the
sync ones? Wouldn't it be simpler to consider all the IO the same, and like
you pointed out, consider synced IO to be equivalent to async + some sync (as
in wait for completion) call (fsync goes a little too far).
</I might sound stupid>

> What file system are you using? I ran your test on ext2, and it didn't
> give me more than ~2 seconds latency for the fsync. Tried reiserfs now,
> and it's in the 23-24 range.

I am using ext3 on Fedora Core 3.

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-22 17:54 ` spaminos-ker
@ 2005-06-22 20:43   ` Jens Axboe
  2005-06-23 18:30     ` spaminos-ker
  0 siblings, 1 reply; 16+ messages in thread

From: Jens Axboe @ 2005-06-22 20:43 UTC (permalink / raw)
To: spaminos-ker; +Cc: Andrew Morton, linux-kernel

On Wed, Jun 22 2005, spaminos-ker@yahoo.com wrote:
> --- Jens Axboe <axboe@suse.de> wrote:
> > The problem here is that cfq (and the other io schedulers) still
> > consider the io async even if fsync() ends up waiting for it to
> > complete. So there's no real QOS being applied to these pending writes,
> > and I don't immediately see how we can improve that situation right now.
>
> <I might sound stupid>
> I still don't understand why async requests are in a different queue than the
> sync ones?
> Wouldn't it be simpler to consider all the IO the same, and like you pointed
> out, consider synced IO to be equivalent to async + some sync (as in wait for
> completion) call (fsync goes a little too far).
> </I might sound stupid>

First, let's cover a little terminology. All io is really async in Linux; the
block io model is inherently async in nature. So sync io is really just async
io that is being waited on immediately.

When I talk about sync and async io in the context of the io scheduler, the
sync io refers to io that is wanted right away. That would be reads or direct
writes. The async io is something that we can complete at will, where latency
typically doesn't matter. That would be normal dirtying of data that needs to
be flushed to disk.

Another property of sync io in the io scheduler is that it usually implies
that another sync io request will follow immediately (well, almost) after one
has completed. So there's a dependency relation between sync requests that
async requests don't share.

So there are different requirements for sync and async io.
The io scheduler tries to minimize latencies for async requests somewhat,
mainly just by making sure that it isn't starved for too long. However, when
you do an fsync, you want to complete lots of writes, but the io scheduler
doesn't get this info passed down. If you keep flooding the queue with new
writes, this could take quite a while to finish.

We could improve this situation by only flushing out the needed data, or just
a simple hack to only flush out already queued io (provided the fsync()
already made sure that the correct data is already queued). I will try and
play a little with this; it's definitely something that would be interesting
and worthwhile to improve.

> > What file system are you using? I ran your test on ext2, and it didn't
> > give me more than ~2 seconds latency for the fsync. Tried reiserfs now,
> > and it's in the 23-24 range.
>
> I am using ext3 on Fedora Core 3.

Journalled file systems will behave worse for this, because it has to tend to
the journal as well. Can you try mounting that partition as ext2 and see what
numbers that gives you?

--
Jens Axboe

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-22 20:43 ` Jens Axboe
@ 2005-06-23 18:30   ` spaminos-ker
  2005-06-23 23:33     ` Con Kolivas
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-23 18:30 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, linux-kernel

--- Jens Axboe <axboe@suse.de> wrote:
> Journalled file systems will behave worse for this, because it has to
> tend to the journal as well. Can you try mounting that partition as ext2
> and see what numbers that gives you?

I did the tests again on a partition that I could mkfs/mount at will.

On ext3, I get about 33 seconds average latency. And on ext2, as predicted,
I have latencies in average of about 0.4 seconds. I also tried reiserfs, and
it gets about 22 seconds latency.

As you pointed out, it seems that there is a flaw in the way IO queues and
journals (which are in some ways queues as well) interact in the presence of
flushes.

Nicolas

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-23 18:30 ` spaminos-ker
@ 2005-06-23 23:33   ` Con Kolivas
  2005-06-24  2:33     ` spaminos-ker
  0 siblings, 1 reply; 16+ messages in thread

From: Con Kolivas @ 2005-06-23 23:33 UTC (permalink / raw)
To: linux-kernel, spaminos-ker; +Cc: Jens Axboe, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]

On Fri, 24 Jun 2005 04:30, spaminos-ker@yahoo.com wrote:
> --- Jens Axboe <axboe@suse.de> wrote:
> > Journalled file systems will behave worse for this, because it has to
> > tend to the journal as well. Can you try mounting that partition as ext2
> > and see what numbers that gives you?
>
> I did the tests again on a partition that I could mkfs/mount at will.
>
> On ext3, I get about 33 seconds average latency.
>
> And on ext2, as predicted, I have latencies in average of about 0.4
> seconds.
>
> I also tried reiserfs, and it gets about 22 seconds latency.
>
> As you pointed out, it seems that there is a flaw in the way IO queues and
> journals (that are in some ways queues as well), interact in the presence
> of flushes.

I found the same, and the effect was blunted by noatime and
journal_data_writeback (on ext3). Try them one at a time and see what you
get.

Cheers,
Con

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-23 23:33 ` Con Kolivas
@ 2005-06-24  2:33   ` spaminos-ker
  2005-06-24  3:27     ` Con Kolivas
  0 siblings, 1 reply; 16+ messages in thread

From: spaminos-ker @ 2005-06-24 2:33 UTC (permalink / raw)
To: Con Kolivas, linux-kernel; +Cc: Jens Axboe, Andrew Morton

--- Con Kolivas <kernel@kolivas.org> wrote:
> I found the same, and the effect was blunted by noatime and
> journal_data_writeback (on ext3). Try them one at a time and see what you
> get.

I had to move to a different box, but get the same kind of results (for ext3
default mount options).

Here are the latencies (all cfq) I get with different values for the mount
parameters:

ext2 default:                     0.1s
ext3 default:                     52.6s avg
reiser defaults:                  29s avg for 5 minutes, then 12.9s avg
ext3 rw,noatime,data=writeback:   0.1s avg
reiser rw,noatime,data=writeback: 4s avg for 20 seconds, then 0.1s avg

So, indeed, adding noatime,data=writeback to the mount options improves
things a lot. I also tried without the noatime, and that doesn't make much
difference to me.

That looks like a good workaround; I'll now try with the actual server and
see how things go.

Nicolas

^ permalink raw reply	[flat|nested] 16+ messages in thread
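The workaround above amounts to a couple of mount and sysfs commands. This is a hedged sketch only: the device, partition, and mount point are assumptions, and ext3's data= journalling mode generally cannot be changed by a simple remount, so the partition is unmounted first (or the options go in /etc/fstab).

```shell
# Sketch of the workaround discussed above. /dev/hda3 and /mnt/data
# are placeholders for the actual partition and mount point.
umount /mnt/data
mount -t ext3 -o noatime,data=writeback /dev/hda3 /mnt/data

# keep cfq as the active io scheduler for the disk
echo cfq > /sys/block/hda/queue/scheduler
cat /sys/block/hda/queue/scheduler
```

The equivalent /etc/fstab line would carry `noatime,data=writeback` in the options field so the setting survives a reboot.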
* Re: cfq misbehaving on 2.6.11-1.14_FC3
  2005-06-24  2:33 ` spaminos-ker
@ 2005-06-24  3:27   ` Con Kolivas
  0 siblings, 0 replies; 16+ messages in thread

From: Con Kolivas @ 2005-06-24 3:27 UTC (permalink / raw)
To: spaminos-ker; +Cc: linux-kernel, Jens Axboe, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1434 bytes --]

On Fri, 24 Jun 2005 12:33, spaminos-ker@yahoo.com wrote:
> --- Con Kolivas <kernel@kolivas.org> wrote:
> > I found the same, and the effect was blunted by noatime and
> > journal_data_writeback (on ext3). Try them one at a time and see what you
> > get.
>
> I had to move to a different box, but get the same kind of results (for
> ext3 default mount options).
>
> Here are the latencies (all cfq) I get with different values for the mount
> parameters
>
> ext2 default:                     0.1s
> ext3 default:                     52.6s avg
> reiser defaults:                  29s avg for 5 minutes, then 12.9s avg
> ext3 rw,noatime,data=writeback:   0.1s avg
> reiser rw,noatime,data=writeback: 4s avg for 20 seconds, then 0.1s avg
>
> So, indeed adding noatime,data=writeback to the mount options improves
> things a lot. I also tried without the noatime, and that doesn't make much
> difference to me.
>
> That looks like a good workaround, I'll now try with the actual server and
> see how things go.

That's more or less what I found, although noatime also helped my test cases,
though less than the journal option. Coincidentally, I only discovered this
recently and hadn't gotten around to telling anyone how dramatic this was,
and this seemed as good a time as any. I am suspicious that it wasn't this
bad in past kernels but haven't been able to instrument earlier kernels to
check.

Cheers,
Con

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread
end of thread, other threads:[~2005-06-24  3:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-10 22:54 cfq misbehaving on 2.6.11-1.14_FC3 spaminos-ker
2005-06-11  9:29 ` Andrew Morton
2005-06-14  2:19   ` spaminos-ker
2005-06-14  7:03     ` Andrew Morton
2005-06-14 23:21       ` spaminos-ker
2005-06-17 14:10         ` Jens Axboe
2005-06-17 15:51           ` Andrea Arcangeli
2005-06-17 18:16             ` Jens Axboe
2005-06-17 23:01           ` spaminos-ker
2005-06-22  9:24             ` Jens Axboe
2005-06-22 17:54               ` spaminos-ker
2005-06-22 20:43                 ` Jens Axboe
2005-06-23 18:30                   ` spaminos-ker
2005-06-23 23:33                     ` Con Kolivas
2005-06-24  2:33                       ` spaminos-ker
2005-06-24  3:27                         ` Con Kolivas