On 2008-07-21 21:45, Alan D. Brunelle wrote:
> Hi Edwin -
>
> With the patches sent out today (kernel & application), you can then use
> the updated script attached here. It only asks for getrq & sleeprq
> traces - so it will cut down a lot, but most likely there will still be
> a lot of getrq's (in particular).
>
> Will now have some time to look at the more general issue concerning how
> to see the effects of the sleeprq's...

Thanks Alan. I have been using a slightly modified version of your qsg.py
script: I want the time delta between a sleep and its corresponding get.
Your initial script gave me the time between a sleep and the *first* get
(which was probably a write anyway). I have attached my qsg.py, after
applying your recent modifications.

I noticed something: the longest latency occurs when preload *and* find
are running, so I guess it has something to do with readaheads.

I used the attached iotime.stp script to show the process and latency of
get_request_wait:

28199122: latency: 1135ms, pid: 6386 (dd)
60057798: latency: 28232ms, pid: 2918 (preload)
60063004: latency: 25904ms, pid: 6503 (find)
61159927: latency: 1346ms, pid: 211 (kswapd0)

I've also been trying some things in blk-core.c, namely forcing READA to
sleep if the queue is congested; I also modified mpage.c to actually
submit readaheads as READA, and made page_cache_sync_readahead and
force_page_cache_readahead call congestion_wait if the bdi is congested
(rough sketches of these changes are appended below). However, all of
this had no effect. So either:

- my READA hacks are totally wrong, and readaheads still end up in the
  queue above the limit, or
- something else is causing the latency in get_request_wait, but there
  isn't much else there besides a spinlock, a waitqueue, and a device
  unplug (see the abridged loop appended at the end).

I will do some more experiments with the new blktrace.

Best regards,
--Edwin
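
To make the blk-core.c part concrete, the READA gating I tried is shaped
roughly like the fragment below. This is a reconstruction against a
2.6.26-ish tree, not the exact diff, and reada_backoff is just a name I'm
using here; the mpage.c side is only the one-line change of submitting the
readpages bios with READA instead of READ so that bio_rw() can see them.

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/backing-dev.h>
#include <linux/writeback.h>

/*
 * Called at the top of __make_request(), before any request allocation
 * and before q->queue_lock is taken: if this bio is readahead (READA)
 * and the device is already congested with reads, sleep instead of
 * letting it compete for the remaining request slots.
 * Reconstructed sketch, not the exact patch.
 */
static void reada_backoff(struct request_queue *q, struct bio *bio)
{
	if (bio_rw(bio) != READA)
		return;

	while (bdi_read_congested(&q->backing_dev_info))
		congestion_wait(READ, HZ / 50);
}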
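
Similarly, the mm-side throttle amounts to calling something like the
helper below from page_cache_sync_readahead() and
force_page_cache_readahead() before any pages are submitted. Again a
rough sketch against the 2.6-era readahead code; throttle_readahead is a
hypothetical name.

#include <linux/fs.h>
#include <linux/backing-dev.h>
#include <linux/writeback.h>

/* Wait for read congestion to clear before queueing more readahead. */
static void throttle_readahead(struct address_space *mapping)
{
	struct backing_dev_info *bdi = mapping->backing_dev_info;

	while (bdi_read_congested(bdi))
		congestion_wait(READ, HZ / 50);
}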
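
And for reference, this is (abridged, from memory of the 2.6-era
block/blk-core.c, so details may differ between versions) the
get_request_wait() loop I mean. The sleeprq trace event fires just before
the schedule, and apart from the queue lock, the waitqueue and the unplug
there is nothing obvious to blame:

static struct request *get_request_wait(struct request_queue *q,
					int rw_flags, struct bio *bio)
{
	const int rw = rw_flags & 0x01;
	struct request *rq;

	rq = get_request(q, rw_flags, bio, GFP_NOIO);
	while (!rq) {
		DEFINE_WAIT(wait);
		struct request_list *rl = &q->rq;

		prepare_to_wait_exclusive(&rl->wait[rw], &wait,
					  TASK_UNINTERRUPTIBLE);

		rq = get_request(q, rw_flags, bio, GFP_NOIO);
		if (!rq) {
			/* this is where the sleeprq trace event fires */
			blk_add_trace_generic(q, bio, rw, BLK_TA_SLEEPRQ);

			/* unplug the device and sleep until a request
			 * is freed and the waitqueue is woken */
			__generic_unplug_device(q);
			spin_unlock_irq(q->queue_lock);
			io_schedule();
			spin_lock_irq(q->queue_lock);
		}
		finish_wait(&rl->wait[rw], &wait);
	}
	return rq;
}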